
Python Machine Learning Notebooks (Tutorial style)


Dr. Tirthajyoti Sarkar, Sunnyvale, CA (You can connect with me on LinkedIn here)

Essential code and demo IPython notebooks for jump-starting machine learning and data science.
You can start with this article that I wrote for Heartbeat magazine (on the Medium platform):

"Some Essential Hacks and Tricks for Machine Learning with Python"

Essential tutorial-type notebooks on Pandas and Numpy

Jupyter notebooks covering a wide range of functions and operations in NumPy, Pandas, Seaborn, matplotlib, etc.

Tutorial-type notebooks covering regression, classification, clustering, dimensionality reduction, and some basic neural network algorithms

Regression

  • Simple linear regression with t-statistic generation
  • Polynomial regression, showing how to use the scikit-learn pipeline feature (check the article I wrote on Towards Data Science)
  • Decision trees and Random Forest regression (showing how the Random Forest works as a robust/regularized meta-estimator that resists overfitting)
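
As a rough illustration of the pipeline idea mentioned in the polynomial regression item, here is a minimal sketch. It is not taken from the notebooks: the synthetic quadratic data, the noise level, and the variable names are assumptions made for this example.

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import PolynomialFeatures

# Synthetic data (assumed for illustration): a noisy quadratic in one feature.
rng = np.random.default_rng(0)
X = rng.uniform(-3, 3, size=(200, 1))
y = 0.5 * X[:, 0] ** 2 - X[:, 0] + 2 + rng.normal(scale=0.1, size=200)

# The pipeline chains feature expansion and the linear fit into one estimator,
# so fit/predict/score apply both steps in order.
model = make_pipeline(PolynomialFeatures(degree=2), LinearRegression())
model.fit(X, y)

r2 = model.score(X, y)  # coefficient of determination on the training data
print(f"R^2 of degree-2 pipeline: {r2:.3f}")
```

Because the pipeline behaves like a single estimator, the polynomial degree can be changed in one place without touching the rest of the code.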

Classification

  • Logistic regression/classification
  • Naive Bayes classification
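
The two classifiers listed above share the same scikit-learn fit/score interface, which the following sketch tries to show. The two-blob dataset, the split ratio, and the names used here are assumptions for illustration, not the data used in the notebooks.

```python
from sklearn.datasets import make_blobs
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import GaussianNB

# Two well-separated Gaussian blobs (assumed toy data), one per class.
X, y = make_blobs(
    n_samples=500, centers=[[-3, -3], [3, 3]], cluster_std=1.0, random_state=0
)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0
)

# Fit both models on the same split and record held-out accuracy.
scores = {}
for name, clf in [
    ("logistic_regression", LogisticRegression(max_iter=1000)),
    ("gaussian_naive_bayes", GaussianNB()),
]:
    clf.fit(X_train, y_train)
    scores[name] = clf.score(X_test, y_test)

print(scores)
```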

Clustering

  • K-means clustering
  • Affinity propagation (showing its time complexity and the effect of damping factor)
  • Mean-shift technique (showing its time complexity and the effect of noise on cluster discovery)
  • DBSCAN (showing how it can generically detect areas of high density irrespective of cluster shape, which k-means fails to do)
  • Hierarchical clustering with dendrograms, showing how to choose the optimal number of clusters
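
The contrast between DBSCAN and k-means on non-convex clusters can be sketched on the standard "two moons" toy dataset. The dataset choice and the parameter values (`eps=0.2`, `min_samples=5`) are assumptions picked for this example and would need tuning on real data.

```python
from sklearn.cluster import DBSCAN, KMeans
from sklearn.datasets import make_moons
from sklearn.metrics import adjusted_rand_score

# Two interleaved half-circles: dense, but not convex or spherical.
X, y_true = make_moons(n_samples=400, noise=0.05, random_state=0)

# k-means partitions the plane around two centroids, so it cuts across the moons.
kmeans_labels = KMeans(n_clusters=2, n_init=10, random_state=0).fit_predict(X)

# DBSCAN grows clusters through neighbouring dense points, following each moon.
dbscan_labels = DBSCAN(eps=0.2, min_samples=5).fit_predict(X)

# Adjusted Rand index: 1.0 means perfect agreement with the true grouping.
ari_kmeans = adjusted_rand_score(y_true, kmeans_labels)
ari_dbscan = adjusted_rand_score(y_true, dbscan_labels)
print(f"k-means ARI: {ari_kmeans:.2f}, DBSCAN ARI: {ari_dbscan:.2f}")
```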

Dimensionality reduction

  • Principal component analysis
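
A minimal sketch of principal component analysis with scikit-learn follows. The data is synthetic and assumed for illustration: five observed features generated from two latent factors plus a little noise, so two components should capture almost all the variance.

```python
import numpy as np
from sklearn.decomposition import PCA

# Assumed toy data: 5 observed features driven by 2 hidden factors.
rng = np.random.default_rng(0)
latent = rng.normal(size=(300, 2))
mixing = rng.normal(size=(2, 5))
X = latent @ mixing + rng.normal(scale=0.01, size=(300, 5))

# Project onto the two directions of largest variance.
pca = PCA(n_components=2)
X_reduced = pca.fit_transform(X)

explained = pca.explained_variance_ratio_.sum()
print(X_reduced.shape, f"variance explained: {explained:.4f}")
```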

Deep Learning/Neural Network


Random data generation using symbolic expressions

  • How to use Sympy package to generate random datasets using symbolic mathematical expressions.
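
One possible way to do this is sketched below, using `sympy.lambdify` to turn a symbolic expression into a NumPy function. This is an assumed approach for illustration and not necessarily how the notebook does it; the expression, the sampling range, and the noise level are all made up for the example.

```python
import numpy as np
import sympy as sp

# Parse a symbolic expression from a string (hypothetical example expression).
x = sp.Symbol("x")
expr = sp.sympify("x**2 + 3*sin(x)")

# Compile the expression into a vectorized NumPy function.
f = sp.lambdify(x, expr, modules="numpy")

# Sample random inputs, evaluate the expression, and add Gaussian noise.
rng = np.random.default_rng(0)
x_values = rng.uniform(-5, 5, size=100)
y_values = f(x_values) + rng.normal(scale=0.5, size=100)

print(x_values[:3], y_values[:3])
```

Keeping the expression as a string makes it easy to swap in a different functional form without changing the sampling code.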
