
Pattern Classification - Machine Learning Tutorial


**Tutorials, examples, collections, and everything else that falls into the categories: pattern classification, machine learning, and data mining.** 



Sections



[Download a PDF version] of this flowchart.





Introduction to Machine Learning and Pattern Classification

  • Predictive modeling, supervised machine learning, and pattern classification - the big picture [Markdown]
  • Entry Point: Data - Using Python's sci-packages to prepare data for Machine Learning tasks and other data analyses [IPython nb]
  • An Introduction to simple linear supervised classification using scikit-learn [IPython nb]





Pre-processing

  • Feature Extraction
    • Tips and Tricks for Encoding Categorical Features in Classification Tasks [IPython nb]
  • Scaling and Normalization
    • About Feature Scaling: Standardization and Min-Max-Scaling (Normalization) [IPython nb]
  • Feature Selection
    • Sequential Feature Selection Algorithms [IPython nb]
  • Dimensionality Reduction
    • Principal Component Analysis (PCA) [IPython nb]
    • The effect of scaling and mean centering of variables prior to a PCA [PDF] [HTML]
    • PCA based on the covariance vs. correlation matrix [IPython nb]
    • Linear Discriminant Analysis (LDA) [IPython nb]
    • Kernel tricks and nonlinear dimensionality reduction via PCA [IPython nb]
  • Representing Text
    • Tf-idf Walkthrough for scikit-learn [IPython nb]



Model Evaluation

  • An Overview of General Performance Metrics of Binary Classifier Systems [PDF]
  • Cross-validation
    • Streamline your cross-validation workflow - scikit-learn's Pipeline in action [IPython nb]
  • Model evaluation, model selection, and algorithm selection in machine learning - Part I [Markdown]
  • Model evaluation, model selection, and algorithm selection in machine learning - Part II [Markdown]



Parameter Estimation

  • Parametric Techniques
    • Introduction to the Maximum Likelihood Estimate (MLE) [IPython nb]
    • How to calculate Maximum Likelihood Estimates (MLE) for different distributions [IPython nb]
  • Non-Parametric Techniques
    • Kernel density estimation via the Parzen-window technique [IPython nb]
    • The K-Nearest Neighbor (KNN) technique
  • Regression Analysis
    • Linear Regression
    • Non-Linear Regression



Machine Learning Algorithms

Bayes Classification

  • Naive Bayes and Text Classification I - Introduction and Theory [PDF]

Logistic Regression

  • Out-of-core Learning and Model Persistence using scikit-learn [IPython nb]

Neural Networks

  • Artificial Neurons and Single-Layer Neural Networks - How Machine Learning Algorithms Work Part 1 [IPython nb]
  • Activation Function Cheatsheet [IPython nb]

Ensemble Methods

  • Implementing a Weighted Majority Rule Ensemble Classifier in scikit-learn [IPython nb]

Decision Trees

  • Cheatsheet for Decision Tree Classification [IPython nb]



Clustering

  • Prototype-based clustering
  • Hierarchical clustering
    • Complete-Linkage Clustering and Heatmaps in Python [IPython nb]
  • Density-based clustering
  • Graph-based clustering
  • Probabilistic-based clustering



Collecting Data

  • Collecting Fantasy Soccer Data with Python and Beautiful Soup [IPython nb]
  • Download Your Twitter Timeline and Turn into a Word Cloud Using Python [IPython nb]
  • Reading MNIST into NumPy arrays [IPython nb]



Data Visualization

  • Exploratory Analysis of the Star Wars API [IPython nb]
  • Matplotlib examples - Exploratory data analysis of the Iris dataset [IPython nb]
  • Artificial Intelligence publications per country


Statistical Pattern Classification Examples

  • Supervised Learning
    • Parametric Techniques
      • Univariate Normal Density
        • Ex1: 2-classes, equal variances, equal priors [IPython nb]
        • Ex2: 2-classes, different variances, equal priors [IPython nb]
        • Ex3: 2-classes, equal variances, different priors [IPython nb]
        • Ex4: 2-classes, different variances, different priors, loss function [IPython nb]
        • Ex5: 2-classes, different variances, equal priors, loss function, Cauchy distr. [IPython nb]
      • Multivariate Normal Density
        • Ex5: 2-classes, different variances, equal priors, loss function [IPython nb]
        • Ex7: 2-classes, equal variances, equal priors [IPython nb]
    • Non-Parametric Techniques



Books

Python Machine Learning




Talks

An Introduction to Supervised Machine Learning and Pattern Classification: The Big Picture



MusicMood - Machine Learning in Automatic Music Mood Prediction Based on Song Lyrics




Applications

MusicMood - Machine Learning in Automatic Music Mood Prediction Based on Song Lyrics

This project is about building a music recommendation system for users who want to listen to happy songs. Such a system is not only useful for brightening one's mood on a rainy weekend; in hospitals, other medical clinics, or public places such as restaurants, the MusicMood classifier could also be used to spread a positive mood among people.

mlxtend - A library of extension and helper modules for Python's data analysis and machine learning libraries.




Resources

  • Copy-and-paste ready LaTeX equations [Markdown]
  • Open-source datasets [Markdown]
  • Free Machine Learning eBooks [Markdown]
  • Terms in data science defined in less than 50 words [Markdown]
  • Useful libraries for data science in Python [Markdown]
  • General Tips and Advice [Markdown]
  • A matrix cheatsheet for Python, R, Julia, and MATLAB [HTML]
