Ecosyste.ms: Awesome
An open API service indexing awesome lists of open source software.
https://github.com/rasbt/pattern_classification
A collection of tutorials and examples for solving and understanding machine learning and pattern classification tasks
- Host: GitHub
- URL: https://github.com/rasbt/pattern_classification
- Owner: rasbt
- License: gpl-3.0
- Created: 2014-03-30T05:34:34.000Z (almost 11 years ago)
- Default Branch: master
- Last Pushed: 2023-11-26T15:54:59.000Z (about 1 year ago)
- Last Synced: 2025-01-16T21:03:27.883Z (12 days ago)
- Topics: machine-learning, machine-learning-algorithms, pattern-classification
- Language: Jupyter Notebook
- Homepage:
- Size: 108 MB
- Stars: 4,160
- Watchers: 387
- Forks: 1,285
- Open Issues: 0
Metadata Files:
- Readme: README.ipynb
- License: LICENSE
Awesome Lists containing this project
- awesome-data-science-resources - Pattern Classification
- fucking-lists - pattern_classification
- awesomelist - pattern_classification
- collection - pattern_classification
- lists - pattern_classification
- awesome-starred - rasbt/pattern_classification - A collection of tutorials and examples for solving and understanding machine learning and pattern classification tasks (machine-learning)
README
{
"cells": [
{
"cell_type": "markdown",
"metadata": {},
"source": [
"![logo](./Images/logo.png)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"
\n",
"**Tutorials, examples, collections, and everything else that falls into the categories: pattern classification, machine learning, and data mining.**\n",
"
"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"# Table of Contents"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"- [Introduction to Machine Learning and Pattern Classification](#Introduction-to-Machine-Learning-and-Pattern-Classification)\n",
"- [Pre-Processing](#Pre-Processing)\n",
"- [Model Evaluation](#Model-Evaluation)\n",
"- [Parameter Estimation](#Parameter-Estimation)\n",
"- [Machine Learning Algorithms](#Machine-Learning-Algorithms)\n",
"\t- [Bayes Classification](#Bayes-Classification)\n",
"\t- [Logistic Regression](#Logistic-Regression)\n",
"\t- [Neural Networks](#Neural-Networks)\n",
"\t- [Ensemble Methods](#Ensemble-Methods)\n",
"- [Statistical Pattern Classification Examples](#Statistical-Pattern-Classification-Examples)\n",
"- [Clustering](#Clustering)\n",
"- [Collecting Data](#Collecting-Data)\n",
"- [Resources](#Resources)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"# Introduction to Machine Learning and Pattern Classification"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"* Predictive modeling, supervised machine learning, and pattern classification - the big picture [[Markdown]](machine_learning/supervised_intro/introduction_to_supervised_machine_learning.md)\n",
"* Entry Point: Data - Using Python's sci-packages to prepare data for Machine Learning tasks and other data analyses [[IPython nb]](machine_learning/scikit-learn/python_data_entry_point.ipynb)\n",
"* An Introduction to simple linear supervised classification using `scikit-learn` [[IPython nb]](machine_learning/scikit-learn/scikit_linear_classification.ipynb)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"# Pre-Processing"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"* **Feature Extraction**\n",
" * Tips and Tricks for Encoding Categorical Features in Classification Tasks [[IPython nb]](preprocessing/feature_encoding.ipynb)\n",
"* **Scaling and Normalization**\n",
" * About Feature Scaling: Standardization and Min-Max-Scaling (Normalization) [[IPython nb]](preprocessing/about_standardization_normalization.ipynb)\n",
"* **Feature Selection**\n",
" * Sequential Feature Selection Algorithms [[IPython nb]](dimensionality_reduction/feature_selection/sequential_selection_algorithms.ipynb)\n",
"* **Dimensionality Reduction**\n",
" * Principal Component Analysis (PCA) [[IPython nb]](dimensionality_reduction/projection/principal_component_analysis.ipynb)\n",
" * PCA based on the covariance vs. correlation matrix [[IPython nb]](dimensionality_reduction/projection/pca_cov_cor.ipynb)\n",
" * Linear Discriminant Analysis (LDA) [[IPython nb]](dimensionality_reduction/projection/linear_discriminant_analysis.ipynb)\n",
" * The effect of scaling and mean centering of variables prior to a PCA [[PDF]](./dimensionality_reduction/projection/scale_center_pca/scale_center_pca.pdf) \n",
" * Kernel tricks and nonlinear dimensionality reduction via PCA [[IPython nb]](dimensionality_reduction/projection/kernel_pca.ipynb)\n",
"* **Representing Text**\n",
"\t* Tf-idf Walkthrough for scikit-learn [[IPython nb](./machine_learning/scikit-learn/tfidf_scikit-learn.ipynb)] "
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"# Model Evaluation"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"* An Overview of General Performance Metrics of Binary Classifier Systems [[PDF](./evaluation/performance_metrics/performance_metrics.pdf)]\n",
"* **Cross-Validation**\n",
" * Streamline your cross-validation workflow - scikit-learn's Pipeline in action [[IPython nb]](machine_learning/scikit-learn/scikit-pipeline.ipynb)\n",
"* Model evaluation, model selection, and algorithm selection in machine learning - Part I [[Markdown]](./evaluation/model-evaluation/model-evaluation-selection-part1.md)\n",
"* Model evaluation, model selection, and algorithm selection in machine learning - Part II [[Markdown]](./evaluation/model-evaluation/model-evaluation-selection-part2.md)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"# Parameter Estimation"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"* **Parametric Techniques**\n",
" * Introduction to the Maximum Likelihood Estimate (MLE) [[IPython nb]](parameter_estimation_techniques/maximum_likelihood_estimate.ipynb)\n",
" * How to calculate Maximum Likelihood Estimates (MLE) for different distributions [[IPython nb]](parameter_estimation_techniques/max_likelihood_est_distributions.ipynb)\n",
"\n",
"* **Non-Parametric Techniques**\n",
" * Kernel density estimation via the Parzen-window technique [[IPython nb]](parameter_estimation_techniques/parzen_window_technique.ipynb)\n",
" * The K-Nearest Neighbor (KNN) technique\n",
"\n",
"* **Regression Analysis**\n",
" * Linear Regression\n",
" * Least-Squares fit [[IPython nb]](data_fitting/regression/linregr_least_squares_fit.ipynb)\n",
" * Non-Linear Regression"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"# Machine Learning Algorithms"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"#### Bayes Classification\n",
"\n",
"- Naive Bayes and Text Classification I - Introduction and Theory [[View PDF](http://sebastianraschka.com/PDFs/articles/naive_bayes_1.pdf)] [[Download PDF](./machine_learning/naive_bayes_1/tex/naive_bayes_1.pdf)] \n",
"\n",
"#### Logistic Regression\n",
"\n",
"- Out-of-core Learning and Model Persistence using scikit-learn\n",
"[[IPython nb](./machine_learning/scikit-learn/outofcore_modelpersistence.ipynb)]\n",
"\n",
"#### Neural Networks\n",
"\n",
"- Artificial Neurons and Single-Layer Neural Networks - How Machine Learning Algorithms Work Part 1 [[IPython nb](./machine_learning/singlelayer_neural_networks/singlelayer_neural_networks.ipynb)]\n",
"\n",
"- Activation Function Cheatsheet [[IPython nb](./machine_learning/neural_networks/ipynb/activation_functions.ipynb)]\n",
"\n",
"#### Ensemble Methods\n",
"\n",
"- Implementing a Weighted Majority Rule Ensemble Classifier in scikit-learn [[IPython nb](./machine_learning/scikit-learn/ensemble_classifier.ipynb)]\n",
"\n",
"#### Decision Trees\n",
"\n",
"- Cheatsheet for Decision Tree Classification [[IPython nb]('./machine_learning/decision_trees/decision-tree-cheatsheet.ipynb')]"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"# Clustering"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"- **Protoype-based clustering**\n",
"- **Hierarchical clustering**\n",
"\t- Complete-Linkage Clustering and Heatmaps in Python [[IPython nb](./clustering/hierarchical/clust_complete_linkage.ipynb)]\n",
"- **Density-based clustering**\n",
"- **Graph-based clustering**\n",
"- **Probabilistic-based clustering**"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"# Collecting Data"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"- Collecting Fantasy Soccer Data with Python and Beautiful Soup [[IPython nb](./data_collecting/parse_dreamteamfc_data.ipynb)]\n",
"\n",
"- Download Your Twitter Timeline and Turn into a Word Cloud Using Python [[IPython nb](./data_collecting/twitter_wordcloud.ipynb)]\n",
"\n",
"- Reading MNIST into NumPy arrays [[IPython nb](./data_collecting/reading_mnist.ipynb)]"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"# Statistical Pattern Classification Examples"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"* **Supervised Learning**\n",
" * Parametric Techniques\n",
" * Univariate Normal Density\n",
" * Ex1: 2-classes, equal variances, equal priors [[IPython nb]](stat_pattern_class/supervised/parametric/1_stat_superv_parametric.ipynb)\n",
" * Ex2: 2-classes, different variances, equal priors [[IPython nb]](stat_pattern_class/supervised/parametric/2_stat_superv_parametric.ipynb)\n",
" * Ex3: 2-classes, equal variances, different priors [[IPython nb]](stat_pattern_class/supervised/parametric/3_stat_superv_parametric.ipynb)\n",
" * Ex4: 2-classes, different variances, different priors, loss function [[IPython nb]](stat_pattern_class/supervised/parametric/4_stat_superv_parametric.ipynb)\n",
" * Ex5: 2-classes, different variances, equal priors, loss function, cauchy distr.[[IPython nb]](stat_pattern_class/supervised/parametric/5_stat_superv_parametric.ipynb)\n",
"\n",
" * Multivariate Normal Density\n",
" * Ex5: 2-classes, different variances, equal priors, loss function [[IPython nb]](stat_pattern_class/supervised/parametric/5_stat_superv_parametric.ipynb)\n",
" * Ex7: 2-classes, equal variances, equal priors [[IPython nb]](stat_pattern_class/supervised/parametric/7_stat_superv_parametric.ipynb)\n",
"\n",
" * Non-Parametric Techniques"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"# Resources"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"* Matplotlib examples - Visualization techniques for exploratory data analysis [[IPython nb]](resources/matplotlib_viz_gallery.ipynb)\n",
"\n",
"* Copy-and-paste ready LaTex equations [[Markdown]](resources/latex_equations.md)\n",
"\n",
"* Open-source datasets [[Markdown]](resources/dataset_collections.md)\n",
"\n",
"* Free Machine Learning eBooks [[Markdown]](resources/machine_learning_ebooks.md)\n",
"\n",
"* Terms in data science defined in less than 50 words [[Markdown]](resources/data_glossary.md)\n",
"\n",
"* Useful libraries for data science in Python [[Markdown]](resources/python_data_libraries.md)\n",
"\n",
"* General Tips and Advices [[Markdown]](resources/general_tips_and_advices.md)\n",
"\n",
"* A matrix cheatsheat for Python, R, Julia, and MATLAB [[HTML]](http://sebastianraschka.com/github/pattern_classification/matrix_cheatsheet_table.html)"
]
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.5.2"
}
},
"nbformat": 4,
"nbformat_minor": 0
}