https://github.com/awesomelistsio/awesome-machine-learning

# Awesome Machine Learning [![Awesome Lists](https://srv-cdn.himpfen.io/badges/awesome-lists/awesomelists-flat.svg)](https://github.com/awesomelistsio/awesome)

[![Buy Me A Coffee](https://srv-cdn.himpfen.io/badges/buymeacoffee/buymeacoffee-flat.svg)](https://tinyurl.com/2h9aktmd)   [![Ko-Fi](https://srv-cdn.himpfen.io/badges/kofi/kofi-flat.svg)](https://tinyurl.com/d4xnrptz)   [![PayPal](https://srv-cdn.himpfen.io/badges/paypal/paypal-flat.svg)](https://tinyurl.com/mr22naua)   [![Stripe](https://srv-cdn.himpfen.io/badges/stripe/stripe-flat.svg)](https://tinyurl.com/e8ymxdw3)

> A curated list of awesome frameworks, libraries, tools, tutorials, datasets, and research papers in machine learning. This list covers a wide array of topics, from foundational algorithms to modern techniques in supervised, unsupervised, and reinforcement learning.

## Contents

- [Frameworks and Libraries](#frameworks-and-libraries)
- [Tools and Utilities](#tools-and-utilities)
- [Algorithms and Techniques](#algorithms-and-techniques)
- [Model Evaluation and Tuning](#model-evaluation-and-tuning)
- [Feature Engineering](#feature-engineering)
- [Supervised Learning](#supervised-learning)
- [Unsupervised Learning](#unsupervised-learning)
- [Reinforcement Learning](#reinforcement-learning)
- [Datasets](#datasets)
- [Research Papers](#research-papers)
- [Learning Resources](#learning-resources)
- [Books](#books)
- [Community](#community)
- [Contribute](#contribute)
- [License](#license)

## Frameworks and Libraries

- [Scikit-learn](https://scikit-learn.org/stable/) - A comprehensive Python library for machine learning with efficient tools for data analysis.
- [TensorFlow](https://www.tensorflow.org/) - An open-source platform for machine learning and deep learning by Google.
- [PyTorch](https://pytorch.org/) - An open-source machine learning framework popular for its dynamic computation graph.
- [XGBoost](https://xgboost.ai/) - A scalable, efficient, and widely used gradient boosting library.
- [LightGBM](https://lightgbm.readthedocs.io/) - A fast, distributed, high-performance gradient boosting framework.
- [CatBoost](https://catboost.ai/) - A gradient boosting library with built-in support for categorical features.

## Tools and Utilities

- [MLflow](https://mlflow.org/) - An open-source platform for managing the end-to-end machine learning lifecycle.
- [Weights & Biases](https://www.wandb.com/) - A tool for experiment tracking, model monitoring, and hyperparameter optimization.
- [DVC (Data Version Control)](https://dvc.org/) - A version control system for machine learning projects.
- [Optuna](https://optuna.org/) - An automatic hyperparameter optimization framework.
- [Streamlit](https://streamlit.io/) - A library for creating interactive machine learning web apps quickly.

## Algorithms and Techniques

- [Linear Regression](https://en.wikipedia.org/wiki/Linear_regression) - A simple, yet powerful, supervised learning algorithm for regression tasks.
- [Logistic Regression](https://en.wikipedia.org/wiki/Logistic_regression) - A classification algorithm based on the logistic function.
- [Decision Trees](https://en.wikipedia.org/wiki/Decision_tree_learning) - A non-parametric supervised learning algorithm used for classification and regression tasks.
- [Random Forest](https://link.springer.com/article/10.1023/A:1010933404324) - An ensemble learning method using multiple decision trees.
- [Gradient Boosting](https://en.wikipedia.org/wiki/Gradient_boosting) - A technique for building predictive models through an ensemble of weak learners.
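
As an illustration of the simplest entry above, here is a minimal sketch of ordinary least squares for a single feature. It uses only the Python standard library; the function name `fit_line` and the toy data are assumptions made for this example, not part of any listed library.

```python
def fit_line(xs, ys):
    """Fit y = slope * x + intercept by ordinary least squares."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # The slope is the covariance of x and y divided by the variance of x.
    cov_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    slope = cov_xy / var_x
    # The fitted line always passes through the point (mean_x, mean_y).
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Toy data lying exactly on y = 2x + 1 (hypothetical values).
xs = [0.0, 1.0, 2.0, 3.0]
ys = [1.0, 3.0, 5.0, 7.0]
slope, intercept = fit_line(xs, ys)
print(slope, intercept)
```

For real work, a library such as Scikit-learn handles multiple features, regularization, and numerical stability.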

## Model Evaluation and Tuning

- [Cross-Validation](https://en.wikipedia.org/wiki/Cross-validation_(statistics)) - A statistical method used to estimate the performance of a model.
- [Confusion Matrix](https://en.wikipedia.org/wiki/Confusion_matrix) - A tool for evaluating the performance of classification algorithms.
- [Precision, Recall, F1 Score](https://en.wikipedia.org/wiki/Precision_and_recall) - Metrics that account for false positives and false negatives, often more informative than accuracy on imbalanced classes.
- [Grid Search](https://scikit-learn.org/stable/modules/grid_search.html) - A method for hyperparameter optimization through exhaustive search.
- [Bayesian Optimization](https://arxiv.org/abs/1206.2944) - A method for optimizing hyperparameters using probabilistic models.
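
The evaluation ideas above can be sketched in a few lines of plain Python. The helper names (`confusion_counts`, `precision_recall_f1`, `k_fold_indices`) and the sample labels are assumptions for illustration; they are not the API of any library in this list.

```python
def confusion_counts(y_true, y_pred):
    """Return (tp, fp, fn, tn) for binary labels where 1 is the positive class."""
    tp = fp = fn = tn = 0
    for truth, pred in zip(y_true, y_pred):
        if truth == 1 and pred == 1:
            tp += 1
        elif truth == 0 and pred == 1:
            fp += 1
        elif truth == 1 and pred == 0:
            fn += 1
        else:
            tn += 1
    return tp, fp, fn, tn


def precision_recall_f1(tp, fp, fn):
    """Compute precision, recall, and F1, returning 0.0 when undefined."""
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    total = precision + recall
    f1 = 2 * precision * recall / total if total else 0.0
    return precision, recall, f1


def k_fold_indices(n_samples, k):
    """Yield (train_indices, test_indices) pairs for k-fold cross-validation."""
    # Interleaved folds: sample i goes to fold i % k (no shuffling in this sketch).
    folds = [list(range(i, n_samples, k)) for i in range(k)]
    for i, test in enumerate(folds):
        train = [j for fold in folds[:i] + folds[i + 1:] for j in fold]
        yield train, test


# Hypothetical labels and predictions.
y_true = [1, 1, 1, 0, 0, 0, 0, 1]
y_pred = [1, 0, 1, 0, 1, 0, 0, 1]
tp, fp, fn, tn = confusion_counts(y_true, y_pred)
precision, recall, f1 = precision_recall_f1(tp, fp, fn)
splits = list(k_fold_indices(6, 3))
```

Each of the three splits holds out a different third of the six samples, so every sample is used for testing exactly once.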

## Feature Engineering

- [Pandas](https://pandas.pydata.org/) - A Python library for data manipulation and analysis.
- [FeatureTools](https://www.featuretools.com/) - An open-source library for automated feature engineering.
- [Missingno](https://github.com/ResidentMario/missingno) - A Python library for visualizing missing data.
- [Category Encoders](https://contrib.scikit-learn.org/category_encoders/) - A collection of scikit-learn compatible transformers for encoding categorical features.
- [Principal Component Analysis (PCA)](https://en.wikipedia.org/wiki/Principal_component_analysis) - A technique for dimensionality reduction.
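
To make "encoding categorical features" concrete, the following is a small one-hot encoding sketch in plain Python. The function `one_hot_encode` and the sample values are hypothetical; libraries such as Category Encoders and Pandas provide their own, more complete interfaces.

```python
def one_hot_encode(values):
    """Map each categorical value to a 0/1 vector with a single 1."""
    # Sort the distinct categories so the column order is deterministic.
    categories = sorted(set(values))
    column = {category: i for i, category in enumerate(categories)}
    rows = []
    for value in values:
        row = [0] * len(categories)
        row[column[value]] = 1
        rows.append(row)
    return categories, rows


colors = ["red", "green", "blue", "green"]  # hypothetical feature column
categories, encoded = one_hot_encode(colors)
```

One-hot encoding adds one column per category, so high-cardinality features may call for other encoders.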

## Supervised Learning

- [Support Vector Machines (SVM)](https://en.wikipedia.org/wiki/Support_vector_machine) - A maximum-margin algorithm for classification and regression tasks.
- [K-Nearest Neighbors (KNN)](https://en.wikipedia.org/wiki/K-nearest_neighbors_algorithm) - A simple, instance-based learning algorithm.
- [Naive Bayes](https://en.wikipedia.org/wiki/Naive_Bayes_classifier) - A family of probabilistic classifiers based on Bayes' theorem.
- [Ensemble Methods](https://en.wikipedia.org/wiki/Ensemble_learning) - Techniques like bagging and boosting for improving model accuracy.
- [Neural Networks](https://en.wikipedia.org/wiki/Artificial_neural_network) - A class of models inspired by the human brain's structure.
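
As a sketch of instance-based learning, here is a minimal k-nearest neighbors classifier using only the standard library (`math.dist` requires Python 3.8 or later). The function name `knn_predict` and the toy points are assumptions for this example.

```python
import math
from collections import Counter


def knn_predict(train_points, train_labels, query, k=3):
    """Predict a label by majority vote among the k closest training points."""
    # Pair each training label with its Euclidean distance to the query.
    distances = sorted(
        (math.dist(point, query), label)
        for point, label in zip(train_points, train_labels)
    )
    nearest_labels = [label for _, label in distances[:k]]
    return Counter(nearest_labels).most_common(1)[0][0]


# Two well-separated toy groups (hypothetical data).
points = [(1, 1), (1, 2), (2, 1), (8, 8), (8, 9), (9, 8)]
labels = ["a", "a", "a", "b", "b", "b"]
near_a = knn_predict(points, labels, (2, 2))
near_b = knn_predict(points, labels, (7, 9))
```

This brute-force version scans every training point per query; production libraries typically use spatial indexes to speed this up.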

## Unsupervised Learning

- [K-Means Clustering](https://en.wikipedia.org/wiki/K-means_clustering) - A popular clustering algorithm for partitioning data into K clusters.
- [Hierarchical Clustering](https://en.wikipedia.org/wiki/Hierarchical_clustering) - A method of cluster analysis that builds a hierarchy of clusters.
- [DBSCAN (Density-Based Spatial Clustering)](https://en.wikipedia.org/wiki/DBSCAN) - A clustering algorithm that identifies dense regions of data points.
- [Gaussian Mixture Models (GMM)](https://en.wikipedia.org/wiki/Mixture_model) - A probabilistic model for representing normally distributed subpopulations within an overall population.
- [Dimensionality Reduction](https://en.wikipedia.org/wiki/Dimensionality_reduction) - Techniques like PCA and t-SNE for reducing the number of features.
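
The K-means entry above can be illustrated with a short sketch of the assign-then-update loop (Lloyd's algorithm). To stay deterministic, this version initializes the centroids from the first `k` points, which is a simplifying assumption; real implementations usually use random or k-means++ initialization. The function name `k_means` and the data are hypothetical.

```python
import math


def k_means(points, k, iterations=10):
    """Alternate between assigning points to centroids and recomputing centroids."""
    centroids = [points[i] for i in range(k)]  # simplistic, deterministic init
    clusters = [[] for _ in range(k)]
    for _ in range(iterations):
        # Assignment step: each point joins the cluster of its nearest centroid.
        clusters = [[] for _ in range(k)]
        for point in points:
            nearest = min(range(k), key=lambda c: math.dist(point, centroids[c]))
            clusters[nearest].append(point)
        # Update step: move each centroid to the mean of its members.
        for c, members in enumerate(clusters):
            if members:  # leave the centroid unchanged if the cluster is empty
                centroids[c] = tuple(
                    sum(coords) / len(members) for coords in zip(*members)
                )
    return centroids, clusters


points = [(1.0, 1.0), (9.0, 9.0), (1.0, 2.0), (2.0, 1.0), (8.0, 9.0), (9.0, 8.0)]
centroids, clusters = k_means(points, k=2)
```

On this toy data the two groups are well separated, so the assignments stabilize after the first iteration.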

## Reinforcement Learning

- [Q-Learning](https://en.wikipedia.org/wiki/Q-learning) - A value-based reinforcement learning algorithm.
- [Deep Q-Network (DQN)](https://arxiv.org/abs/1312.5602) - Q-learning with a deep neural network as the value function approximator.
- [Proximal Policy Optimization (PPO)](https://arxiv.org/abs/1707.06347) - A policy gradient method for reinforcement learning.
- [Actor-Critic Methods](https://arxiv.org/abs/1602.01783) - A family of reinforcement learning algorithms that use both policy and value functions.
- [OpenAI Gym](https://www.gymlibrary.dev/) - A toolkit for developing and comparing reinforcement learning algorithms.
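
To show what a value-based method looks like in practice, here is a minimal tabular Q-learning sketch on a toy five-state chain. The environment, the `step` and `train` functions, and all hyperparameter values are assumptions invented for this example; they do not come from OpenAI Gym or any listed paper.

```python
import random

N_STATES = 5      # states 0..4; state 4 is the terminal goal
ACTIONS = (0, 1)  # 0 = move left, 1 = move right


def step(state, action):
    """Toy deterministic chain: reward 1.0 only when the goal is reached."""
    next_state = max(state - 1, 0) if action == 0 else state + 1
    done = next_state == N_STATES - 1
    return next_state, (1.0 if done else 0.0), done


def train(episodes=500, alpha=0.5, gamma=0.9, epsilon=0.2, seed=0):
    """Learn a Q-table with epsilon-greedy exploration."""
    rng = random.Random(seed)
    q = [[0.0, 0.0] for _ in range(N_STATES)]
    for _ in range(episodes):
        state = 0
        for _ in range(100):  # cap the episode length
            row = q[state]
            # Explore with probability epsilon, and also when the values tie.
            if rng.random() < epsilon or row[0] == row[1]:
                action = rng.choice(ACTIONS)
            else:
                action = 0 if row[0] > row[1] else 1
            next_state, reward, done = step(state, action)
            # Q-learning target: reward plus discounted best next-state value.
            target = reward if done else reward + gamma * max(q[next_state])
            q[state][action] += alpha * (target - q[state][action])
            state = next_state
            if done:
                break
    return q


q_table = train()
greedy_policy = [
    max(ACTIONS, key=lambda a: q_table[s][a]) for s in range(N_STATES - 1)
]
```

After training, the greedy policy moves right in every non-terminal state, which is the shortest path to the goal in this chain.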

## Datasets

- [UCI Machine Learning Repository](https://archive.ics.uci.edu/ml/index.php) - A collection of datasets for machine learning research.
- [Kaggle Datasets](https://www.kaggle.com/datasets) - A platform for accessing diverse datasets and participating in competitions.
- [Google Dataset Search](https://datasetsearch.research.google.com/) - A search engine for discovering datasets across the web.
- [OpenML](https://www.openml.org/) - An open platform for sharing datasets and machine learning experiments.
- [Data.gov](https://www.data.gov/) - A portal for accessing public datasets.

## Research Papers

- [A Few Useful Things to Know About Machine Learning (2012)](https://dl.acm.org/doi/10.1145/2347736.2347755) - A paper discussing important concepts in machine learning.
- [The Elements of Statistical Learning (2001)](https://hastie.su.domains/ElemStatLearn/) - A comprehensive book on statistical learning.
- [Random Forests (2001)](https://link.springer.com/article/10.1023/A:1010933404324) - The original paper by Leo Breiman introducing random forests.

## Learning Resources

- [Coursera: Machine Learning by Andrew Ng](https://www.coursera.org/learn/machine-learning) - A comprehensive course on machine learning.
- [Fast.ai](https://www.fast.ai/) - Free courses and resources for practical machine learning.
- [Google Machine Learning Crash Course](https://developers.google.com/machine-learning/crash-course) - A fast-paced introduction to machine learning.

## Books

- *Hands-On Machine Learning with Scikit-Learn, Keras, and TensorFlow* by Aurélien Géron - A practical guide to machine learning.
- *Pattern Recognition and Machine Learning* by Christopher Bishop - A book covering the fundamentals of machine learning.
- *Machine Learning Yearning* by Andrew Ng - A guide on structuring machine learning projects effectively.

## Community

- [Reddit: r/MachineLearning](https://www.reddit.com/r/MachineLearning/) - A subreddit for discussions on machine learning.
- [Kaggle](https://www.kaggle.com/) - A platform for data science competitions and community interaction.
- [Scikit-learn Mailing List](https://mail.python.org/mailman/listinfo/scikit-learn) - A place to discuss issues and features in scikit-learn.

## Contribute

Contributions are welcome!

## License

[![CC BY-SA 4.0](https://mirrors.creativecommons.org/presskit/buttons/88x31/svg/by-sa.svg)](http://creativecommons.org/licenses/by-sa/4.0/)