awesome-machine-learning
A curated list of awesome Machine Learning frameworks, libraries and software.
https://github.com/ml13571/awesome-machine-learning
-
Python
-
General-Purpose Machine Learning
- Kaggle Dogs vs. Cats - Code for Kaggle Dogs vs. Cats competition.
- Kaggle Galaxy Challenge - Winning solution for the Galaxy Challenge on Kaggle.
- Kaggle Gender - A Kaggle competition: discriminate gender based on handwriting.
- Kaggle Merck - Merck challenge at Kaggle.
- Kaggle Stackoverflow - Predicting closed questions on Stack Overflow.
- wine-quality - Predicting wine quality.
- DeepMind Lab - DeepMind Lab is a 3D learning environment based on id Software's Quake III Arena via ioquake3 and other open source software. Its primary purpose is to act as a testbed for research in artificial intelligence, especially deep reinforcement learning.
- Gymnasium - A library for developing and comparing reinforcement learning algorithms (successor of [gym](https://github.com/openai/gym)).
- Serpent.AI - Serpent.AI is a game agent framework that allows you to turn any video game you own into a sandbox to develop AI and machine learning experiments. For both researchers and hobbyists.
- ViZDoom - ViZDoom allows developing AI bots that play Doom using only the visual information (the screen buffer). It is primarily intended for research in machine visual learning, and deep reinforcement learning, in particular.
- Roboschool - Open-source software for robot simulation, integrated with OpenAI Gym.
- Retro - Retro Games in Gym
- SLM Lab - Modular Deep Reinforcement Learning framework in PyTorch.
- garage - A toolkit for reproducible reinforcement learning research
- metaworld - An open source robotics benchmark for meta- and multi-task reinforcement learning
- acme - An open source distributed framework for reinforcement learning that makes it easy to build and train your agents.
- Maze - Application-oriented deep reinforcement learning framework addressing real-world decision problems.
- RLlib - RLlib is an industry level, highly scalable RL library for tf and torch, based on Ray. It's used by companies like Amazon and Microsoft to solve real-world decision making problems at scale.
- DI-engine - DI-engine is a generalized Decision Intelligence engine. It supports most basic deep reinforcement learning (DRL) algorithms, such as DQN, PPO, SAC, and domain-specific algorithms like QMIX in multi-agent RL, GAIL in inverse RL, and RND in exploration problems.
- EspNet - ESPnet is an end-to-end speech processing toolkit for tasks like speech recognition, translation, and enhancement, using PyTorch and Kaldi-style data processing.
- BLLIP Parser - Python bindings for the BLLIP Natural Language Parser (also known as the Charniak-Johnson parser). **[Deprecated]**
- editdistance - Fast implementation of edit distance.
- graphlab-create - A library with various machine learning models (regression, clustering, recommender systems, graph analytics, etc.) implemented on top of a disk-backed DataFrame.
- Coach - Reinforcement Learning Coach by Intel® AI Lab enables easy experimentation with state of the art Reinforcement Learning algorithms
- TextBlob - Providing a consistent API for diving into common natural language processing (NLP) tasks. Stands on the giant shoulders of NLTK and Pattern, and plays nicely with both.
- pygal - A Python SVG Charts Creator.
- albumentations - A fast and framework-agnostic image augmentation library that implements a diverse set of augmentation techniques. Supports classification, segmentation, and detection out of the box. It has been used to win a number of deep learning competitions at Kaggle, Topcoder, and CVPR workshops.
- Pebl - Python Environment for Bayesian Learning. **[Deprecated]**
- steppy-toolkit - Curated collection of the neural networks, transformers and models that make your machine learning work faster and more effective.
- Theano - An optimizing math compiler in Python for array-oriented code, with GPU code generation via meta-programming.
- TensorFlow - Open source software library for numerical computation using data flow graphs.
- Microsoft Recommenders
- pycascading
- Superset - A data exploration platform designed to be visual, intuitive, and interactive.
- NuPIC - Numenta Platform for Intelligent Computing.
- JAX - JAX is Autograd and XLA, brought together for high-performance machine learning research.
- FEDOT - modal datasets).
- bqplot - An API for plotting in Jupyter (IPython).
- Suiron - Machine Learning for RC Cars.
- imutils - A library of convenience functions that make basic image processing operations such as translation, rotation, resizing, skeletonization, and displaying Matplotlib images easier with OpenCV and Python.
- timm - PyTorch image models, scripts, pretrained weights -- ResNet, ResNeXT, EfficientNet, EfficientNetV2, NFNet, Vision Transformer, MixNet, MobileNet-V3/V2, RegNet, DPN, CSPNet, and more.
- segmentation_models.pytorch - A PyTorch-based toolkit that offers pre-trained segmentation models for computer vision tasks. It simplifies the development of image segmentation applications by providing a collection of popular architecture implementations, such as UNet and PSPNet, along with pre-trained weights, making it easier for researchers and developers to achieve high-quality pixel-level object segmentation in images.
- DeepPavlov - Conversational AI library with many pre-trained Russian NLP models.
- Microsoft ML for Apache Spark - A distributed machine learning framework for Apache Spark.
- PyTorch Lightning Bolts - Toolbox of models, callbacks, and datasets for AI/ML researchers.
- PyGrid - Peer-to-peer network of data owners and data scientists who can collectively train AI models using PySyft
- sktime - A unified framework for machine learning with time series
- TResNet: High Performance GPU-Dedicated Architecture - TResNet models were designed and optimized to give the best speed-accuracy tradeoff out there on GPUs.
- open-solution-home-credit - Source code and [experiment results](https://app.neptune.ml/neptune-ml/Home-Credit-Default-Risk) for [Home Credit Default Risk](https://www.kaggle.com/c/home-credit-default-risk).
- open-solution-googleai-object-detection - Source code and [experiment results](https://app.neptune.ml/neptune-ml/Google-AI-Object-Detection-Challenge) for [Google AI Open Images - Object Detection Track](https://www.kaggle.com/c/google-ai-open-images-object-detection-track).
- open-solution-ship-detection - Source code and [experiment results](https://app.neptune.ml/neptune-ml/Ships) for [Airbus Ship Detection Challenge](https://www.kaggle.com/c/airbus-ship-detection).
- open-solution-data-science-bowl-2018 - Source code and [experiment results](https://app.neptune.ml/neptune-ml/Data-Science-Bowl-2018) for [2018 Data Science Bowl](https://www.kaggle.com/c/data-science-bowl-2018).
- open-solution-value-prediction - Source code and [experiment results](https://app.neptune.ml/neptune-ml/Santander-Value-Prediction-Challenge) for [Santander Value Prediction Challenge](https://www.kaggle.com/c/santander-value-prediction-challenge).
- open-solution-toxic-comments - Source code for [Toxic Comment Classification Challenge](https://www.kaggle.com/c/jigsaw-toxic-comment-classification-challenge).
- Intel(R) Extension for Scikit-learn - A seamless way to speed up your Scikit-learn applications without accuracy loss or code changes.
- Thampi - Machine Learning Prediction System on AWS Lambda
- CometLLM - Track, log, visualize and evaluate your LLM prompts and prompt chains.
-
-
R
-
General-Purpose Machine Learning
- ahaz - ahaz: Regularization for semiparametric additive hazards regression. **[Deprecated]**
- arules - arules: Mining Association Rules and Frequent Itemsets
- biglasso - biglasso: Extending Lasso Model Fitting to Big Data in R.
- bmrm - bmrm: Bundle Methods for Regularized Risk Minimization Package.
- Boruta - Boruta: A wrapper algorithm for all-relevant feature selection.
- bst - bst: Gradient Boosting.
- C50 - C50: C5.0 Decision Trees and Rule-Based Models.
- caret - Classification and Regression Training: Unified interface to ~150 ML algorithms in R.
- caretEnsemble - caretEnsemble: Framework for fitting multiple caret models as well as creating ensembles of such models. **[Deprecated]**
- Clever Algorithms For Machine Learning
- CORElearn - CORElearn: Classification, regression, feature evaluation and ordinal evaluation.
- CoxBoost - CoxBoost: Cox models by likelihood based boosting for a single survival endpoint or competing risks **[Deprecated]**
- Cubist - Cubist: Rule- and Instance-Based Regression Modelling.
- e1071 - e1071: Misc Functions of the Department of Statistics (e1071), TU Wien
- earth - earth: Multivariate Adaptive Regression Spline Models
- elasticnet - elasticnet: Elastic-Net for Sparse Estimation and Sparse PCA.
- ElemStatLearn - ElemStatLearn: Data sets, functions and examples from the book "The Elements of Statistical Learning: Data Mining, Inference, and Prediction" by Trevor Hastie, Robert Tibshirani and Jerome Friedman.
- evtree - evtree: Evolutionary Learning of Globally Optimal Trees.
- forecast - forecast: Time series forecasting using ARIMA, ETS, STLM, TBATS, and neural network models.
- forecastHybrid - forecastHybrid: Automatic ensemble and cross validation of ARIMA, ETS, STLM, TBATS, and neural network models from the "forecast" package.
- fpc - fpc: Flexible procedures for clustering.
- frbs - frbs: Fuzzy Rule-based Systems for Classification and Regression Tasks. **[Deprecated]**
- GAMBoost - GAMBoost: Generalized linear and additive models by likelihood based boosting. **[Deprecated]**
- gamboostLSS - gamboostLSS: Boosting Methods for GAMLSS.
- gbm - gbm: Generalized Boosted Regression Models.
- glmnet - glmnet: Lasso and elastic-net regularized generalized linear models.
- glmpath - glmpath: L1 Regularization Path for Generalized Linear Models and Cox Proportional Hazards Model.
- GMMBoost - GMMBoost: Likelihood-based Boosting for Generalized mixed models. **[Deprecated]**
- grplasso - grplasso: Fitting user specified models with Group Lasso penalty.
- grpreg - grpreg: Regularization paths for regression models with grouped covariates.
- h2o - A framework for fast, parallel, and distributed machine learning algorithms at scale -- deep learning, random forests, GBM, k-means, PCA, GLM.
- hda - hda: Heteroscedastic Discriminant Analysis. **[Deprecated]**
- Introduction to Statistical Learning
- ipred - ipred: Improved Predictors.
- kernlab - kernlab: Kernel-based Machine Learning Lab.
- klaR - klaR: Classification and visualization.
- L0Learn - L0Learn: Fast algorithms for best subset selection.
- lars - lars: Least Angle Regression, Lasso and Forward Stagewise. **[Deprecated]**
- lasso2 - lasso2: L1 constrained estimation aka ‘lasso’.
- LiblineaR - LiblineaR: Linear Predictive Models Based On The Liblinear C/C++ Library.
- LogicReg - LogicReg: Logic Regression.
- Machine Learning For Hackers
- maptree - maptree: Mapping, pruning, and graphing tree models. **[Deprecated]**
- mboost - mboost: Model-Based Boosting.
- mlr - mlr: Machine Learning in R.
- ncvreg - ncvreg: Regularization paths for SCAD- and MCP-penalized regression models.
- nnet - nnet: Feed-forward Neural Networks and Multinomial Log-Linear Models. **[Deprecated]**
- pamr - pamr: Pam: prediction analysis for microarrays. **[Deprecated]**
- party - party: A Laboratory for Recursive Partitioning
- partykit - partykit: A Toolkit for Recursive Partitioning.
- penalized - penalized: L1 (lasso and fused lasso) and L2 (ridge) penalized estimation in GLMs and in the Cox model.
- penalizedLDA - penalizedLDA: Penalized classification using Fisher's linear discriminant. **[Deprecated]**
- penalizedSVM - penalizedSVM: Feature Selection SVM using penalty functions.
- quantregForest - quantregForest: Quantile Regression Forests.
- randomForest - randomForest: Breiman and Cutler's random forests for classification and regression.
- randomForestSRC - randomForestSRC: Random Forests for Survival, Regression and Classification (RF-SRC).
- rattle - rattle: Graphical user interface for data mining in R.
- rda - rda: Shrunken Centroids Regularized Discriminant Analysis.
- rdetools - rdetools: Relevant Dimension Estimation (RDE) in Feature Spaces. **[Deprecated]**
- REEMtree - REEMtree: Regression Trees with Random Effects for Longitudinal (Panel) Data. **[Deprecated]**
- relaxo - relaxo: Relaxed Lasso. **[Deprecated]**
- rgenoud - rgenoud: R version of GENetic Optimization Using Derivatives
- Rmalschains - Rmalschains: Continuous Optimization using Memetic Algorithms with Local Search Chains (MA-LS-Chains) in R.
- rminer - rminer: Simpler use of data mining methods (e.g. NN and SVM) in classification and regression. **[Deprecated]**
- ROCR - ROCR: Visualizing the performance of scoring classifiers. **[Deprecated]**
- RoughSets - RoughSets: Data Analysis Using Rough Set and Fuzzy Rough Set Theories. **[Deprecated]**
- rpart - rpart: Recursive Partitioning and Regression Trees.
- RPMM - RPMM: Recursively Partitioned Mixture Model.
- RSNNS - RSNNS: Neural Networks in R using the Stuttgart Neural Network Simulator (SNNS).
- RWeka - RWeka: R/Weka interface.
- RXshrink - RXshrink: Maximum Likelihood Shrinkage via Generalized Ridge or Least Angle Regression.
- sda - sda: Shrinkage Discriminant Analysis and CAT Score Variable Selection. **[Deprecated]**
- spectralGraphTopology - spectralGraphTopology: Learning Graphs from Data via Spectral Constraints.
- SuperLearner - Multi-algorithm ensemble learning packages.
- svmpath - svmpath: The SVM Path algorithm. **[Deprecated]**
- tgp - tgp: Bayesian treed Gaussian process models. **[Deprecated]**
- tree - tree: Classification and regression trees.
- varSelRF - varSelRF: Variable selection using random forests.
- XGBoost.R - R binding for eXtreme Gradient Boosting (Tree) Library.
- igraph - binding to igraph library - General purpose graph library.
- TDSP-Utilities - Two data science utilities in R from Microsoft: 1) Interactive Data Exploration, Analysis, and Reporting (IDEAR) ; 2) Automated Modelling and Reporting (AMR).
- dplyr - A data manipulation package that helps to solve the most common data manipulation problems.
- ggplot2 - A data visualization package based on the grammar of graphics.
- tmap
- tm
- shiny
-
-
Ruby
-
General-Purpose Machine Learning
- Awesome NLP with Ruby - Curated link list for practical natural language processing in Ruby.
- Treat - Text Retrieval and Annotation Toolkit, definitely the most comprehensive toolkit I’ve encountered so far for Ruby.
- Stemmer - Expose libstemmer_c to Ruby. **[Deprecated]**
- Raspell - raspell is an interface binding for Ruby. **[Deprecated]**
- UEA Stemmer - Ruby port of UEALite Stemmer - a conservative stemmer for search and indexing.
- Twitter-text-rb - A library that does auto linking and extraction of usernames, lists and hashtags in tweets.
- Awesome Machine Learning with Ruby - Curated list of ML related resources for Ruby.
- Ruby Machine Learning - Some Machine Learning algorithms, implemented in Ruby. **[Deprecated]**
- Machine Learning Ruby
- jRuby Mahout - JRuby Mahout is a gem that unleashes the power of Apache Mahout in the world of JRuby. **[Deprecated]**
- CardMagic-Classifier - A general classifier module to allow Bayesian and other types of classifications.
- rb-libsvm - Ruby language bindings for LIBSVM which is a Library for Support Vector Machines.
- Scoruby - Creates Random Forest classifiers from PMML files.
- rumale - Rumale is a machine learning library in Ruby
- rsruby - Ruby - R bridge.
- data-visualization-ruby - Source code and supporting content for my Ruby Manor presentation on Data Visualisation with Ruby. **[Deprecated]**
- ruby-plot - gnuplot wrapper for Ruby, especially for plotting ROC curves into SVG files. **[Deprecated]**
- plot-rb - A plotting library in Ruby built on top of Vega and D3. **[Deprecated]**
- scruffy - A beautiful graphing toolkit for Ruby.
- SciRuby
- Glean - A data management tool for humans. **[Deprecated]**
- Bioruby
- Arel
- Big Data For Chimps
- Listof - Community based data collection, packed in gem. Get list of pretty much anything (stop words, countries, non words) in txt, JSON or hash. [Demo/Search for a list](http://kevincobain2000.github.io/listof/)
-
-
Rust
-
General-Purpose Machine Learning
- smartcore - "The Most Advanced Machine Learning Library In Rust."
- linfa - a comprehensive toolkit to build Machine Learning applications with Rust
- deeplearn-rs - deeplearn-rs provides simple networks that use matrix multiplication, addition, and ReLU under the MIT license.
- rustlearn - a machine learning framework featuring logistic regression, support vector machines, decision trees and random forests.
- rusty-machine - a pure-Rust machine learning library.
- leaf - open source framework for machine intelligence, sharing concepts from TensorFlow and Caffe. Available under the MIT license. [**[Deprecated]**](https://medium.com/@mjhirn/tensorflow-wins-89b78b29aafb#.s0a3uy4cc)
- RustNN - RustNN is a feedforward neural network library. **[Deprecated]**
- RusticSOM - A Rust library for Self Organising Maps (SOM).
- candle - Candle is a minimalist ML framework for Rust with a focus on performance (including GPU support) and ease of use.
- tch-rs - Rust bindings for the C++ API of PyTorch
- dfdx - Deep learning in Rust, with shape checked tensors and neural networks
- burn - Burn is a new comprehensive dynamic Deep Learning Framework built using Rust with extreme flexibility, compute efficiency and portability as its primary goals
- huggingface/tokenizers - Fast State-of-the-Art Tokenizers optimized for Research and Production
- rust-bert - Rust native ready-to-use NLP pipelines and transformer-based models (BERT, DistilBERT, GPT2,...)
-
-
SAS
-
General-Purpose Machine Learning
- Visual Data Mining and Machine Learning - Interactive, automated, and programmatic modelling with the latest machine learning algorithms in an end-to-end analytics environment, from data prep to deployment. Free trial available.
- Enterprise Miner - Data mining and machine learning that creates deployable models using a GUI or code.
- Factory Miner - Automatically creates deployable machine learning models across numerous market or customer segments using a GUI.
- SAS/STAT - For conducting advanced statistical analysis.
- Text Miner - Text mining using a GUI or code.
- ML_Tables - Concise cheat sheets containing machine learning best practices.
- enlighten-apply - Example code and materials that illustrate applications of SAS machine learning techniques.
- enlighten-integration - Example code and materials that illustrate techniques for integrating SAS with other analytics technologies in Java, PMML, Python and R.
- enlighten-deep - Example code and materials that illustrate using neural networks with several hidden layers in SAS.
- University Edition - FREE! Includes all SAS packages necessary for data analysis and visualization, and includes online SAS courses.
-
-
Scala
-
General-Purpose Machine Learning
- ScalaNLP - ScalaNLP is a suite of machine learning and numerical computing libraries.
- Breeze - Breeze is a numerical processing library for Scala.
- Chalk - Chalk is a natural language processing library. **[Deprecated]**
- FACTORIE - FACTORIE is a toolkit for deployable probabilistic modelling, implemented as a software library in Scala. It provides its users with a succinct language for creating relational factor graphs, estimating parameters and performing inference.
- Montague - Montague is a semantic parsing library for Scala with an easy-to-use DSL.
- Spark NLP - Natural language processing library built on top of Apache Spark ML to provide simple, performant, and accurate NLP annotations for machine learning pipelines, that scale easily in a distributed environment.
-