awesome-machine-learning
A curated list of awesome Machine Learning frameworks, libraries and software.
https://github.com/eric-erki/awesome-machine-learning
-
Python
-
General-Purpose Machine Learning
- metric-learn - A Python module for metric learning.
- windML - A Python Framework for Wind Energy Analysis and Prediction.
- open-solution-home-credit - Source code and [experiment results](https://app.neptune.ml/neptune-ml/Home-Credit-Default-Risk) for [Home Credit Default Risk](https://www.kaggle.com/c/home-credit-default-risk).
- open-solution-googleai-object-detection - Source code and [experiment results](https://app.neptune.ml/neptune-ml/Google-AI-Object-Detection-Challenge) for [Google AI Open Images - Object Detection Track](https://www.kaggle.com/c/google-ai-open-images-object-detection-track).
- open-solution-ship-detection - Source code and [experiment results](https://app.neptune.ml/neptune-ml/Ships) for [Airbus Ship Detection Challenge](https://www.kaggle.com/c/airbus-ship-detection).
- open-solution-data-science-bowl-2018 - Source code and [experiment results](https://app.neptune.ml/neptune-ml/Data-Science-Bowl-2018) for [2018 Data Science Bowl](https://www.kaggle.com/c/data-science-bowl-2018).
- open-solution-value-prediction - Source code and [experiment results](https://app.neptune.ml/neptune-ml/Santander-Value-Prediction-Challenge) for [Santander Value Prediction Challenge](https://www.kaggle.com/c/santander-value-prediction-challenge).
- open-solution-toxic-comments - Source code for [Toxic Comment Classification Challenge](https://www.kaggle.com/c/jigsaw-toxic-comment-classification-challenge).
- TPOT - Tool that automatically creates and optimizes machine learning pipelines using genetic programming. Consider it your personal data science assistant, automating a tedious part of machine learning.
- KoNLPy - A Python package for Korean natural language processing.
-
-
R
-
General-Purpose Machine Learning
- ahaz - ahaz: Regularization for semiparametric additive hazards regression.
- arules - arules: Mining Association Rules and Frequent Itemsets
- biglasso - biglasso: Extending Lasso Model Fitting to Big Data in R.
- bigrf - bigrf: Big Random Forests: Classification and Regression Forests for Large Data Sets.
- bigRR - bigRR: Generalized Ridge Regression (with special advantage for p >> n cases).
- bmrm - bmrm: Bundle Methods for Regularized Risk Minimization Package.
- Boruta - Boruta: A wrapper algorithm for all-relevant feature selection.
- bst - bst: Gradient Boosting.
- C50 - C50: C5.0 Decision Trees and Rule-Based Models.
- caret - Classification and Regression Training: Unified interface to ~150 ML algorithms in R.
- caretEnsemble - caretEnsemble: Framework for fitting multiple caret models as well as creating ensembles of such models.
- CORElearn - CORElearn: Classification, regression, feature evaluation and ordinal evaluation.
- CoxBoost - CoxBoost: Cox models by likelihood based boosting for a single survival endpoint or competing risks
- Cubist - Cubist: Rule- and Instance-Based Regression Modeling.
- e1071 - e1071: Misc Functions of the Department of Statistics (e1071), TU Wien
- earth - earth: Multivariate Adaptive Regression Spline Models
- elasticnet - elasticnet: Elastic-Net for Sparse Estimation and Sparse PCA.
- ElemStatLearn - ElemStatLearn: Data sets, functions and examples from the book "The Elements of Statistical Learning: Data Mining, Inference, and Prediction" by Trevor Hastie, Robert Tibshirani and Jerome Friedman.
- evtree - evtree: Evolutionary Learning of Globally Optimal Trees.
- forecast - forecast: Time series forecasting using ARIMA, ETS, STLM, TBATS, and neural network models.
- forecastHybrid - forecastHybrid: Automatic ensemble and cross validation of ARIMA, ETS, STLM, TBATS, and neural network models from the "forecast" package.
- fpc - fpc: Flexible procedures for clustering.
- frbs - frbs: Fuzzy Rule-based Systems for Classification and Regression Tasks.
- GAMBoost - GAMBoost: Generalized linear and additive models by likelihood based boosting.
- gamboostLSS - gamboostLSS: Boosting Methods for GAMLSS.
- gbm - gbm: Generalized Boosted Regression Models.
- glmnet - glmnet: Lasso and elastic-net regularized generalized linear models.
- glmpath - glmpath: L1 Regularization Path for Generalized Linear Models and Cox Proportional Hazards Model.
- GMMBoost - GMMBoost: Likelihood-based Boosting for Generalized mixed models.
- grplasso - grplasso: Fitting user specified models with Group Lasso penalty.
- grpreg - grpreg: Regularization paths for regression models with grouped covariates.
- h2o - A framework for fast, parallel, and distributed machine learning algorithms at scale -- Deep Learning, Random Forests, GBM, KMeans, PCA, GLM.
- hda - hda: Heteroscedastic Discriminant Analysis.
- Introduction to Statistical Learning
- ipred - ipred: Improved Predictors.
- kernlab - kernlab: Kernel-based Machine Learning Lab.
- klaR - klaR: Classification and visualization.
- lars - lars: Least Angle Regression, Lasso and Forward Stagewise.
- lasso2 - lasso2: L1 constrained estimation aka ‘lasso’.
- LiblineaR - LiblineaR: Linear Predictive Models Based On The Liblinear C/C++ Library.
- LogicReg - LogicReg: Logic Regression.
- maptree - maptree: Mapping, pruning, and graphing tree models.
- mboost - mboost: Model-Based Boosting.
- mlr - mlr: Machine Learning in R.
- mvpart - mvpart: Multivariate partitioning.
- ncvreg - ncvreg: Regularization paths for SCAD- and MCP-penalized regression models.
- nnet - nnet: Feed-forward Neural Networks and Multinomial Log-Linear Models.
- oblique.tree - oblique.tree: Oblique Trees for Classification Data.
- pamr - pamr: Pam: prediction analysis for microarrays.
- party - party: A Laboratory for Recursive Partytioning.
- partykit - partykit: A Toolkit for Recursive Partytioning.
- penalized - penalized: L1 (lasso and fused lasso) and L2 (ridge) penalized estimation in GLMs and in the Cox model.
- penalizedLDA - penalizedLDA: Penalized classification using Fisher's linear discriminant.
- penalizedSVM - penalizedSVM: Feature Selection SVM using penalty functions.
- quantregForest - quantregForest: Quantile Regression Forests.
- randomForest - randomForest: Breiman and Cutler's random forests for classification and regression.
- randomForestSRC - randomForestSRC: Random Forests for Survival, Regression and Classification (RF-SRC).
- rattle - rattle: Graphical user interface for data mining in R.
- rda - rda: Shrunken Centroids Regularized Discriminant Analysis.
- rdetools - rdetools: Relevant Dimension Estimation (RDE) in Feature Spaces.
- REEMtree - REEMtree: Regression Trees with Random Effects for Longitudinal (Panel) Data.
- relaxo - relaxo: Relaxed Lasso.
- rgenoud - rgenoud: R version of GENetic Optimization Using Derivatives
- rgp - rgp: R genetic programming framework.
- Rmalschains - Rmalschains: Continuous Optimization using Memetic Algorithms with Local Search Chains (MA-LS-Chains) in R.
- rminer - rminer: Simpler use of data mining methods (e.g. NN and SVM) in classification and regression.
- ROCR - ROCR: Visualizing the performance of scoring classifiers.
- RoughSets - RoughSets: Data Analysis Using Rough Set and Fuzzy Rough Set Theories.
- rpart - rpart: Recursive Partitioning and Regression Trees.
- RPMM - RPMM: Recursively Partitioned Mixture Model.
- RSNNS - RSNNS: Neural Networks in R using the Stuttgart Neural Network Simulator (SNNS).
- RWeka - RWeka: R/Weka interface.
- RXshrink - RXshrink: Maximum Likelihood Shrinkage via Generalized Ridge or Least Angle Regression.
- sda - sda: Shrinkage Discriminant Analysis and CAT Score Variable Selection.
- SDDA - SDDA: Stepwise Diagonal Discriminant Analysis.
- svmpath - svmpath: The SVM Path algorithm.
- tgp - tgp: Bayesian treed Gaussian process models.
- tree - tree: Classification and regression trees.
- varSelRF - varSelRF: Variable selection using random forests.
- XGBoost.R - R binding for eXtreme Gradient Boosting (Tree) Library.
- ggplot2 - A data visualization package based on the grammar of graphics.
- Machine Learning For Hackers
- SuperLearner and subsemble - Multi-algorithm ensemble learning packages.
- TDSP-Utilities - Two data science utilities in R from Microsoft: 1) Interactive Data Exploration, Analysis, and Reporting (IDEAR) ; 2) Automated Modeling and Reporting (AMR).
- Clever Algorithms For Machine Learning
-
-
Ruby
-
General-Purpose Machine Learning
- Awesome NLP with Ruby - Curated link list for practical natural language processing in Ruby.
- Raspel - raspell is an Aspell interface binding for Ruby.
- Awesome Machine Learning with Ruby - Curated list of ML related resources for Ruby.
- ruby-plot - gnuplot wrapper for Ruby, especially for plotting ROC curves into SVG files.
- scruffy - A beautiful graphing toolkit for Ruby.
- SciRuby
- Treat - Text REtrieval and Annotation Toolkit, definitely the most comprehensive toolkit I’ve encountered so far for Ruby.
- Stemmer - Expose libstemmer_c to Ruby.
- UEA Stemmer - Ruby port of UEALite Stemmer - a conservative stemmer for search and indexing.
- Twitter-text-rb - A library that does auto linking and extraction of usernames, lists and hashtags in tweets.
- Ruby Machine Learning - Some Machine Learning algorithms, implemented in Ruby.
- Machine Learning Ruby
- jRuby Mahout - JRuby Mahout is a gem that unleashes the power of Apache Mahout in the world of JRuby.
- CardMagic-Classifier - A general classifier module to allow Bayesian and other types of classifications.
- rb-libsvm - Ruby language bindings for LIBSVM which is a Library for Support Vector Machines.
- rsruby - Ruby - R bridge.
- data-visualization-ruby - Source code and supporting content for my Ruby Manor presentation on Data Visualisation with Ruby.
- plot-rb - A plotting library in Ruby built on top of Vega and D3.
- Glean - A data management tool for humans.
- Bioruby
- Arel
- Big Data For Chimps
- Listof - Community based data collection, packed in gem. Get list of pretty much anything (stop words, countries, non words) in txt, json or hash. [Demo/Search for a list](http://kevincobain2000.github.io/listof/)
- Ruby Linguistics - Linguistics is a framework for building linguistic utilities for Ruby objects in any language. It includes a generic language-independent front end, a module for mapping language codes into language names, and a module which contains various English-language utilities.
- Random Forester - Creates Random Forest classifiers from PMML files.
- Ruby Wordnet - This library is a Ruby interface to WordNet.
-
-
Rust
-
General-Purpose Machine Learning
- deeplearn-rs - deeplearn-rs provides simple networks that use matrix multiplication, addition, and ReLU under the MIT license.
- rustlearn - a machine learning framework featuring logistic regression, support vector machines, decision trees and random forests.
- rusty-machine - a pure-rust machine learning library.
- leaf - open source framework for machine intelligence, sharing concepts from TensorFlow and Caffe. Available under the MIT license. [**[Deprecated]**](https://medium.com/@mjhirn/tensorflow-wins-89b78b29aafb#.s0a3uy4cc)
- RustNN - RustNN is a feedforward neural network library.
- RusticSOM - A Rust library for Self Organising Maps (SOM).
-
-
SAS
-
General-Purpose Machine Learning
- Enterprise Miner - Data mining and machine learning that creates deployable models using a GUI or code.
- High Performance Text Mining - Text mining using a GUI or code in an MPP environment, including Hadoop.
- ML_Tables - Concise cheat sheets containing machine learning best practices.
- Factory Miner - Automatically creates deployable machine learning models across numerous market or customer segments using a GUI.
- Text Miner - Text mining using a GUI or code.
- enlighten-apply - Example code and materials that illustrate applications of SAS machine learning techniques.
- enlighten-integration - Example code and materials that illustrate techniques for integrating SAS with other analytics technologies in Java, PMML, Python and R.
- enlighten-deep - Example code and materials that illustrate using neural networks with several hidden layers in SAS.
- Visual Data Mining and Machine Learning - Interactive, automated, and programmatic modeling with the latest machine learning algorithms in an end-to-end analytics environment, from data prep to deployment. Free trial available.
- SAS/STAT - For conducting advanced statistical analysis.
- University Edition - FREE! Includes all SAS packages necessary for data analysis and visualization, and includes online SAS courses.
- High Performance Data Mining - Data mining and machine learning that creates deployable models using a GUI or code in an MPP environment, including Hadoop.
- Contextual Analysis - Add structure to unstructured text using a GUI.
- Sentiment Analysis - Extract sentiment from text using a GUI.
-
-
Scala
-
General-Purpose Machine Learning
- ScalaNLP - ScalaNLP is a suite of machine learning and numerical computing libraries.
- DeepLearning.scala - Creating statically typed dynamic neural networks from object-oriented & functional programming constructs.
- Breeze - Breeze is a numerical processing library for Scala.
- Chalk - Chalk is a natural language processing library.
- FACTORIE - FACTORIE is a toolkit for deployable probabilistic modeling, implemented as a software library in Scala. It provides its users with a succinct language for creating relational factor graphs, estimating parameters and performing inference.
- Montague - Montague is a semantic parsing library for Scala with an easy-to-use DSL.
- Scalding - A Scala API for Cascading.
- Summing Bird - Streaming MapReduce with Scalding and Storm.
- Algebird - Abstract Algebra for Scala.
- xerial - Data management utilities for Scala.
- PredictionIO - PredictionIO, a machine learning server for software developers and data engineers.
- BIDMat - CPU and GPU-accelerated matrix library intended to support large-scale exploratory data analysis.
- Spark Notebook - Interactive and Reactive Data Science using Scala and Spark.
- Conjecture - Scalable Machine Learning in Scalding.
- brushfire - Distributed decision tree ensemble learning in Scala.
- ganitha - Scalding powered machine learning.
- adam - A genomics processing engine and specialized file format built using Apache Avro, Apache Spark and Parquet. Apache 2 licensed.
- bioscala - Bioinformatics for the Scala programming language
- BIDMach - CPU and GPU-accelerated Machine Learning Library.
- H2O Sparkling Water - H2O and Spark interoperability.
- doddle-model - An in-memory machine learning library built on top of Breeze. It provides immutable objects and exposes its functionality through a scikit-learn-like API.
- Figaro - a Scala library for constructing probabilistic models.
- DynaML - Scala Library/REPL for Machine Learning Research.
-
-
Swift
-
General-Purpose Machine Learning
- Bender - Fast Neural Networks framework built on top of Metal. Supports TensorFlow models.
- DeepLearningKit
- AIToolbox - A toolbox framework of AI modules written in Swift: Graphs/Trees, Linear Regression, Support Vector Machines, Neural Networks, PCA, KMeans, Genetic Algorithms, MDP, Mixture of Gaussians.
- Swift Brain - The first neural network / machine learning library written in Swift. This is a project for AI algorithms in Swift for iOS and OS X development. It includes algorithms focused on Bayes' theorem, neural networks, SVMs, matrices, etc.
- Awesome Core ML Models - A curated list of machine learning models in CoreML format.
- swix - A bare bones library that
- MLKit - A simple Machine Learning Framework written in Swift. Currently features Simple Linear Regression, Polynomial Regression, and Ridge Regression.
- Perfect TensorFlow - Swift Language Bindings of TensorFlow. Using native TensorFlow models on both macOS / Linux.
- PredictionBuilder - A library for machine learning that builds predictions using a linear regression.
- BrainCore - The iOS and OS X neural network framework.
-
-
TensorFlow
-
General-Purpose Machine Learning
- Awesome TensorFlow - A list of all things related to TensorFlow.
- Golden TensorFlow - A page of content on TensorFlow, including academic papers and links to related topics.
-