An open API service indexing awesome lists of open source software.

awesome-machine-learning

A curated list of awesome Machine Learning frameworks, libraries and software.
https://github.com/eric-erki/awesome-machine-learning

  • Python

    • General-Purpose Machine Learning

      • metric-learn - A Python module for metric learning.
      • windML - A Python Framework for Wind Energy Analysis and Prediction.
      • open-solution-home-credit - Source code and [experiment results](https://app.neptune.ml/neptune-ml/Home-Credit-Default-Risk) for [Home Credit Default Risk](https://www.kaggle.com/c/home-credit-default-risk).
      • open-solution-googleai-object-detection - Source code and [experiment results](https://app.neptune.ml/neptune-ml/Google-AI-Object-Detection-Challenge) for [Google AI Open Images - Object Detection Track](https://www.kaggle.com/c/google-ai-open-images-object-detection-track).
      • open-solution-ship-detection - Source code and [experiment results](https://app.neptune.ml/neptune-ml/Ships) for [Airbus Ship Detection Challenge](https://www.kaggle.com/c/airbus-ship-detection).
      • open-solution-data-science-bowl-2018 - Source code and [experiment results](https://app.neptune.ml/neptune-ml/Data-Science-Bowl-2018) for [2018 Data Science Bowl](https://www.kaggle.com/c/data-science-bowl-2018).
      • open-solution-value-prediction - Source code and [experiment results](https://app.neptune.ml/neptune-ml/Santander-Value-Prediction-Challenge) for [Santander Value Prediction Challenge](https://www.kaggle.com/c/santander-value-prediction-challenge).
      • open-solution-toxic-comments - Source code for [Toxic Comment Classification Challenge](https://www.kaggle.com/c/jigsaw-toxic-comment-classification-challenge).
      • TPOT - Tool that automatically creates and optimizes machine learning pipelines using genetic programming. Consider it your personal data science assistant, automating a tedious part of machine learning.
      • KoNLPy - A Python package for Korean natural language processing.
  • R

    • General-Purpose Machine Learning

      • ahaz - ahaz: Regularization for semiparametric additive hazards regression.
      • arules - arules: Mining Association Rules and Frequent Itemsets
      • biglasso - biglasso: Extending Lasso Model Fitting to Big Data in R.
      • bigrf - bigrf: Big Random Forests: Classification and Regression Forests for Large Data Sets.
      • bigRR - bigRR: Generalized Ridge Regression (with special advantage for p >> n cases).
      • bmrm - bmrm: Bundle Methods for Regularized Risk Minimization Package.
      • Boruta - Boruta: A wrapper algorithm for all-relevant feature selection.
      • bst - bst: Gradient Boosting.
      • C50 - C50: C5.0 Decision Trees and Rule-Based Models.
      • caret - Classification and Regression Training: Unified interface to ~150 ML algorithms in R.
      • caretEnsemble - caretEnsemble: Framework for fitting multiple caret models as well as creating ensembles of such models.
      • CORElearn - CORElearn: Classification, regression, feature evaluation and ordinal evaluation.
      • CoxBoost - CoxBoost: Cox models by likelihood based boosting for a single survival endpoint or competing risks
      • Cubist - Cubist: Rule- and Instance-Based Regression Modeling.
      • e1071 - e1071: Misc Functions of the Department of Statistics (e1071), TU Wien
      • earth - earth: Multivariate Adaptive Regression Spline Models
      • elasticnet - elasticnet: Elastic-Net for Sparse Estimation and Sparse PCA.
      • ElemStatLearn - ElemStatLearn: Data sets, functions and examples from the book: "The Elements of Statistical Learning, Data Mining, Inference, and Prediction" by Trevor Hastie, Robert Tibshirani and Jerome Friedman.
      • evtree - evtree: Evolutionary Learning of Globally Optimal Trees.
      • forecast - forecast: Time series forecasting using ARIMA, ETS, STLM, TBATS, and neural network models.
      • forecastHybrid - forecastHybrid: Automatic ensemble and cross validation of ARIMA, ETS, STLM, TBATS, and neural network models from the "forecast" package.
      • fpc - fpc: Flexible procedures for clustering.
      • frbs - frbs: Fuzzy Rule-based Systems for Classification and Regression Tasks.
      • GAMBoost - GAMBoost: Generalized linear and additive models by likelihood based boosting.
      • gamboostLSS - gamboostLSS: Boosting Methods for GAMLSS.
      • gbm - gbm: Generalized Boosted Regression Models.
      • glmnet - glmnet: Lasso and elastic-net regularized generalized linear models.
      • glmpath - glmpath: L1 Regularization Path for Generalized Linear Models and Cox Proportional Hazards Model.
      • GMMBoost - GMMBoost: Likelihood-based Boosting for Generalized mixed models.
      • grplasso - grplasso: Fitting user specified models with Group Lasso penalty.
      • grpreg - grpreg: Regularization paths for regression models with grouped covariates.
      • h2o - A framework for fast, parallel, and distributed machine learning algorithms at scale -- Deep Learning, Random Forests, GBM, KMeans, PCA, GLM.
      • hda - hda: Heteroscedastic Discriminant Analysis.
      • Introduction to Statistical Learning
      • ipred - ipred: Improved Predictors.
      • kernlab - kernlab: Kernel-based Machine Learning Lab.
      • klaR - klaR: Classification and visualization.
      • lars - lars: Least Angle Regression, Lasso and Forward Stagewise.
      • lasso2 - lasso2: L1 constrained estimation aka ‘lasso’.
      • LiblineaR - LiblineaR: Linear Predictive Models Based On The Liblinear C/C++ Library.
      • LogicReg - LogicReg: Logic Regression.
      • maptree - maptree: Mapping, pruning, and graphing tree models.
      • mboost - mboost: Model-Based Boosting.
      • mlr - mlr: Machine Learning in R.
      • mvpart - mvpart: Multivariate partitioning.
      • ncvreg - ncvreg: Regularization paths for SCAD- and MCP-penalized regression models.
      • nnet - nnet: Feed-forward Neural Networks and Multinomial Log-Linear Models.
      • oblique.tree - oblique.tree: Oblique Trees for Classification Data.
      • pamr - pamr: Pam: prediction analysis for microarrays.
      • party - party: A Laboratory for Recursive Partytioning.
      • partykit - partykit: A Toolkit for Recursive Partytioning.
      • penalized - penalized: L1 (lasso and fused lasso) and L2 (ridge) penalized estimation in GLMs and in the Cox model.
      • penalizedLDA - penalizedLDA: Penalized classification using Fisher's linear discriminant.
      • penalizedSVM - penalizedSVM: Feature Selection SVM using penalty functions.
      • quantregForest - quantregForest: Quantile Regression Forests.
      • randomForest - randomForest: Breiman and Cutler's random forests for classification and regression.
      • randomForestSRC - randomForestSRC: Random Forests for Survival, Regression and Classification (RF-SRC).
      • rattle - rattle: Graphical user interface for data mining in R.
      • rda - rda: Shrunken Centroids Regularized Discriminant Analysis.
      • rdetools - rdetools: Relevant Dimension Estimation (RDE) in Feature Spaces.
      • REEMtree - REEMtree: Regression Trees with Random Effects for Longitudinal (Panel) Data.
      • relaxo - relaxo: Relaxed Lasso.
      • rgenoud - rgenoud: R version of GENetic Optimization Using Derivatives
      • rgp - rgp: R genetic programming framework.
      • Rmalschains - Rmalschains: Continuous Optimization using Memetic Algorithms with Local Search Chains (MA-LS-Chains) in R.
      • rminer - rminer: Simpler use of data mining methods (e.g. NN and SVM) in classification and regression.
      • ROCR - ROCR: Visualizing the performance of scoring classifiers.
      • RoughSets - RoughSets: Data Analysis Using Rough Set and Fuzzy Rough Set Theories.
      • rpart - rpart: Recursive Partitioning and Regression Trees.
      • RPMM - RPMM: Recursively Partitioned Mixture Model.
      • RSNNS - RSNNS: Neural Networks in R using the Stuttgart Neural Network Simulator (SNNS).
      • RWeka - RWeka: R/Weka interface.
      • RXshrink - RXshrink: Maximum Likelihood Shrinkage via Generalized Ridge or Least Angle Regression.
      • sda - sda: Shrinkage Discriminant Analysis and CAT Score Variable Selection.
      • SDDA - SDDA: Stepwise Diagonal Discriminant Analysis.
      • svmpath - svmpath: the SVM Path algorithm.
      • tgp - tgp: Bayesian treed Gaussian process models.
      • tree - tree: Classification and regression trees.
      • varSelRF - varSelRF: Variable selection using random forests.
      • XGBoost.R - R binding for eXtreme Gradient Boosting (Tree) Library.
      • ggplot2 - A data visualization package based on the grammar of graphics.
      • Machine Learning For Hackers
      • SuperLearner and subsemble - Multi-algorithm ensemble learning packages.
      • TDSP-Utilities - Two data science utilities in R from Microsoft: 1) Interactive Data Exploration, Analysis, and Reporting (IDEAR) ; 2) Automated Modeling and Reporting (AMR).
      • Clever Algorithms For Machine Learning
  • Ruby

    • General-Purpose Machine Learning

      • Awesome NLP with Ruby - Curated link list for practical natural language processing in Ruby.
      • Raspel - raspell is an interface binding for Ruby.
      • Awesome Machine Learning with Ruby - Curated list of ML related resources for Ruby.
      • ruby-plot - gnuplot wrapper for Ruby, especially for plotting ROC curves into SVG files.
      • scruffy - A beautiful graphing toolkit for Ruby.
      • SciRuby
      • Treat - Text REtrieval and Annotation Toolkit, definitely the most comprehensive toolkit I’ve encountered so far for Ruby.
      • Stemmer - Expose libstemmer_c to Ruby.
      • UEA Stemmer - Ruby port of UEALite Stemmer - a conservative stemmer for search and indexing.
      • Twitter-text-rb - A library that does auto linking and extraction of usernames, lists and hashtags in tweets.
      • Ruby Machine Learning - Some Machine Learning algorithms, implemented in Ruby.
      • Machine Learning Ruby
      • jRuby Mahout - JRuby Mahout is a gem that unleashes the power of Apache Mahout in the world of JRuby.
      • CardMagic-Classifier - A general classifier module to allow Bayesian and other types of classifications.
      • rb-libsvm - Ruby language bindings for LIBSVM which is a Library for Support Vector Machines.
      • rsruby - Ruby - R bridge.
      • data-visualization-ruby - Source code and supporting content for my Ruby Manor presentation on Data Visualisation with Ruby.
      • plot-rb - A plotting library in Ruby built on top of Vega and D3.
      • Glean - A data management tool for humans.
      • Bioruby
      • Arel
      • Big Data For Chimps
      • Listof - Community based data collection, packed in gem. Get list of pretty much anything (stop words, countries, non words) in txt, json or hash. [Demo/Search for a list](http://kevincobain2000.github.io/listof/)
      • Ruby Linguistics - Linguistics is a framework for building linguistic utilities for Ruby objects in any language. It includes a generic language-independent front end, a module for mapping language codes into language names, and a module which contains various English-language utilities.
      • Random Forester - Creates Random Forest classifiers from PMML files.
      • Ruby Wordnet - This library is a Ruby interface to WordNet.
  • Rust

    • General-Purpose Machine Learning

      • deeplearn-rs - deeplearn-rs provides simple networks that use matrix multiplication, addition, and ReLU under the MIT license.
      • rustlearn - a machine learning framework featuring logistic regression, support vector machines, decision trees and random forests.
      • rusty-machine - a pure-rust machine learning library.
      • leaf - open source framework for machine intelligence, sharing concepts from TensorFlow and Caffe. Available under the MIT license. [**[Deprecated]**](https://medium.com/@mjhirn/tensorflow-wins-89b78b29aafb#.s0a3uy4cc)
      • RustNN - RustNN is a feedforward neural network library.
      • RusticSOM - A Rust library for Self Organising Maps (SOM).
  • SAS

    • General-Purpose Machine Learning

      • Enterprise Miner - Data mining and machine learning that creates deployable models using a GUI or code.
      • High Performance Text Mining - Text mining using a GUI or code in an MPP environment, including Hadoop.
      • ML_Tables - Concise cheat sheets containing machine learning best practices.
      • Factory Miner - Automatically creates deployable machine learning models across numerous market or customer segments using a GUI.
      • Text Miner - Text mining using a GUI or code.
      • enlighten-apply - Example code and materials that illustrate applications of SAS machine learning techniques.
      • enlighten-integration - Example code and materials that illustrate techniques for integrating SAS with other analytics technologies in Java, PMML, Python and R.
      • enlighten-deep - Example code and materials that illustrate using neural networks with several hidden layers in SAS.
      • Visual Data Mining and Machine Learning - Interactive, automated, and programmatic modeling with the latest machine learning algorithms in an end-to-end analytics environment, from data prep to deployment. Free trial available.
      • SAS/STAT - For conducting advanced statistical analysis.
      • University Edition - FREE! Includes all SAS packages necessary for data analysis and visualization, and includes online SAS courses.
      • High Performance Data Mining - Data mining and machine learning that creates deployable models using a GUI or code in an MPP environment, including Hadoop.
      • Contextual Analysis - Add structure to unstructured text using a GUI.
      • Sentiment Analysis - Extract sentiment from text using a GUI.
  • Scala

    • General-Purpose Machine Learning

      • ScalaNLP - ScalaNLP is a suite of machine learning and numerical computing libraries.
      • DeepLearning.scala - Creating statically typed dynamic neural networks from object-oriented & functional programming constructs.
      • Breeze - Breeze is a numerical processing library for Scala.
      • Chalk - Chalk is a natural language processing library.
      • FACTORIE - FACTORIE is a toolkit for deployable probabilistic modeling, implemented as a software library in Scala. It provides its users with a succinct language for creating relational factor graphs, estimating parameters and performing inference.
      • Montague - Montague is a semantic parsing library for Scala with an easy-to-use DSL.
      • Scalding - A Scala API for Cascading.
      • Summing Bird - Streaming MapReduce with Scalding and Storm.
      • Algebird - Abstract Algebra for Scala.
      • xerial - Data management utilities for Scala.
      • PredictionIO - PredictionIO, a machine learning server for software developers and data engineers.
      • BIDMat - CPU and GPU-accelerated matrix library intended to support large-scale exploratory data analysis.
      • Spark Notebook - Interactive and Reactive Data Science using Scala and Spark.
      • Conjecture - Scalable Machine Learning in Scalding.
      • brushfire - Distributed decision tree ensemble learning in Scala.
      • ganitha - Scalding powered machine learning.
      • adam - A genomics processing engine and specialized file format built using Apache Avro, Apache Spark and Parquet. Apache 2 licensed.
      • bioscala - Bioinformatics for the Scala programming language
      • BIDMach - CPU and GPU-accelerated Machine Learning Library.
      • H2O Sparkling Water - H2O and Spark interoperability.
      • doddle-model - An in-memory machine learning library built on top of Breeze. It provides immutable objects and exposes its functionality through a scikit-learn-like API.
      • Figaro - a Scala library for constructing probabilistic models.
      • DynaML - Scala Library/REPL for Machine Learning Research.
  • Swift

    • General-Purpose Machine Learning

      • Bender - Fast Neural Networks framework built on top of Metal. Supports TensorFlow models.
      • DeepLearningKit
      • AIToolbox - A toolbox framework of AI modules written in Swift: Graphs/Trees, Linear Regression, Support Vector Machines, Neural Networks, PCA, KMeans, Genetic Algorithms, MDP, Mixture of Gaussians.
      • Swift Brain - The first neural network / machine learning library written in Swift. This is a project for AI algorithms in Swift for iOS and OS X development. This project includes algorithms focused on Bayes theorem, neural networks, SVMs, Matrices, etc...
      • Awesome Core ML Models - A curated list of machine learning models in CoreML format.
      • swix - A bare bones library that
      • MLKit - A simple Machine Learning Framework written in Swift. Currently features Simple Linear Regression, Polynomial Regression, and Ridge Regression.
      • Perfect TensorFlow - Swift Language Bindings of TensorFlow. Using native TensorFlow models on both macOS / Linux.
      • PredictionBuilder - A library for machine learning that builds predictions using a linear regression.
      • BrainCore - The iOS and OS X neural network framework.
  • TensorFlow

    • General-Purpose Machine Learning