An open API service indexing awesome lists of open source software.

awesome-machine-learning

A curated list of awesome machine learning frameworks, libraries and software (by language)
https://github.com/abctechlabs/awesome-machine-learning

  • Python

    • General-Purpose Machine Learning

      • Roboschool - Open-source software for robot simulation, integrated with OpenAI Gym.
      • Retro - Retro Games in Gym
      • SLM Lab - Modular Deep Reinforcement Learning framework in PyTorch.
      • garage - A toolkit for reproducible reinforcement learning research
      • metaworld - An open source robotics benchmark for meta- and multi-task reinforcement learning
      • Maze - Application-oriented deep reinforcement learning framework addressing real-world decision problems.
      • RLlib - RLlib is an industry-level, highly scalable RL library for tf and torch, based on Ray. It's used by companies like Amazon and Microsoft to solve real-world decision-making problems at scale.
      • DI-engine - DI-engine is a generalized Decision Intelligence engine. It supports most basic deep reinforcement learning (DRL) algorithms, such as DQN, PPO, SAC, and domain-specific algorithms like QMIX in multi-agent RL, GAIL in inverse RL, and RND in exploration problems.
      • BLLIP Parser - Python bindings for the BLLIP Natural Language Parser (also known as the Charniak-Johnson parser). **[Deprecated]**
      • graphlab-create - A library with various machine learning models (regression, clustering, recommender systems, graph analytics, etc.) implemented on top of a disk-backed DataFrame.
      • Coach - Reinforcement Learning Coach by Intel® AI Lab enables easy experimentation with state of the art Reinforcement Learning algorithms
      • albumentations - A fast and framework-agnostic image augmentation library that implements a diverse set of augmentation techniques. Supports classification, segmentation, and detection out of the box. It was used to win a number of deep learning competitions on Kaggle and Topcoder, as well as competitions that were part of the CVPR workshops.
      • TextBlob - Providing a consistent API for diving into common natural language processing (NLP) tasks. Stands on the giant shoulders of NLTK and Pattern, and plays nicely with both.
      • Pebl - Python Environment for Bayesian Learning. **[Deprecated]**
      • pygal - A Python SVG Charts Creator.
      • Microsoft Recommenders
      • steppy-toolkit - Curated collection of the neural networks, transformers and models that make your machine learning work faster and more effective.
      • Theano - An optimizing, GPU-meta-programming, code-generating, array-oriented math compiler in Python.
      • TensorFlow - Open source software library for numerical computation using data flow graphs.
      • pycascading
      • Superset - A data exploration platform designed to be visual, intuitive, and interactive.
      • imutils - A library containing convenience functions to make basic image processing operations such as translation, rotation, resizing, skeletonization, and displaying Matplotlib images easier with OpenCV and Python.
      • DeepPavlov - conversational AI library with many pre-trained Russian NLP models.
      • NuPIC - Numenta Platform for Intelligent Computing.
      • PyTorch Lightning Bolts - Toolbox of models, callbacks, and datasets for AI/ML researchers.
      • JAX - JAX is Autograd and XLA, brought together for high-performance machine learning research.
      • PyGrid - Peer-to-peer network of data owners and data scientists who can collectively train AI models using PySyft
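      • FEDOT - An AutoML framework for the automated design of composite modelling pipelines. It can handle classification, regression, and time series forecasting tasks on different types of data (including multi-modal datasets).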
      • bqplot - An API for plotting in Jupyter (IPython).
      • Suiron - Machine Learning for RC Cars.
      • TResNet: High Performance GPU-Dedicated Architecture - TResNet models were designed and optimized to give the best speed-accuracy tradeoff out there on GPUs.
      • open-solution-home-credit - Source code and [experiment results](https://app.neptune.ml/neptune-ml/Home-Credit-Default-Risk) for [Home Credit Default Risk](https://www.kaggle.com/c/home-credit-default-risk).
      • open-solution-googleai-object-detection - Source code and [experiment results](https://app.neptune.ml/neptune-ml/Google-AI-Object-Detection-Challenge) for [Google AI Open Images - Object Detection Track](https://www.kaggle.com/c/google-ai-open-images-object-detection-track).
      • open-solution-ship-detection - Source code and [experiment results](https://app.neptune.ml/neptune-ml/Ships) for [Airbus Ship Detection Challenge](https://www.kaggle.com/c/airbus-ship-detection).
      • open-solution-data-science-bowl-2018 - Source code and [experiment results](https://app.neptune.ml/neptune-ml/Data-Science-Bowl-2018) for [2018 Data Science Bowl](https://www.kaggle.com/c/data-science-bowl-2018).
      • open-solution-value-prediction - Source code and [experiment results](https://app.neptune.ml/neptune-ml/Santander-Value-Prediction-Challenge) for [Santander Value Prediction Challenge](https://www.kaggle.com/c/santander-value-prediction-challenge).
      • open-solution-toxic-comments - Source code for [Toxic Comment Classification Challenge](https://www.kaggle.com/c/jigsaw-toxic-comment-classification-challenge).
      • timm - PyTorch image models, scripts, pretrained weights -- ResNet, ResNeXT, EfficientNet, EfficientNetV2, NFNet, Vision Transformer, MixNet, MobileNet-V3/V2, RegNet, DPN, CSPNet, and more.
      • segmentation_models.pytorch - A PyTorch-based toolkit that offers pre-trained segmentation models for computer vision tasks. It simplifies the development of image segmentation applications by providing a collection of popular architecture implementations, such as UNet and PSPNet, along with pre-trained weights, making it easier for researchers and developers to achieve high-quality pixel-level object segmentation in images.
      • Microsoft ML for Apache Spark - A distributed machine learning framework for Apache Spark.
      • sktime - A unified framework for machine learning with time series
      • Intel(R) Extension for Scikit-learn - A seamless way to speed up your Scikit-learn applications with no accuracy loss and no code changes.
      • Thampi - Machine Learning Prediction System on AWS Lambda
      • CometLLM - Track, log, visualize and evaluate your LLM prompts and prompt chains.
      • KoNLPy - A Python package for Korean natural language processing.
  • R

    • General-Purpose Machine Learning

      • Optunity - A library dedicated to automated hyperparameter optimization with a simple, lightweight API to facilitate drop-in replacement of grid search. Optunity is written in Python but interfaces seamlessly to R.
      • ahaz - ahaz: Regularization for semiparametric additive hazards regression. **[Deprecated]**
      • arules - arules: Mining Association Rules and Frequent Itemsets
      • biglasso - biglasso: Extending Lasso Model Fitting to Big Data in R.
      • bmrm - bmrm: Bundle Methods for Regularized Risk Minimization Package.
      • Boruta - Boruta: A wrapper algorithm for all-relevant feature selection.
      • bst - bst: Gradient Boosting.
      • C50 - C50: C5.0 Decision Trees and Rule-Based Models.
      • caret - Classification and Regression Training: Unified interface to ~150 ML algorithms in R.
      • caretEnsemble - caretEnsemble: Framework for fitting multiple caret models as well as creating ensembles of such models. **[Deprecated]**
      • Clever Algorithms For Machine Learning
      • CORElearn - CORElearn: Classification, regression, feature evaluation and ordinal evaluation.
      • CoxBoost - CoxBoost: Cox models by likelihood based boosting for a single survival endpoint or competing risks **[Deprecated]**
      • Cubist - Cubist: Rule- and Instance-Based Regression Modelling.
      • e1071 - e1071: Misc Functions of the Department of Statistics (e1071), TU Wien
      • earth - earth: Multivariate Adaptive Regression Spline Models
      • elasticnet - elasticnet: Elastic-Net for Sparse Estimation and Sparse PCA.
      • ElemStatLearn - ElemStatLearn: Data sets, functions and examples from the book: "The Elements of Statistical Learning, Data Mining, Inference, and Prediction" by Trevor Hastie, Robert Tibshirani and Jerome Friedman.
      • evtree - evtree: Evolutionary Learning of Globally Optimal Trees.
      • forecast - forecast: Time series forecasting using ARIMA, ETS, STLM, TBATS, and neural network models.
      • forecastHybrid - forecastHybrid: Automatic ensemble and cross validation of ARIMA, ETS, STLM, TBATS, and neural network models from the "forecast" package.
      • fpc - fpc: Flexible procedures for clustering.
      • frbs - frbs: Fuzzy Rule-based Systems for Classification and Regression Tasks. **[Deprecated]**
      • GAMBoost - GAMBoost: Generalized linear and additive models by likelihood based boosting. **[Deprecated]**
      • gamboostLSS - gamboostLSS: Boosting Methods for GAMLSS.
      • gbm - gbm: Generalized Boosted Regression Models.
      • glmnet - glmnet: Lasso and elastic-net regularized generalized linear models.
      • glmpath - glmpath: L1 Regularization Path for Generalized Linear Models and Cox Proportional Hazards Model.
      • GMMBoost - GMMBoost: Likelihood-based Boosting for Generalized mixed models. **[Deprecated]**
      • grplasso - grplasso: Fitting user specified models with Group Lasso penalty.
      • grpreg - grpreg: Regularization paths for regression models with grouped covariates.
      • h2o - A framework for fast, parallel, and distributed machine learning algorithms at scale -- Deeplearning, Random forests, GBM, KMeans, PCA, GLM.
      • hda - hda: Heteroscedastic Discriminant Analysis. **[Deprecated]**
      • Introduction to Statistical Learning
      • ipred - ipred: Improved Predictors.
      • kernlab - kernlab: Kernel-based Machine Learning Lab.
      • klaR - klaR: Classification and visualization.
      • L0Learn - L0Learn: Fast algorithms for best subset selection.
      • lars - lars: Least Angle Regression, Lasso and Forward Stagewise. **[Deprecated]**
      • lasso2 - lasso2: L1 constrained estimation aka ‘lasso’.
      • LiblineaR - LiblineaR: Linear Predictive Models Based On The Liblinear C/C++ Library.
      • LogicReg - LogicReg: Logic Regression.
      • maptree - maptree: Mapping, pruning, and graphing tree models. **[Deprecated]**
      • mboost - mboost: Model-Based Boosting.
      • mlr - mlr: Machine Learning in R.
      • ncvreg - ncvreg: Regularization paths for SCAD- and MCP-penalized regression models.
      • nnet - nnet: Feed-forward Neural Networks and Multinomial Log-Linear Models. **[Deprecated]**
      • pamr - pamr: Pam: prediction analysis for microarrays. **[Deprecated]**
      • party - party: A Laboratory for Recursive Partitioning
      • partykit - partykit: A Toolkit for Recursive Partitioning.
      • penalized - penalized: L1 (lasso and fused lasso) and L2 (ridge) penalized estimation in GLMs and in the Cox model.
      • penalizedLDA - penalizedLDA: Penalized classification using Fisher's linear discriminant. **[Deprecated]**
      • penalizedSVM - penalizedSVM: Feature Selection SVM using penalty functions.
      • quantregForest - quantregForest: Quantile Regression Forests.
      • randomForest - randomForest: Breiman and Cutler's random forests for classification and regression.
      • randomForestSRC - randomForestSRC: Random Forests for Survival, Regression and Classification (RF-SRC).
      • rattle - rattle: Graphical user interface for data mining in R.
      • rda - rda: Shrunken Centroids Regularized Discriminant Analysis.
      • rdetools - rdetools: Relevant Dimension Estimation (RDE) in Feature Spaces. **[Deprecated]**
      • REEMtree - REEMtree: Regression Trees with Random Effects for Longitudinal (Panel) Data. **[Deprecated]**
      • relaxo - relaxo: Relaxed Lasso. **[Deprecated]**
      • rgenoud - rgenoud: R version of GENetic Optimization Using Derivatives
      • Rmalschains - Rmalschains: Continuous Optimization using Memetic Algorithms with Local Search Chains (MA-LS-Chains) in R.
      • rminer - rminer: Simpler use of data mining methods (e.g. NN and SVM) in classification and regression. **[Deprecated]**
      • ROCR - ROCR: Visualizing the performance of scoring classifiers. **[Deprecated]**
      • RoughSets - RoughSets: Data Analysis Using Rough Set and Fuzzy Rough Set Theories. **[Deprecated]**
      • rpart - rpart: Recursive Partitioning and Regression Trees.
      • RPMM - RPMM: Recursively Partitioned Mixture Model.
      • RSNNS - RSNNS: Neural Networks in R using the Stuttgart Neural Network Simulator (SNNS).
      • RWeka - RWeka: R/Weka interface.
      • RXshrink - RXshrink: Maximum Likelihood Shrinkage via Generalized Ridge or Least Angle Regression.
      • sda - sda: Shrinkage Discriminant Analysis and CAT Score Variable Selection. **[Deprecated]**
      • spectralGraphTopology - spectralGraphTopology: Learning Graphs from Data via Spectral Constraints.
      • svmpath - svmpath: The SVM Path algorithm. **[Deprecated]**
      • tgp - tgp: Bayesian treed Gaussian process models. **[Deprecated]**
      • tree - tree: Classification and regression trees.
      • varSelRF - varSelRF: Variable selection using random forests.
      • XGBoost.R - R binding for eXtreme Gradient Boosting (Tree) Library.
      • igraph - Binding to the igraph library, a general-purpose graph library.
      • dplyr - A data manipulation package that helps to solve the most common data manipulation problems.
      • ggplot2 - A data visualization package based on the grammar of graphics.
      • tmap
      • tm
      • Machine Learning For Hackers
      • SuperLearner - Multi-algorithm ensemble learning packages.
      • TDSP-Utilities - Two data science utilities in R from Microsoft: 1) Interactive Data Exploration, Analysis, and Reporting (IDEAR); 2) Automated Modelling and Reporting (AMR).
      • shiny
  • Ruby

    • General-Purpose Machine Learning

      • Awesome NLP with Ruby - Curated link list for practical natural language processing in Ruby.
      • Raspell - raspell is an interface binding for Ruby. **[Deprecated]**
      • Twitter-text-rb - A library that does auto linking and extraction of usernames, lists and hashtags in tweets.
      • Awesome Machine Learning with Ruby - Curated list of ML related resources for Ruby.
      • ruby-plot - gnuplot wrapper for Ruby, especially for plotting ROC curves into SVG files. **[Deprecated]**
      • SciRuby
      • Treat - Text Retrieval and Annotation Toolkit, definitely the most comprehensive toolkit I’ve encountered so far for Ruby.
      • Stemmer - Expose libstemmer_c to Ruby. **[Deprecated]**
      • UEA Stemmer - Ruby port of UEALite Stemmer - a conservative stemmer for search and indexing.
      • Ruby Machine Learning - Some Machine Learning algorithms, implemented in Ruby. **[Deprecated]**
      • Machine Learning Ruby
      • jRuby Mahout - JRuby Mahout is a gem that unleashes the power of Apache Mahout in the world of JRuby. **[Deprecated]**
      • CardMagic-Classifier - A general classifier module to allow Bayesian and other types of classifications.
      • rb-libsvm - Ruby language bindings for LIBSVM which is a Library for Support Vector Machines.
      • Scoruby - Creates Random Forest classifiers from PMML files.
      • rumale - Rumale is a machine learning library in Ruby
      • rsruby - Ruby - R bridge.
      • data-visualization-ruby - Source code and supporting content for my Ruby Manor presentation on Data Visualisation with Ruby. **[Deprecated]**
      • plot-rb - A plotting library in Ruby built on top of Vega and D3. **[Deprecated]**
      • scruffy - A beautiful graphing toolkit for Ruby.
      • Glean - A data management tool for humans. **[Deprecated]**
      • Bioruby
      • Arel
      • Big Data For Chimps
      • Listof - Community based data collection, packed in gem. Get list of pretty much anything (stop words, countries, non words) in txt, JSON or hash. [Demo/Search for a list](http://kevincobain2000.github.io/listof/)
  • Rust

    • General-Purpose Machine Learning

      • smartcore - "The Most Advanced Machine Learning Library In Rust."
      • linfa - a comprehensive toolkit to build Machine Learning applications with Rust
      • deeplearn-rs - deeplearn-rs provides simple networks that use matrix multiplication, addition, and ReLU under the MIT license.
      • rustlearn - a machine learning framework featuring logistic regression, support vector machines, decision trees and random forests.
      • rusty-machine - a pure-rust machine learning library.
      • leaf - open source framework for machine intelligence, sharing concepts from TensorFlow and Caffe. Available under the MIT license. [**[Deprecated]**](https://medium.com/@mjhirn/tensorflow-wins-89b78b29aafb#.s0a3uy4cc)
      • RustNN - RustNN is a feedforward neural network library. **[Deprecated]**
      • RusticSOM - A Rust library for Self Organising Maps (SOM).
  • SAS

    • General-Purpose Machine Learning

      • Enterprise Miner - Data mining and machine learning that creates deployable models using a GUI or code.
      • SAS/STAT - For conducting advanced statistical analysis.
      • Text Miner - Text mining using a GUI or code.
      • ML_Tables - Concise cheat sheets containing machine learning best practices.
      • Factory Miner - Automatically creates deployable machine learning models across numerous market or customer segments using a GUI.
      • Visual Data Mining and Machine Learning - Interactive, automated, and programmatic modelling with the latest machine learning algorithms in an end-to-end analytics environment, from data prep to deployment. Free trial available.
      • enlighten-apply - Example code and materials that illustrate applications of SAS machine learning techniques.
      • enlighten-integration - Example code and materials that illustrate techniques for integrating SAS with other analytics technologies in Java, PMML, Python and R.
      • enlighten-deep - Example code and materials that illustrate using neural networks with several hidden layers in SAS.
      • University Edition - FREE! Includes all SAS packages necessary for data analysis and visualization, and includes online SAS courses.
  • Scala

    • General-Purpose Machine Learning

      • FlinkML in Apache Flink - Distributed machine learning library in Flink.
      • MLlib in Apache Spark - Distributed machine learning library in Spark
      • Smile - Statistical Machine Intelligence and Learning Engine.
      • Flink - Open source platform for distributed stream and batch data processing.
      • ScalaNLP - ScalaNLP is a suite of machine learning and numerical computing libraries.
      • DeepLearning.scala - Creating statically typed dynamic neural networks from object-oriented & functional programming constructs.
      • Breeze - Breeze is a numerical processing library for Scala.
      • Chalk - Chalk is a natural language processing library. **[Deprecated]**
      • FACTORIE - FACTORIE is a toolkit for deployable probabilistic modelling, implemented as a software library in Scala. It provides its users with a succinct language for creating relational factor graphs, estimating parameters and performing inference.
      • Montague - Montague is a semantic parsing library for Scala with an easy-to-use DSL.
      • Spark NLP - Natural language processing library built on top of Apache Spark ML to provide simple, performant, and accurate NLP annotations for machine learning pipelines, that scale easily in a distributed environment.
      • NDScala - N-dimensional arrays in Scala 3. Think NumPy ndarray, but with compile-time type-checking/inference over shapes, tensor/axis labels & numeric data types
      • Scalding - A Scala API for Cascading.
      • Summing Bird - Streaming MapReduce with Scalding and Storm.
      • Algebird - Abstract Algebra for Scala.
      • xerial - Data management utilities for Scala. **[Deprecated]**
      • PredictionIO - PredictionIO, a machine learning server for software developers and data engineers.
      • BIDMat - CPU and GPU-accelerated matrix library intended to support large-scale exploratory data analysis.
      • Spark Notebook - Interactive and Reactive Data Science using Scala and Spark.
      • ONNX-Scala - An ONNX (Open Neural Network eXchange) API and backend for typeful, functional deep learning in Scala (3).
      • Conjecture - Scalable Machine Learning in Scalding.