
awesome-machine-learning

A curated list of awesome machine learning frameworks, libraries and software (by language)
https://github.com/abctechlabs/awesome-machine-learning

  • Scala

    • General-Purpose Machine Learning

      • DeepLearning.scala - Creating statically typed dynamic neural networks from object-oriented & functional programming constructs.
      • ScalaNLP - ScalaNLP is a suite of machine learning and numerical computing libraries.
      • Breeze - Breeze is a numerical processing library for Scala.
      • Chalk - Chalk is a natural language processing library. **[Deprecated]**
      • FACTORIE - FACTORIE is a toolkit for deployable probabilistic modelling, implemented as a software library in Scala. It provides its users with a succinct language for creating relational factor graphs, estimating parameters and performing inference.
      • Montague - Montague is a semantic parsing library for Scala with an easy-to-use DSL.
      • Spark NLP - Natural language processing library built on top of Apache Spark ML to provide simple, performant, and accurate NLP annotations for machine learning pipelines, that scale easily in a distributed environment.
      • NDScala - N-dimensional arrays in Scala 3. Think NumPy ndarray, but with compile-time type-checking/inference over shapes, tensor/axis labels & numeric data types
      • Scalding - A Scala API for Cascading.
      • Summing Bird - Streaming MapReduce with Scalding and Storm.
      • Algebird - Abstract Algebra for Scala.
      • xerial - Data management utilities for Scala. **[Deprecated]**
      • PredictionIO - PredictionIO, a machine learning server for software developers and data engineers.
      • BIDMat - CPU and GPU-accelerated matrix library intended to support large-scale exploratory data analysis.
      • Spark Notebook - Interactive and Reactive Data Science using Scala and Spark.
      • ONNX-Scala - An ONNX (Open Neural Network eXchange) API and backend for typeful, functional deep learning in Scala (3).
      • Conjecture - Scalable Machine Learning in Scalding.
      • brushfire - Distributed decision tree ensemble learning in Scala.
      • ganitha - Scalding powered machine learning. **[Deprecated]**
      • adam - A genomics processing engine and specialized file format built using Apache Avro, Apache Spark and Parquet. Apache 2 licensed.
      • bioscala - Bioinformatics for the Scala programming language
      • BIDMach - CPU and GPU-accelerated Machine Learning Library.
      • H2O Sparkling Water - H2O and Spark interoperability.
      • Saul - Flexible Declarative Learning-Based Programming.
      • doddle-model - An in-memory machine learning library built on top of Breeze. It provides immutable objects and exposes its functionality through a scikit-learn-like API.
      • TensorFlow Scala - Strongly-typed Scala API for TensorFlow.
      • Figaro - a Scala library for constructing probabilistic models.
      • DynaML - Scala Library/REPL for Machine Learning Research.
      • SwiftLearner - Simply written algorithms to help study ML or write your own implementations.
  • Scheme

    • General-Purpose Machine Learning

      • layer - Neural network inference from the command line, implemented in [CHICKEN Scheme](https://www.call-cc.org/).
  • Swift

    • General-Purpose Machine Learning

      • Bender - Fast Neural Networks framework built on top of Metal. Supports TensorFlow models.
      • Swift AI - Highly optimized artificial intelligence and machine learning library written in Swift.
      • AIToolbox - A toolbox framework of AI modules written in Swift: Graphs/Trees, Linear Regression, Support Vector Machines, Neural Networks, PCA, KMeans, Genetic Algorithms, MDP, Mixture of Gaussians.
      • Swift Brain - The first neural network / machine learning library written in Swift. This is a project for AI algorithms in Swift for iOS and OS X development. It includes algorithms focused on Bayes' theorem, neural networks, SVMs, matrices, etc.
      • Awesome Core ML Models - A curated list of machine learning models in CoreML format.
      • Swift for TensorFlow - A next-generation platform for machine learning, incorporating the latest research across machine learning, compilers, differentiable programming, systems design, and beyond.
      • BrainCore - The iOS and OS X neural network framework.
      • swix - A bare bones library that includes a general matrix language and wraps some OpenCV for iOS development. **[Deprecated]**
      • MLKit - A simple Machine Learning Framework written in Swift. Currently features Simple Linear Regression, Polynomial Regression, and Ridge Regression.
      • Perfect TensorFlow - Swift language bindings for TensorFlow. Use native TensorFlow models on both macOS and Linux.
      • PredictionBuilder - A library for machine learning that builds predictions using a linear regression.
      • Awesome CoreML - A curated list of pretrained CoreML models.
  • TensorFlow

    • General-Purpose Machine Learning

      • Awesome TensorFlow - A list of all things related to TensorFlow.
      • Awesome Keras - A curated list of awesome Keras projects, libraries and resources.
      • Golden TensorFlow - A page of content on TensorFlow, including academic papers and links to related topics.
  • Tools

    • General-Purpose Machine Learning

      • Weaviate - Vector search engine and vector database. Weaviate uses machine learning to vectorize and store data, and to find answers to natural language queries. With Weaviate you can also bring your custom ML models to production scale.
      • MLReef - MLReef is an end-to-end development platform using the power of git to give structure and deep collaboration possibilities to the ML development process.
      • Pinecone - Vector database for applications that require real-time, scalable vector embedding and similarity search.
      • guild.ai - Tool to log, analyze, compare and "optimize" experiments. It's cross-platform and framework independent, and provides integrated visualizers such as TensorBoard.
      • MLFlow - Platform to manage the ML lifecycle, including experimentation, reproducibility and deployment. Framework and language agnostic, with a range of built-in integrations.
      • MachineLearningWithTensorFlow2ed - A book on general-purpose machine learning techniques, including regression, classification, unsupervised clustering, reinforcement learning, autoencoders, convolutional neural networks, RNNs, and LSTMs, using TensorFlow 1.14.1.
      • Pythonizr - An online tool to generate boilerplate machine learning code that uses scikit-learn.
      • Flyte - Flyte makes it easy to create concurrent, scalable, and maintainable workflows for machine learning and data processing.
      • Synthical - AI-powered collaborative research environment. You can use it to get recommendations of articles based on reading history, simplify papers, find out what articles are trending, search articles by meaning (not just keywords), create and share folders of articles, see lists of articles from specific companies and universities, and add highlights.
      • txtai - Build semantic search applications and workflows.
      • ML Workspace - All-in-one web-based IDE for machine learning and data science. The workspace is deployed as a docker container and is preloaded with a variety of popular data science libraries (e.g., Tensorflow, PyTorch) and dev tools (e.g., Jupyter, VS Code).
      • Notebooks - A starter kit for Jupyter notebooks and machine learning. Companion docker images consist of all combinations of python versions, machine learning frameworks (Keras, PyTorch and Tensorflow) and CPU/CUDA versions.
      • DVClive - Python library for experiment metrics logging into simply formatted local files.
      • Kedro - Kedro is a data and development workflow framework that implements best practices for data pipelines with an eye towards productionizing machine learning models.
      • Sacred - Python tool to help you configure, organize, log and reproduce experiments. Like a lab notebook in chemistry or biology. The community has built multiple add-ons leveraging the proposed standard.
      • m2cgen - A tool that allows the conversion of ML models into native code (Java, C, Python, Go, JavaScript, Visual Basic, C#, R, PowerShell, PHP, Dart) with zero dependencies.
      • CML - A library for doing continuous integration with ML projects. Use GitHub Actions & GitLab CI to train and evaluate models in production-like environments and automatically generate visual reports with metrics and graphs in pull/merge requests. Framework & language agnostic.
      • MLEM - Version and deploy your ML models following GitOps principles
      • DockerDL - Ready-to-use deep learning Docker images.
      • Ambrosia - Ambrosia helps you clean up your LLM datasets using _other_ LLMs.
      • milvus - Vector database for production AI, written in Go and C++, scalable and blazing fast for billions of embedding vectors.
      • DVC - Data Science Version Control is an open-source version control system for machine learning projects with pipelines support. It makes ML projects reproducible and shareable.
      • VDP - open source visual data ETL to streamline the end-to-end visual data processing pipeline: extract unstructured visual data from pre-built data sources, transform it into analysable structured insights by Vision AI models imported from various ML platforms, and load the insights into warehouses or applications.
      • Chaos Genius - ML powered analytics engine for outlier/anomaly detection and root cause analysis.
      • Qdrant
      • CatalyzeX - Browser extension ([Chrome](https://chrome.google.com/webstore/detail/code-finder-for-research/aikkeehnlfpamidigaffhfmgbkdeheil) and [Firefox](https://addons.mozilla.org/en-US/firefox/addon/code-finder-catalyzex/)) that automatically finds and shows code implementations for machine learning papers anywhere: Google, Twitter, Arxiv, Scholar, etc.
      • Weights & Biases - Machine learning experiment tracking, dataset versioning, hyperparameter search, visualization, and collaboration
      • Aqueduct - Aqueduct enables you to easily define, run, and manage AI & ML tasks on any cloud infrastructure.