awesome-production-machine-learning

A curated list of awesome open source libraries to deploy, monitor, version and scale your machine learning
https://github.com/EthicalML/awesome-production-machine-learning

Agentic Framework
- AutoGen - AutoGen is an open-source framework for building AI agent systems.
- CrewAI - CrewAI is a cutting-edge framework for orchestrating role-playing, autonomous AI agents.
- Agents - Agents allows users to build AI-driven server programs that can see, hear, and speak in realtime.
- AgentScope - AgentScope is a multi-agent platform designed to empower developers to build multi-agent applications with large-scale models.
- Chidori - Chidori is a reactive runtime that supports building robust AI agents using languages like Node.js, Python, and Rust, with a focus on reactivity and observability in agent workflows.
- Modelscope-Agent - Modelscope-Agent is a customizable and scalable agent framework.
- OpenAGI - OpenAGI is used as the agent creation package to build agents for AIOS.
- Swarm - Swarm is an educational framework exploring ergonomic, lightweight multi-agent orchestration.
- TensorZero - TensorZero is an open-source framework for building production-grade LLM applications. It unifies an LLM gateway, observability, optimization, evaluations, and experimentation.
- Eko - Eko is a production-ready JavaScript framework that enables developers to create reliable agents, from simple commands to complex workflows.
- Composio - Composio equips your AI agents & LLMs with 100+ high-quality integrations via function calling.
- LangGraph - LangGraph is a library for building stateful, multi-actor applications with LLMs, used to create agent and multi-agent workflows.
- Swarms - Swarms is an enterprise grade and production ready multi-agent collaboration framework that enables you to orchestrate many agents to work collaboratively at scale to automate real-world activities.
- AgentOps - AgentOps helps developers build, evaluate, and monitor AI agents from prototype to production.
- AIOpsLab - AIOpsLab is a holistic framework to enable the design, development, and evaluation of autonomous AIOps agents.
- PydanticAI - PydanticAI is a Python agent framework designed to make it less painful to build production grade applications with Generative AI.
- IntellAgent - IntellAgent is an advanced multi-agent framework that transforms the evaluation and optimization of conversational agents.
- AgentStack - AgentStack scaffolds your agent stack.
Data Storage Optimisation
- Casibase - Casibase is a LangChain-like RAG (Retrieval-Augmented Generation) knowledge database with web UI and Enterprise SSO.
- TimescaleDB - An open-source time-series SQL database optimized for fast ingest and complex queries, packaged as a PostgreSQL extension - [(Video)](https://www.youtube.com/watch?v=zbjub8BQPyE).
- AIStore - AIStore is a lightweight object storage system with the capability to linearly scale out with each added storage node and a special focus on petascale deep learning.
- Alluxio - A virtual distributed storage system that bridges the gap between computation frameworks and storage systems.
- Apache Arrow - In-memory columnar representation of data compatible with Pandas, Hadoop-based systems, etc.
- Apache Druid - A high performance real-time analytics database. Check this [article](https://towardsdatascience.com/introduction-to-druid-4bf285b92b5a) for an introduction.
- Apache Ignite - A memory-centric distributed database, caching, and processing platform for transactional, analytical, and streaming workloads delivering in-memory speeds at petabyte scale - [Demo](https://www.youtube.com/watch?v=Xt4PWQ__YPw).
- Apache Pinot - A realtime distributed OLAP datastore. A comparison of the open source OLAP systems for big data (ClickHouse, Druid, and Pinot) can be found [here](https://medium.com/@leventov/comparison-of-the-open-source-olap-systems-for-big-data-clickhouse-druid-and-pinot-8e042a5ed1c7).
- ClickHouse - ClickHouse is an open source column oriented database management system.
- Delta Lake - Delta Lake is a storage layer that brings scalable, ACID transactions to Apache Spark and other big-data engines.
- EdgeDB - NoSQL interface for Postgres that allows object-style interaction with stored data.
- GPTCache - GPTCache is a library for creating semantic cache for large language model queries.
- HopsFS - HDFS-compatible file system with scale-out strongly consistent metadata.
- InfluxDB - Time-series database for real-time analytics.
- Milvus - Milvus is a cloud-native, open-source vector database built to manage embedding vectors generated by machine learning models and neural networks.
- Marqo - Marqo is an end-to-end vector search engine.
- pgvector
- PostgresML
- Safetensors
- Weaviate - A low-latency vector search engine (GraphQL, RESTful) with out-of-the-box support for different media types. Modules include Semantic Search, Q&A, Classification, Customizable Models (PyTorch/TensorFlow/Keras), and more.
- Zarr - Python implementation of chunked, compressed, N-dimensional arrays designed for use in parallel computing.
- Apache Hudi - Hudi is a transactional data lake platform that brings core warehouse and database functionality directly to a data lake. Hudi is great for streaming workloads, and also allows creation of efficient incremental batch pipelines. Supports popular query engines including Spark, Flink, Presto, Trino, Hive, etc. More info [here](https://hudi.apache.org/).
- Apache Iceberg - Iceberg is an ACID-compliant, high-performance format built for huge analytic tables (containing tens of petabytes of data), and it brings the reliability and simplicity of SQL tables to big data, while making it possible for engines like Spark, Trino, Flink, Presto, Hive and Impala to safely work with the same tables, at the same time. More info [here](https://iceberg.apache.org/).
- Apache Parquet - On-disk columnar representation of data compatible with Pandas, Hadoop-based systems, etc.
- Chroma - Chroma is an open-source embedding database.
- BayesDB - A Bayesian database table for querying the probable implications of data as easily as SQL databases query the data itself - [(Video)](https://www.youtube.com/watch?v=2ws84s6iD1o).
- EdgeDB - Gel supercharges Postgres with a modern data model, graph queries, Auth & AI solutions, and much more.
Model Training Orchestration
- PyCaret - low-code library for training and deploying models (scikit-learn, XGBoost, LightGBM, spaCy)
Adversarial Robustness
- Robust ML - A robustness resource maintained by some of the leading names in adversarial ML. It focuses specifically on defenses, in particular those with published code available alongside the papers. Practical and useful.
- Foolbox - Foolbox is a Python toolbox to create adversarial examples that fool neural networks in PyTorch, TensorFlow, and JAX.
- AdvBox - A toolbox to generate adversarial examples that fool neural networks in PaddlePaddle, PyTorch, Caffe2, MxNet, Keras, and TensorFlow. AdvBox can also benchmark the robustness of machine learning models.
- AdverTorch - library for adversarial attacks / defenses specifically for PyTorch.
- Artificial Adversary - Airbnb's library to generate text that reads the same to a human but passes adversarial classifiers.
- Counterfit - Counterfit is a command-line tool and generic automation layer for assessing the security of machine learning systems.
- Adversarial DNN Playground - Think [TensorFlow Playground](https://playground.tensorflow.org), but for adversarial examples! A visualization tool designed for learning and teaching. The attack library is limited in size, but it has a nice front end with buttons you can press!
- MIA - A library for running membership inference attacks (MIA) against machine learning models.
- OpenAttack - OpenAttack is a Python-based textual adversarial attack toolkit, which handles the whole process of textual adversarial attacking, including preprocessing text, accessing the victim model, generating adversarial examples and evaluation.
- TextFool - Plausible-looking adversarial examples for text generation.
- Trickster - Library and experiments for attacking machine learning in discrete domains using graph search.
- Factool - Factool is a tool-augmented framework for detecting factual errors in texts generated by large language models.
- Nicholas Carlini’s Adversarial ML reading list - Not a library, but a curated list of the most important adversarial papers by one of the leading minds in adversarial ML, Nicholas Carlini. If you want to discover the 10 papers that matter the most, start here.
AutoML
- Colombus - A scalable framework to perform exploratory feature selection implemented in R.
- keras-tuner - Keras Tuner is an easy-to-use, distributable hyperparameter optimisation framework that solves the pain points of performing a hyperparameter search. Keras Tuner makes it easy to define a search space and leverage included algorithms to find the best hyperparameter values.
- AutoML-GS - Automatic feature and model search with code generation in Python, on top of common data science libraries (tensorflow, sklearn, etc.).
- auto-sklearn - Framework to automate algorithm and hyperparameter tuning for sklearn.
- ENAS via Parameter Sharing - Efficient Neural Architecture Search via Parameter Sharing by [authors of paper](https://arxiv.org/abs/1802.03268).
- ENAS-PyTorch - Efficient Neural Architecture Search (ENAS) in PyTorch based [on this paper](https://arxiv.org/abs/1802.03268).
- ENAS-Tensorflow - Efficient Neural Architecture Search via Parameter Sharing (ENAS) micro search TensorFlow code for Windows users.
- Feature Engine - Feature-engine is a Python library that contains several transformers to engineer features for use in machine learning models.
- Featuretools - An open source framework for automated feature engineering.
- FLAML - FLAML is a fast library for automated machine learning & tuning.
- go-featureprocessing - A feature pre-processing framework in Go that matches functionality of sklearn.
- HEBO - Set of open-source hyperparameter optimization frameworks, including the winning submission to the [NeurIPS 2020 Black-Box Optimisation Challenge](https://bbochallenge.com/leaderboard) tested on hyperparameter tuning tasks.
- Katib - A Kubernetes-based system for Hyperparameter Tuning and Neural Architecture Search.
- Maggy - Asynchronous, directed Hyperparameter search and parallel ablation studies on Apache Spark - [(Video)](https://www.youtube.com/watch?v=0Hd1iYEL03w).
- Neural Architecture Search with Controller RNN - Basic implementation of Controller RNN from [Neural Architecture Search with Reinforcement Learning](https://arxiv.org/abs/1611.01578) and [Learning Transferable Architectures for Scalable Image Recognition](https://arxiv.org/abs/1707.07012).
- Neural Network Intelligence - NNI (Neural Network Intelligence) is a toolkit to help users run automated machine learning (AutoML) experiments.
- Optuna - Optuna is an automatic hyperparameter optimisation software framework, particularly designed for machine learning.
- OSS Vizier - OSS Vizier is a Python-based service for black-box optimisation and research, one of the first hyperparameter tuning services designed to work at scale.
- sklearn-deap - Use evolutionary algorithms instead of grid search in scikit-learn.
- TPOT - Automation of sklearn pipeline creation (including feature selection, pre-processor, etc.).
- tsfresh - Automatic extraction of relevant features from time series.
- Upgini - Free automated data & feature enrichment library for machine learning: automatically searches through thousands of ready-to-use features from public and community shared data sources and enriches your training dataset with only the accuracy improving features.
- AutoGluon - Automated feature, model, and hyperparameter selection for tabular, image, and text data on top of popular machine learning libraries (Scikit-Learn, LightGBM, CatBoost, PyTorch, MXNet).
- Autokeras - AutoML library for Keras based on ["Auto-Keras: Efficient Neural Architecture Search with Network Morphism"](https://arxiv.org/abs/1806.10282).
- Ax - Ax is an accessible, general-purpose platform for understanding, managing, deploying, and automating adaptive experiments.
- BoTorch - BoTorch is a library for Bayesian Optimization built on PyTorch.
- EvalML - EvalML is an AutoML library which builds, optimizes, and evaluates machine learning pipelines using domain-specific objective functions.
Computation Load Distribution
- Apache Spark MLlib - Apache Spark's scalable machine learning library in Java, Scala, Python and R.
- Bagua - Bagua is a performant and flexible distributed training framework for PyTorch, providing a faster alternative to PyTorch DDP and Horovod. It supports advanced distributed training algorithms such as quantization and decentralization.
- PyWren - Answers the question of the "cloud button" for Python function execution. It's a framework that abstracts AWS Lambda to enable data scientists to execute any Python function - [(Video)](https://www.youtube.com/watch?v=OskQytBBdJU).
- Fiber - Distributed computing library for modern computer clusters from Uber.
- TensorFlowOnSpark - TensorFlowOnSpark brings TensorFlow programs to Apache Spark clusters.
Model Serialisation
- Java PMML API - Java libraries for consuming and producing PMML files containing models from different frameworks.
- PFA - Created by the same organisation as PMML, the Portable Format for Analytics is an emerging standard for statistical models and data transformation engines.
- PMML - The Predictive Model Markup Language standard in XML - [(Video)](https://www.youtube.com/watch?v=_5pZm2PZ8Q8).
Data Stream Processing
- Apache Spark - Micro-batch processing for streams using the Apache Spark framework as a backend supporting stateful exactly-once semantics.
- Apache Kafka - Kafka client library for building applications and microservices where the input and output are stored in Kafka clusters.
- MosaicML Streaming - Fast, deterministic streaming of large datasets from cloud storage for distributed model training.
- Apache Beam
- Apache Flink - Open source stream processing framework with powerful stream and batch processing capabilities.
- Apache Samza - Distributed stream processing framework. It uses Apache Kafka for messaging, and Apache Hadoop YARN to provide fault tolerance, processor isolation, security, and resource management.
- Brooklin - An extensible distributed system for reliable nearline data streaming at scale.
- Bytewax - Flexible Python-centric stateful stream processing framework built on top of Rust engine.
- FastStream - A modern broker-agnostic streaming Python framework supporting Apache Kafka, RabbitMQ and NATS protocols, inspired by FastAPI and easily integratable with other web frameworks.
- Faust - Streaming library built on top of Python's asyncio library using the async Kafka client, inspired by the Kafka streaming library.
- RobustBench - A standardized adversarial robustness benchmark.
- TensorStore - Library for reading and writing large multi-dimensional arrays.
- RisingWave - A distributed SQL streaming database that unifies stream processing and low-latency serving, ideal for building and serving features for online machine learning.
- MOA - MOA (Massive Online Analysis) is an open source framework for Big Data stream mining.
Industry Strength NLP
- StableLM - Stability AI language models.
- Blackstone - Blackstone is a spaCy model and library for processing long-form, unstructured legal text. Blackstone is an experimental research project from the Incorporated Council of Law Reporting for England and Wales' research lab, ICLR&D.
- Coqui STT - Coqui STT is a fast, open-source, multi-platform, deep-learning toolkit for training and deploying speech-to-text models.
- CTRL - A Conditional Transformer Language Model for Controllable Generation released by SalesForce.
- Facebook's XLM - PyTorch original implementation of Cross-lingual Language Model Pretraining which includes BERT, XLM, NMT, XNLI, PKM, etc.
- GluonNLP - GluonNLP is a toolkit that enables easy text preprocessing, datasets loading and neural models building to help you speed up your Natural Language Processing (NLP) research.
- Grover - Grover is a model for Neural Fake News -- both generation and detection. However, it probably can also be used for other generation tasks.
- Kashgari - Kashgari is a simple and powerful NLP transfer learning framework for building a state-of-the-art model in 5 minutes for named entity recognition (NER), part-of-speech tagging (PoS), and text classification tasks.
- sense2vec - A PyTorch library that allows for training and using sense2vec models, which are models that leverage the same approach as word2vec, but also leverage part-of-speech attributes for each token, which allows them to be "meaning-aware".
- trlX - trlX is a distributed training framework designed from the ground up to focus on fine-tuning large language models with reinforcement learning using either a provided reward function or a reward-labeled dataset.
Industry Strength RL
- RLlib - RLlib is an open-source library for reinforcement learning (RL), offering support for production-level, highly distributed RL workloads, while maintaining unified and simple APIs for a large variety of industry applications.
- AI-Optimizer - AI-Optimizer is a next-generation deep reinforcement learning suite, providing rich algorithm libraries ranging from model-free to model-based RL algorithms, from single-agent to multi-agent algorithms. Moreover, AI-Optimizer contains a flexible and easy-to-use distributed training framework for efficient policy training.
- ALF - ALF is a reinforcement learning framework emphasizing flexibility and ease of implementing complex algorithms involving many different components.
- AlpacaFarm - AlpacaFarm is a simulation framework for methods that learn from human feedback.
- CityLearn - CityLearn is an open source OpenAI Gym environment for the implementation of Multi-Agent Reinforcement Learning (RL) for building energy coordination and demand response in cities.
- DIAMBRA - DIAMBRA Arena is a software package featuring a collection of high-quality environments for Reinforcement Learning research and experimentation.
- garage - garage is a toolkit for developing and evaluating reinforcement learning algorithms, and an accompanying library of state-of-the-art implementations built using that toolkit.
- MALib - MALib is a parallel framework of population-based learning nested with reinforcement learning methods. MALib provides higher-level abstractions of MARL training paradigms, which enables efficient code reuse and flexible deployments on different distributed computing paradigms.
- MiniHack - MiniHack is a sandbox framework for easily designing rich and diverse environments for Reinforcement Learning.
- RLeXplore - RLeXplore provides stable baselines of exploration methods in reinforcement learning.
- RLMeta - RLMeta is a flexible lightweight research framework for Distributed Reinforcement Learning based on PyTorch and moolib.
- Safety-Gymnasium - Safety-Gymnasium is a highly scalable and customizable safe reinforcement learning environment library.
- SuperSuit - SuperSuit introduces a collection of small functions which can wrap reinforcement learning environments to do preprocessing ('microwrappers').
Commercial Platform
- Amazon Web Services - AWS (Amazon Web Services) is a comprehensive, evolving cloud computing platform provided by Amazon that includes a mixture of infrastructure-as-a-service (IaaS), platform-as-a-service (PaaS) and packaged-software-as-a-service (SaaS) offerings, including: [Amazon Augmented AI](https://aws.amazon.com/augmented-ai/), [Amazon Rekognition](https://aws.amazon.com/rekognition/), [Amazon SageMaker](https://aws.amazon.com/sagemaker/).
- Anthropic - Anthropic is an AI safety and research company that's working to build reliable, interpretable, and steerable AI systems.
- Anyscale - Anyscale is a unified compute platform that makes it easy to develop, deploy, and manage scalable AI and Python applications using Ray.
- Apheris - A platform for federated and privacy-preserving data science that lets you securely collaborate on AI with partners without sharing any data.
- Arize - ML observability and automated model monitoring to help ML practitioners understand how their models perform in production, troubleshoot issues, and improve model performance. ML teams can upload offline (training or validation) baselines into an evaluation/inference store alongside online production data for model validation, drift detection, data quality checks, and model performance management.
- Arthur - Arthur is a platform that measures, monitors, and improves machine learning models to deliver better results.
- Azure Machine Learning - Azure Machine Learning empowers data scientists and developers to build, deploy, and manage high-quality models faster and with confidence.
- BigML - A consumable, programmable, and scalable Machine Learning platform that makes it easy to solve and automate classification, regression, time series, etc.
- Censius - Censius is an AI Observability Platform that assists enterprises in continuously monitoring, analyzing, and explaining their production models. It combines monitoring, accountability, and explainability into one Observability Platform.
- Comet - Machine learning experiment management. Free for open source and students - [(Video)](https://www.youtube.com/watch?v=xaybRkapeNE).
- Databricks - An integrated end-to-end machine learning environment incorporating managed services for experiment tracking, model training, feature development and management, and feature and model serving.
- Dataiku - Collaborative data science platform powering both self-service analytics and the operationalization of machine learning models in production.
- DataRobot - Automated machine learning platform which enables users to build and deploy machine learning models.
- Datatron - Machine Learning Model Governance Platform for all your AI models in production for large Enterprises.
- Deep Cognition Deep Learning Studio - E2E platform for deep learning.
- deepsense.ai - deepsense.ai helps companies gain competitive advantage by providing customized AI-powered end-to-end solutions, with the main focus on AI software, team augmentation and AI advisory.
- Diffgram - Training Data First platform. Database & Training Data Pipelines for Supervised AI. Integrated with GCP, AWS, Azure and top Annotation Supervision UIs (or use built-in Diffgram UI, or build your own). Plus a growing list of integrated service providers! For Computer Vision, NLP, and Supervised Deep Learning / Machine Learning.
- Fennel - Realtime feature engineering platform for fast moving machine learning teams. Python / Pandas native, built in Rust. Easy to install/use/run, builds upon best practices for reducing data/feature quality issues, and keeps cloud spend low. Fully managed, zero ops.
- Fiddler - Fiddler is a model performance management platform that offers model monitoring, observability, explainability & fairness.
- Gemesys - GEMESYS aims to design a chip that emulates the human brain, overcoming computing bottlenecks and shaping a better future for everyone.
- Graphsignal - Machine learning profiler that helps make model training and inference faster and more efficient.
- H2O Driverless AI - Automates key machine learning tasks, delivering automatic feature engineering, model validation, model tuning, model selection and deployment, machine learning interpretability, bring your own recipe, time-series and automatic pipeline generation for model scoring - [(Video)](https://www.youtube.com/watch?v=ZqCoFp3-rGc).
- Hugging Face - Hugging Face is a platform that allows users to share machine learning models and datasets.
- IBM Watson Studio - Build and scale trusted AI on any cloud. Automate the AI lifecycle for ModelOps.
- InnerEye - InnerEye combines human intelligence with artificial intelligence. By capitalizing on the merging of human neural processing and deep artificial neural networks, InnerEye allows fast and accurate visual inspection, real-time AI training and validation, and establishes a unique human-machine interface for connected user applications.
- Iguazio Data Science Platform - Bring your Data Science to life by automating MLOps with end-to-end machine learning pipelines, transforming AI projects into real-world business outcomes, and supporting real-time performance at enterprise scale.
- Iterative Studio - Seamless data and model management, experiment tracking, visualization and automation, with Git as the single source of truth.
- Lambda Labs - Lambda Labs is a company that provides hardware and software solutions for deep learning applications.
- Katonic.ai - Automate your cycle of Intelligence with Katonic MLOps Platform.
- Kern AI - Kern AI builds the self-service development environment for NLP training data, used by data scientists to quickly build high-quality, large-scale labeled datasets.
- Labelbox - Image labelling service with support for semantic segmentation (brush & superpixels), bounding boxes and nested classifications.
- ModelOp - An enterprise MLOps platform that automates the governance, management and monitoring of deployed AI, ML models across platforms and teams, resulting in reliable, compliant and scalable AI initiatives.
- Modelplace - Modelplace provides a directory of tested and benchmarked AI models from around the world curated by OpenCV.
- MLJAR - Platform for rapid prototyping, developing and deploying machine learning models.
- Nimblebox - A full-stack MLOps platform designed to help data scientists and machine learning practitioners around the world discover, create, and launch multi-cloud apps from their web browser.
- OpenAI - OpenAI aims to promote and develop friendly AI in a way that benefits humanity as a whole.
- Pinecone - Pinecone vector database makes it easy to build high-performance vector search applications.
- Prodigy - Prodigy is a scriptable annotation tool so efficient that data scientists can do the annotation themselves, enabling a new level of rapid iteration.
- Replicate - Replicate lets you run machine learning models with a cloud API, without having to understand the intricacies of machine learning or manage your own infrastructure.
- Robust Intelligence - Robust Intelligence is an end-to-end ML integrity solution that proactively eliminates failure at every stage of the model lifecycle. From pre-deployment vulnerability detection and validation to post-deployment monitoring and protection, Robust Intelligence gives teams the confidence to scale models in production across a variety of use cases and modalities.
- SambaNova - SambaNova Systems is a company that specializes in generative AI. They offer a full-stack platform that allows users to build powerful AI models, customized with their data, and owned by them.
- Scale - Scale AI turns raw data into high-quality training data by combining machine learning powered pre-labeling and active tooling with varying levels and types of human review.
- Scribble Enrich - Customizable, auditable, privacy-aware feature store. It is designed to help mid-sized data teams gain trust in the data that they use for training and analysis, and support emerging needs such as drift computation and bias assessment.
- SigOpt - SigOpt is a model development platform that makes it easy to track runs, visualize training, and scale hyperparameter optimisation for any type of model built with any library on any infrastructure.
- Skytree - End to end machine learning platform - [(Video)](https://www.youtube.com/watch?v=XuCwpnU-F1k).
- SuperAnnotate - A complete set of solutions for image and video annotation and an annotation service with integrated tooling, on-demand narrow expertise in various fields, and a custom neural network, automation, and training models powered by AI.
- Syndicai - Easy-to-use cloud agnostic platform that deploys, manages, and scales any trained AI model in minutes with no configuration & infrastructure setup.
- Talend Studio - Data integration platform that provides various software and services for data integration, data management, enterprise application integration, data quality, cloud storage and Big Data.
- Tecton - Tecton is an all-in-one system to build, automate, and centralize feature workflows for production ML.
- Valohai - Machine orchestration, version control and pipeline management for deep learning.
- Vertex AI - Vertex AI Workbench is the single environment for data scientists to complete all of their ML work, from experimentation, to deployment, to managing and monitoring models. It is a Jupyter-based fully managed, scalable, enterprise-ready compute infrastructure with security controls and user management capabilities.
- Ultralytics
- WhyLabs - Enable observability to detect data and ML issues faster, deliver continuous improvements, and avoid costly incidents.
- Zilliz - Zilliz builds a vector database to accelerate development of next-generation data fabric.
- Skymind - Software distribution designed to help enterprise IT teams manage, deploy, and retrain machine learning models at scale.
- Wallaroo.AI - Production AI platform for deploying, managing and observing any model at scale across any environment from cloud to edge. Go from Python notebook to inferencing in minutes. [Community edition available](https://portal.wallaroo.community/).
- Zeno - Zeno is a platform for evaluating AI systems.
- D2iQ Kaptain - An end-to-end machine learning platform built for security, scale, and speed, that allows enterprises to develop and deploy machine learning models that run in the cloud, on premises (incl. air-gapped), in hybrid environments, or on the edge; based on Kubeflow and open-source [Kubernetes Universal Declarative Operators](https://kudo.dev) (KUDO).
- OpenAI - OpenAI aims to promote and develop friendly AI in a way that benefits humanity as a whole.
- DAGsHub - Community platform for Open Source ML – Manage experiments, data & models and create collaborative ML projects easily.
Data Science Notebook
- Deepnote - Deepnote is a drop-in replacement for Jupyter with an AI-first design, sleek UI, new blocks, and native data integrations. Use Python, R, and SQL locally in your favorite IDE, then scale to Deepnote cloud for real-time collaboration, Deepnote agent, and deployable data apps.
- Apache Zeppelin - Web-based notebook that enables data-driven, interactive data analytics and collaborative documents with SQL, Scala and more.
- Marimo - Reactive Python notebook: run reproducible experiments, execute as a script, deploy as an app, and version with git.
- Jupyter Notebooks - Web-based Python sandbox environments for reproducible development.
- .NET Interactive - .NET Interactive takes the power of .NET and embeds it into your interactive experiences.
- Papermill - Papermill is a library for parameterizing notebooks and executing them like Python scripts.
- Polynote - Polynote is an experimental polyglot notebook environment. Currently, it supports Scala and Python (with or without Spark), SQL, and Vega.
- RMarkdown - The rmarkdown package is a next generation implementation of R Markdown based on Pandoc.
- Stencila - Stencila is a platform for creating, collaborating on, and sharing data driven content. Content that is transparent and reproducible.
- Voilà - Voilà turns Jupyter notebooks into standalone web applications that can e.g. be used as dashboards.
Metadata Management
- Metacat - Metacat is a unified metadata exploration API service. Metacat focuses on solving these problems: 1) federated views of metadata systems; 2) arbitrary metadata storage about data sets; 3) metadata discovery.
- Amundsen - Amundsen is a metadata driven application for improving the productivity of data analysts, data scientists and engineers when interacting with data.
- ML Metadata - A library for recording and retrieving metadata associated with ML developer and data scientist workflows.
- Model Card Toolkit - Model Card Toolkit is a toolkit that streamlines and automates the generation of model cards.
- TensorFlow Metadata - TensorFlow Metadata provides standard representations for metadata that are useful when training machine learning models with TensorFlow.
- Marquez - Marquez is an open source metadata service for the collection, aggregation, and visualization of a data ecosystem's metadata.
- Apache Atlas - Apache Atlas is an extensible set of core foundational governance services that enables enterprises to meet their compliance requirements within Hadoop effectively and efficiently, and allows integration with the whole enterprise data ecosystem.
Industry Strength Natural Language Processing
- aisuite - aisuite is a simple, unified interface to multiple generative AI providers.
- CodeTF - CodeTF is a one-stop Python transformer-based library for code large language models (Code LLMs) and code intelligence. It provides a seamless interface for training and inference on code intelligence tasks such as code summarization, translation, and code generation.
- SWIFT - SWIFT is a scalable lightweight infrastructure for deep learning model fine-tuning.
- h2oGPT - h2oGPT is an open source generative AI project that gives organizations like yours the power to own large language models while preserving your data ownership.
- LLaMA-Factory - LLaMA-Factory makes it easy to fine-tune 100+ large language models with a zero-code CLI and Web UI.
- Align-Anything - Align-Anything aims to align any-modality large models (any-to-any models), including LLMs, VLMs, and others, with human intentions and values.
- gpt-fast - Simple and efficient PyTorch-native transformer text generation.
- Dify - Dify is an open-source LLM app development platform whose intuitive interface combines agentic AI workflow, RAG pipeline, agent capabilities, model management, observability features and more, letting you quickly go from prototype to production.
- BERTopic - BERTopic is a topic modeling technique that leverages transformers and c-TF-IDF to create dense clusters allowing for easily interpretable topics whilst keeping important words in the topic descriptions.
- dspy - A framework for programming with foundation models.
- Dust - Dust assists in the design and deployment of large language model apps.
- ESPnet - ESPnet is an end-to-end speech processing toolkit.
- FastChat - FastChat is an open platform for training, serving, and evaluating large language model based chatbots.
- Flair - Simple framework for state-of-the-art NLP developed by Zalando which builds directly on PyTorch.
- Gensim - Gensim is a Python library for topic modelling, document indexing and similarity retrieval with large corpora.
- Haystack - Haystack is an open source NLP framework to interact with your data using Transformer models and LLMs (GPT-3 and alike). Haystack offers production-ready tools to quickly build ChatGPT-like question answering, semantic search, text generation, and more.
- Interactive Composition Explorer - ICE is a Python library and trace visualizer for language model programs.
- Lamini - Lamini is an LLM engine for rapidly customizing models.
- LangChain - LangChain assists in building applications with LLMs through composability.
- LlamaIndex - LlamaIndex (GPT Index) is a data framework for your LLM application.
- LLaMA - LLaMA is intended as a minimal, hackable and readable example to load LLaMA (arXiv) models and run inference.
- LLMBox - LLMBox is a comprehensive library for implementing LLMs, including a unified training pipeline and comprehensive model evaluation.
- LLaMA2-Accessory - LLaMA2-Accessory is an open-source toolkit for pretraining, finetuning and deployment of Large Language Models (LLMs) and multimodal LLMs.
- LMFlow - LMFlow is an extensible, convenient, and efficient toolbox for finetuning large machine learning models.
- Megatron-LM - Megatron-LM is a highly optimized and efficient library for training large language models.
- MindNLP - MindNLP is an easy-to-use and high-performance NLP and LLM framework based on MindSpore, compatible with Hugging Face models and datasets.
- MLC LLM - MLC LLM is a universal solution that allows any language model to be deployed natively on a diverse set of hardware backends and native applications, plus a productive framework for everyone to further optimize model performance for their own use cases.
- Ollama - Get up and running with large language models, locally.
- olmOCR - olmOCR is a toolkit for training language models to work with PDF documents in the wild.
- PaddleNLP - PaddleNLP is a Large Language Model (LLM) development suite based on the PaddlePaddle deep learning framework, supporting efficient large model training, lossless compression, and high-performance inference on various hardware devices.
- PyLLMs - PyLLMs is a minimal Python library to connect to various Language Models (LLMs) with a built-in model performance benchmark.
- Semantic Kernel - Semantic Kernel is an SDK that integrates Large Language Models (LLMs) like OpenAI, Azure OpenAI, and Hugging Face with conventional programming languages like C#, Python, and Java. Semantic Kernel achieves this by allowing you to define plugins that can be chained together in just a few lines of code.
- Sentence Transformers - Sentence Transformers provides an easy method to compute dense vector representations for sentences, paragraphs, and images.
- SpaCy - spaCy is a library for advanced Natural Language Processing in Python and Cython.
- Tensorflow Lingvo - A [framework](https://blog.tensorflow.org/2019/02/lingvo-tensorflow-framework-for-sequence-modeling.html) for building neural networks in TensorFlow, particularly sequence models.
- Tensorflow Text - TensorFlow Text provides a collection of text related classes and ops ready to use with TensorFlow 2.0.
- ToolBench - ToolBench is an open platform for training, serving, and evaluating large language models for tool learning.
- Transformers - Hugging Face's library of state-of-the-art pretrained models for Natural Language Processing (NLP).
- Burr - Burr helps you develop applications that make decisions (chatbot, agent, simulation). It comes with production-ready features (telemetry, persistence, deployment, etc.) and the open-source, free, and local-first Burr UI.
Deployment and Serving
- OptiLLM - OptiLLM is an OpenAI API-compatible optimizing inference proxy that implements 20+ state-of-the-art techniques to dramatically improve LLM accuracy and performance on reasoning tasks - without requiring any model training or fine-tuning.
- MindsDB - MindsDB is the platform to create, serve, and fine-tune models in real-time from your database, vector store, and application data.
- BentoML - BentoML is an open source framework for high performance ML model serving.
- Backprop - Backprop makes it simple to use, finetune, and deploy state-of-the-art ML models.
- SkyPilot - SkyPilot is a framework for running LLMs, AI, and batch jobs on any cloud, offering maximum cost savings, highest GPU availability, and managed execution.
- DeepDetect - Machine Learning production server for TensorFlow, XGBoost and Caffe models written in C++ and maintained by Jolibrain.
- Cortex - Cortex is an open source platform for deploying machine learning models—trained with any framework—as production web services. No DevOps required.
- Hydrosphere Serving - Hydrosphere Serving is a cluster for deploying and versioning your machine learning models in production.
- Intel® Extension for Transformers - An Innovative Transformer-based Toolkit to Accelerate GenAI/LLM Everywhere.
- Redis-AI - A Redis module for serving tensors and executing deep learning models. Expect changes in the API and internals.
- Apache PredictionIO - An open source Machine Learning Server built on top of a state-of-the-art open source stack for developers and data scientists to create predictive engines for any machine learning task.
- OpenScoring - REST web service for the true real-time scoring (< 1 ms) of Scikit-Learn, R and Apache Spark models.
- S-LoRA - Serving Thousands of Concurrent LoRA Adapters.
- Mosec - A Rust-powered and multi-stage pipelined model server which offers dynamic batching and more. Super easy to implement and deploy as micro-services.
- Seldon Core - Open source platform for deploying machine learning models in Kubernetes - [(Video)](https://www.youtube.com/watch?v=pDlapGtecbY).
- OpenVINO - OpenVINO is an open-source toolkit for optimizing and deploying AI inference.
- Tempo - Open source SDK that provides a unified interface to multiple MLOps projects that enable data scientists to deploy and productionise machine learning systems.
- skops - skops is a Python library helping you share your scikit-learn based models and put them in production.
- Tensorflow Serving - High-performance framework for serving TensorFlow models over gRPC, able to handle 100k requests per second per core.
- text-generation-inference - Large Language Model Text Generation Inference.
- TorchServe - TorchServe is a flexible and easy to use tool for serving PyTorch models.
- Triton Inference Server - Triton is a high performance open source serving software to deploy AI models from any framework on GPU & CPU while maximizing utilization.
- PowerInfer - PowerInfer is a CPU/GPU LLM inference engine leveraging activation locality for your device.
- UnionML - UnionML is an open source MLOps framework that aims to reduce the boilerplate and friction that comes with building models and deploying them to production.
- AirLLM - AirLLM optimizes inference memory usage, allowing 70B large language models to run inference on a single 4GB GPU card without quantization, distillation and pruning.
- Infinity - Infinity is a high-throughput, low-latency REST API for serving text-embeddings, reranking models and clip.
- OpenLLM - OpenLLM allows developers to run any open-source LLMs (Llama 3.1, Qwen2, Phi3 and more) or custom models as OpenAI-compatible APIs with a single command.
- Vercel AI - Vercel AI is a TypeScript toolkit designed to help you build AI-powered applications using popular frameworks like Next.js, React, Svelte, Vue and runtimes like Node.js.
- exo - exo helps you run your AI cluster at home with everyday devices.
- KsanaLLM - KsanaLLM is a high performance and easy-to-use engine for LLM inference and serving.
- KServe - KServe provides a Kubernetes Custom Resource Definition for serving predictive and generative ML.
- Lepton AI - LeptonAI Python library allows you to build an AI service from Python code with ease.
- Nuclio - A high-performance "serverless" framework focused on data, I/O, and compute-intensive workloads. It is well integrated with popular data science tools, such as Jupyter and Kubeflow; supports a variety of data and streaming sources; and supports execution over CPUs and GPUs.
- Prompt2Model - Prompt2Model is a system that takes a natural language task description (like the prompts used for LLMs such as ChatGPT) to train a small special-purpose model that is conducive for deployment.
- vLLM - vLLM is a high-throughput and memory-efficient inference and serving engine for LLMs.
- LightLLM - LightLLM is a Python-based LLM (Large Language Model) inference and serving framework.
- llama.cpp - llama.cpp is an open source software library that performs inference on various large language models such as Llama.
- KTransformers - KTransformers is a flexible framework for experiencing cutting-edge LLM inference optimizations.
- Inference - A fast, production-ready inference server for computer vision supporting deployment of many popular model architectures and fine-tuned models. With Inference, you can deploy models such as YOLOv5, YOLOv8, CLIP, SAM, and CogVLM on your own hardware using Docker.
- LocalAI - LocalAI is a drop-in replacement REST API that's compatible with OpenAI API specifications for local inferencing.
- m2cgen - A lightweight library for transpiling trained classic machine learning models into native code in C, Java, Go, R, PHP, Dart, Haskell, Rust and many other programming languages.
- MLRun - MLRun is an open MLOps framework for quickly building and managing continuous ML and generative AI applications across their lifecycle.
- MLServer - An inference server for your machine learning models, including support for multiple frameworks, multi-model serving and more.
- DeepSparse - DeepSparse is a sparsity-aware deep learning inference runtime for CPUs.
- SparseML - SparseML is an open-source model optimization toolkit that enables you to create inference-optimized sparse models using pruning, quantization, and distillation algorithms.
- Open WebUI - Open WebUI is an extensible, feature-rich, and user-friendly self-hosted AI platform designed to operate entirely offline. It supports various LLM runners like Ollama and OpenAI-compatible APIs, with built-in inference engine for RAG, making it a powerful AI deployment solution.
- Dynamo - NVIDIA Dynamo is a high-throughput, low-latency inference framework designed for serving generative AI and reasoning models in multi-node distributed environments.
- SGLang - SGLang is a fast serving framework for large language models and vision language models.
- torchtune - torchtune is a PyTorch library for easily authoring, post-training, and experimenting with LLMs.
- AITemplate - AITemplate (AIT) is a Python framework that transforms deep neural networks into CUDA (NVIDIA GPU) / HIP (AMD GPU) C++ code for lightning-fast inference serving.
- LM Studio - LM Studio is a tool for deploying LLM models locally on the computer, even on a relatively modest machine, provided it meets the minimum requirements.
- Agenta - Agenta provides end-to-end tools for the entire LLMOps workflow: building (LLM playground, evaluation), deploying (prompt and configuration management), and monitoring (LLM observability and tracing).
- BISHENG - BISHENG is an open LLM application devops platform, focusing on enterprise scenarios.
- Genkit - Genkit is an open source framework for building AI-powered apps with familiar code-centric patterns. Genkit makes it easy to develop, integrate, and test AI features with observability and evaluations.
- IPEX-LLM - IPEX-LLM is a PyTorch library for running LLMs on Intel CPU and GPU (e.g., local PC with iGPU, discrete GPU such as Arc, Flex and Max) with very low latency.
- Jina-serve - Jina-serve is a framework for building and deploying AI services that communicate via gRPC, HTTP and WebSockets.
- Langtrace - Langtrace is an open-source, OpenTelemetry-based end-to-end observability tool for LLM applications, providing real-time tracing, evaluations and metrics for popular LLMs, LLM frameworks, vectorDBs and more.
- LMDeploy - LMDeploy is a toolkit for compressing, deploying, and serving LLMs.
- Transformer Lab - Transformer Lab is an open-source LLM workspace for finetuning, evaluating, exporting, and testing models locally across inference engines and platforms.
- Vespa - Search, make inferences in and organize vectors, tensors, text and structured data, at serving time and any scale.
- Kiln - Kiln is an OSS tool for fine-tuning LLMs, synthetic data generation, and collaborating on datasets.
- nndeploy - An Easy-to-Use and High-Performance AI deployment framework.
- LiteLLM - LiteLLM is a Python SDK, Proxy Server (LLM Gateway) to call 100+ LLM APIs in OpenAI format - Bedrock, Azure, OpenAI, VertexAI, Cohere, Anthropic, Sagemaker, HuggingFace, Replicate, Groq.
- LitServe - LitServe is a flexible serving engine for AI models built on FastAPI. It supports custom inference engines for models, agents, multi-modal systems, RAG, and complex ML pipelines.
- mini-sglang - mini-sglang is a lightweight and efficient serving framework for large language models.
Explainability and Fairness
- AI Explainability 360 - Interpretability and explainability of data and machine learning models including a comprehensive set of algorithms that cover different dimensions of explanations along with proxy explainability metrics.
- AI Fairness 360 - A comprehensive set of fairness metrics for datasets and machine learning models, explanations for these metrics, and algorithms to mitigate bias in datasets and models.
- Alibi - Alibi is an open source Python library aimed at machine learning model inspection and interpretation. The initial focus of the library is on black-box, instance-based model explanations.
- anchor - Code for the paper ["High precision model agnostic explanations"](https://homes.cs.washington.edu/~marcotcr/aaai18.pdf), a model-agnostic system that explains the behaviour of complex models with high-precision rules called anchors.
- captum - Model interpretability and understanding library for PyTorch developed by Facebook. It contains general purpose implementations of integrated gradients, saliency maps, smoothgrad, vargrad and others for PyTorch models.
- DeepVis Toolbox - This is the code required to run the Deep Visualization Toolbox, as well as to generate the neuron-by-neuron visualizations using regularized optimisation. The toolbox and methods are described casually [here](http://yosinski.com/deepvis) and more formally in this [paper](https://arxiv.org/abs/1506.06579).
- FACETS - Facets contains two robust visualizations to aid in understanding and analyzing machine learning datasets. Get a sense of the shape of each feature of your dataset using Facets Overview, or explore individual observations using Facets Dive.
- Fairlearn - Fairlearn is a Python toolkit to assess and mitigate unfairness in machine learning models.
- FairML - FairML is a Python toolbox for auditing machine learning models for bias.
- Fairness Comparison - This repository is meant to facilitate the benchmarking of fairness-aware machine learning algorithms based on [this paper](https://arxiv.org/abs/1802.04422).
- Fairness Indicators - The tool supports teams in evaluating, improving, and comparing models for fairness concerns in partnership with the broader TensorFlow toolkit.
- iNNvestigate - An open-source library for analyzing Keras models visually by methods such as [DeepTaylor-Decomposition](https://www.sciencedirect.com/science/article/pii/S0031320316303582), [PatternNet](https://openreview.net/forum?id=Hkn7CBaTW), [Saliency Maps](https://arxiv.org/abs/1312.6034), and [Integrated Gradients](https://arxiv.org/abs/1703.01365).
- Integrated-Gradients - This repository provides code for implementing integrated gradients for networks with image inputs.
- keras-vis - keras-vis is a high-level toolkit for visualizing and debugging your trained Keras neural net models. Currently supported visualizations include: Activation maximization, Saliency maps, Class activation maps.
- Lightly - A Python framework for self-supervised learning on images. The learned representations can be used to analyze the distribution in unlabeled data and rebalance datasets.
- Lightwood - A PyTorch-based framework that breaks down machine learning problems into smaller blocks that can be glued together seamlessly with an objective to build predictive models with one line of code.
- ELI5 - "Explain Like I'm 5" is a Python package which helps to debug machine learning classifiers and explain their predictions.
- themis-ml - themis-ml is a Python library built on top of pandas and sklearn that implements fairness-aware machine learning algorithms.
- Themis - Themis is a testing-based approach for measuring discrimination in a software system.
- TreeInterpreter - Package for interpreting scikit-learn's decision tree and random forest predictions. Allows decomposing each prediction into bias and feature contribution components as described [here](http://blog.datadive.net/interpreting-random-forests).
- WhatIf - An easy-to-use interface for expanding understanding of a black-box classification or regression ML model.
- woe - Tools for WoE Transformation mostly used in ScoreCard Model for credit rating.
- mljar-supervised - A Python package for AutoML on tabular data with feature engineering, hyper-parameters tuning, explanations and automatic documentation.
- SHAPash - Shapash is a Python library that provides several types of visualization that display explicit labels that everyone can understand.
- DeepLIFT - Codebase that contains the methods in the paper ["Learning important features through propagating activation differences"](https://arxiv.org/abs/1704.02685). Here are the [slides](https://docs.google.com/file/d/0B15F_QN41VQXSXRFMzgtS01UOU0/edit?filetype=mspresentation) and the [video](https://vimeo.com/238275076) of the 15 minute talk given at ICML.
- LIME - Local Interpretable Model-agnostic Explanations for machine learning models.
- InterpretML - InterpretML is an open-source package for training interpretable models and explaining blackbox systems.
- LOFO Importance - LOFO (Leave One Feature Out) Importance calculates the importances of a set of features based on a metric of choice, for a model of choice, by iteratively removing each feature from the set, and evaluating the performance of the model, with a validation scheme of choice, based on the chosen metric.
- Transformer Debugger - Transformer Debugger (TDB) is a tool developed by OpenAI's Superalignment team with the goal of supporting investigations into specific behaviors of small language models.
- SHAP - SHapley Additive exPlanations is a unified approach to explain the output of any machine learning model.
- Aequitas - An open-source bias audit toolkit for data scientists, machine learning researchers, and policymakers to audit machine learning models for discrimination and bias, and to make informed and equitable decisions around developing and deploying predictive risk-assessment tools.
- Quantus - Quantus is an eXplainable AI toolkit for responsible evaluation of neural network explanations.
Explaining Black Box Models and Datasets
- casme - Example of using classifier-agnostic saliency map extraction on ImageNet, as presented in the paper ["Classifier-agnostic saliency map extraction"](https://arxiv.org/abs/1805.08249).
- ContrastiveExplanation (Foil Trees) - Python script for model agnostic contrastive/counterfactual explanations for machine learning. Accompanying code for the paper ["Contrastive Explanations with Local Foil Trees"](https://arxiv.org/abs/1806.07470).
- GEBI - Global Explanations for Bias Identification - Attention-based, summarized post-hoc explanations for detecting and identifying bias in data. We propose a global explanation and introduce a step-by-step framework on how to detect and test bias. Python package for image data.
- L2X - Code for replicating the experiments in the paper ["Learning to Explain: An Information-Theoretic Perspective on Model Interpretation"](https://arxiv.org/pdf/1802.07814.pdf) at ICML 2018.
- pyBreakDown - A model agnostic tool for decomposition of predictions from black boxes. Break Down Table shows contributions of every variable to a final prediction.
- responsibly - Toolkit for auditing and mitigating bias and fairness of machine learning systems.
Industry Strength Evaluation
- XAI - eXplainableAI - An eXplainability toolbox for machine learning.
- RobotPerf - RobotPerf is an open reference benchmarking suite that is used to evaluate robotics computing performance fairly with ROS 2 as its common baseline so that robotic architects can make informed decisions about the hardware and software components of their robotic systems.
Privacy and Security
- BastionLab - BastionLab is a framework for confidential data science collaboration. It uses Confidential Computing, Access control data science, and Differential Privacy to enable data scientists to remotely perform data exploration, statistics, and training on confidential data while ensuring maximal privacy for data owners.
- Concrete-ML - Concrete-ML is a Privacy-Preserving Machine Learning (PPML) open-source set of tools built on top of The Concrete Framework by [Zama](https://github.com/zama-ai). It aims to simplify the use of fully homomorphic encryption (FHE) for data scientists to help them automatically turn machine learning models into their homomorphic equivalent.
- Microsoft SEAL - Microsoft SEAL is an easy-to-use open-source (MIT licensed) homomorphic encryption library developed by the Cryptography Research group at Microsoft.
- Rosetta - A privacy-preserving framework based on TensorFlow with customized backend Operations using Multi-Party Computation (MPC). Rosetta reuses the APIs of TensorFlow and allows existing TensorFlow code to be converted to run in a privacy-preserving manner with minimal changes.
- Intel Homomorphic Encryption Backend - The Intel HE transformer for nGraph is a Homomorphic Encryption (HE) backend to the Intel nGraph Compiler, Intel's graph compiler for Artificial Neural Networks.
- Fedlearner - Fedlearner is a collaborative machine learning framework that enables joint modeling of data distributed between institutions.
- Substra - Substra is an open-source framework for privacy-preserving, traceable and collaborative Machine Learning.
Industry Strength Visualisation
- Netron - Netron is a viewer for neural network, deep learning and machine learning models.
- ydata-profiling - ydata-profiling provides a one-line Exploratory Data Analysis (EDA) experience in a consistent and fast solution.
- Kangas - Kangas is a tool for exploring, analyzing, and visualizing large-scale multimedia data. It provides a straightforward Python API for logging large tables of data, along with an intuitive visual interface for performing complex queries against your dataset.
- Apache ECharts - Apache ECharts is a powerful, interactive charting and data visualization library for the browser.
- Bokeh - Bokeh is an interactive visualization library for Python that enables beautiful and meaningful visual presentation of data in modern web browsers.
- Geoplotlib - geoplotlib is a Python toolbox for visualizing geographical data and making maps.
- ggplot2 - An implementation of the grammar of graphics for R.
- gradio - Quickly create and share demos of models by writing only Python. Debug models interactively in your browser, get feedback from collaborators, and generate public links without deploying anything.
- matplotlib - A Python 2D plotting library which produces publication-quality figures in a variety of hardcopy formats and interactive environments across platforms.
- Missingno - missingno provides a small toolset of flexible and easy-to-use missing data visualizations and utilities that allows you to get a quick visual summary of the completeness (or lack thereof) of your dataset.
- PDPBox - This repository is inspired by ICEbox. The goal is to visualize the impact of certain features towards model prediction for any supervised learning algorithm.
- Perspective
- Pixiedust - PixieDust is a productivity tool for Python or Scala notebooks, which lets a developer encapsulate business logic into something easy for your customers to consume.
- Plotly - An interactive, open source, and browser-based graphing library for Python.
- pygal - pygal is a dynamic SVG charting library written in Python.
- Apache Superset - A modern, enterprise-ready business intelligence web application.
- Redash - Redash is an open source visualisation framework built to allow easy access to big datasets, leveraging multiple backends.
- seaborn - Seaborn is a Python visualization library based on matplotlib. It provides a high-level interface for drawing attractive statistical graphics.
- Spotlight - Spotlight helps you to identify critical data segments and model failure modes. It enables you to build and maintain reliable machine learning models by curating high-quality datasets.
- Streamlit - Streamlit lets you create apps for your machine learning projects with deceptively simple Python scripts. It supports hot-reloading, so your app updates live as you edit and save your file.
- PyCEbox - Python Individual Conditional Expectation Plot Toolbox.
- tensorboardX - Write TensorBoard events with a simple function call.
- TensorBoard - TensorBoard is a visualization toolkit for machine learning experimentation that makes it easy to host, track, and share ML experiments.
- Transformer Explainer - Transformer Explainer is an interactive visualization tool designed to help anyone learn how Transformer-based models like GPT work.
- Vega-Altair - Vega-Altair is a declarative statistical visualization library for Python.
- Data Formulator - Transform data and create rich visualizations iteratively with AI.
- Rerun - Rerun is an open-source SDK for logging, storing, querying, and visualizing multimodal data, designed for robotics, computer vision, and spatial AI.
Model, Data and Experiment Management
- Aim - A super-easy way to record, search and compare AI experiments.
- Dolt - Dolt is a SQL database that you can fork, clone, branch, merge, push and pull just like a git repository.
- TerminusDB - A graph database management system that stores data like git.
- Sacred - Tool to help you configure, organize, log and reproduce machine learning experiments.
- Neptune - Neptune is a scalable experiment tracker for teams that train foundation models.
- ClearML - Auto-Magical Experiment Manager & Version Control for AI (previously Trains).
- DVC - DVC (Data Version Control) is a version control tool that works alongside Git to manage versions of data and models.
- HuggingFace Model Downloader - HuggingFace Model Downloader is a utility tool for downloading models and datasets from the HuggingFace website. It offers multithreaded downloading for LFS files and ensures the integrity of downloaded models with SHA256 checksum verification.
- Keepsake - Version control for machine learning.
- KitOps - KitOps is an open and standards-based packaging and versioning system for AI/ML projects that works with all the AI/ML, development, and DevOps tools you are already using.
- lakeFS - Repeatable, atomic and versioned data lake on top of object storage.
- MLflow - Open source platform to manage the ML lifecycle, including experimentation, reproducibility and deployment.
- Polyaxon - A platform for reproducible and scalable machine learning and deep learning on Kubernetes - [(Video)](https://www.youtube.com/watch?v=Iexwrka_hys).
- Quilt - Versioning, reproducibility and deployment of data and models.
- Weights & Biases - Weights & Biases is a platform for machine learning experiment tracking, dataset versioning, hyperparameter search, visualization, and collaboration.
- DataHub - DataHub is an open-source data catalog for the modern data stack.
Model, Data and Experiment Tracking
- CodaLab - CodaLab Worksheets is a collaborative platform for reproducible research that allows researchers to run, manage, and share their experiments in the cloud. It helps researchers ensure that their runs are reproducible and consistent.
- Deepkit - An open-source platform and cross-platform desktop application to execute, track, and debug modern machine learning experiments.
- Flor - Easy to use logger and automatic version controller made for data scientists who write ML code.
- Guild AI - Open source toolkit that automates and optimizes machine learning experiments.
- Hangar - Version control for tensor data, with git-like semantics on numerical data and high speed and efficiency.
- ModelStore - An open-source Python library that allows you to version, export, and save a machine learning model to your cloud storage provider.
- ormb - Docker for Your ML/DL Models Based on OCI Artifacts.
- Studio - Model management framework which minimizes the overhead involved with scheduling, running, monitoring and managing artifacts of your machine learning experiments.
- AI2 Tango - AI2 Tango replaces messy directories and spreadsheets full of file versions by organizing experiments into discrete steps that can be cached and reused throughout the lifetime of a research project.
- Catalyst - High-level utils for PyTorch DL & RL research. It was developed with a focus on reproducibility, fast experimentation and code/ideas reusing.
- ModelDB - An open-source system to version machine learning models, including their ingredients (code, data, config, and environment), and to track ML metadata across the model lifecycle.
Model Training and Orchestration
- Determined - Deep learning training platform with integrated support for distributed training, hyperparameter tuning, and model management (supports TensorFlow and PyTorch).
- envd - Machine learning development environment for data science and AI/ML engineering teams.
- CML - Continuous Machine Learning (CML) is an open-source library for implementing continuous integration & delivery (CI/CD) in machine learning projects.
- Kubeflow - A cloud-native platform for machine learning based on Google’s internal machine learning pipelines.
- Ludwig - Ludwig is a low-code framework for building custom AI models like LLMs and other deep neural networks.
- Axolotl - Axolotl is a tool designed to streamline the fine-tuning of various AI models, offering support for multiple configurations and architectures.
- AutoTrain Advanced - AutoTrain Advanced is a no-code solution that allows you to train machine learning models in just a few clicks.
- PyCaret - Low-code library for training and deploying models (scikit-learn, XGBoost, LightGBM, spaCy).
- dstack - dstack is an open-source container orchestrator that simplifies workload orchestration and drives GPU utilization for ML teams.
- Prime - Prime is a framework for efficient, globally distributed training of AI models over the internet.
- CoreNet - CoreNet is a deep neural network toolkit that allows researchers and engineers to train standard and novel small and large-scale models for a variety of tasks, including foundation models (e.g., CLIP and LLM), object classification, object detection, and semantic segmentation.
- Fire-Flyer File System - The Fire-Flyer File System (3FS) is a high-performance distributed file system designed to address the challenges of AI training and inference workloads. It leverages modern SSDs and RDMA networks to provide a shared storage layer that simplifies development of distributed applications.
- MFTCoder - MFTCoder is an open-source project of CodeFuse for accurate and efficient Multi-task Fine-tuning (MFT) on Large Language Models (LLMs), especially on Code-LLMs (large language models for code tasks).
- MLeap - Standardisation of pipeline and model serialization for Spark, TensorFlow and scikit-learn.
- Nanotron - Nanotron provides distributed primitives to train a variety of models efficiently using 3D parallelism.
- NeMo - NVIDIA NeMo is a scalable and cloud-native generative AI framework built for researchers and PyTorch developers working on Large Language Models (LLMs), Multimodal Models (MMs), Automatic Speech Recognition (ASR), Text to Speech (TTS), and Computer Vision (CV) domains. It is designed to help you efficiently create, customize, and deploy new generative AI models by leveraging existing code and pre-trained model checkpoints.
- Sematic - Platform to build resource-intensive pipelines with simple Python.
- Skaffold - Skaffold is a command line tool that facilitates continuous development for Kubernetes applications. You can iterate on your application source code locally then deploy to local or remote Kubernetes clusters.
- TFX - TensorFlow Extended (TFX) is a production-oriented configuration framework for ML based on TensorFlow, including monitoring and model version management.
- unsloth - Fine-tuning & Reinforcement Learning for LLMs. Train OpenAI gpt-oss, DeepSeek-R1, Qwen3, Gemma 3, TTS 2x faster with 70% less VRAM.
- BindsNET - BindsNET is a spiking neural network simulation library geared towards the development of biologically inspired algorithms for machine learning.
- H2O-3 - Fast scalable Machine Learning platform for smarter applications: Deep Learning, Gradient Boosting & XGBoost, Random Forest, Generalized Linear Modeling (Logistic Regression, Elastic Net), K-Means, PCA, Stacked Ensembles, Automatic Machine Learning (AutoML), etc.
- Avalanche - Avalanche is an end-to-end Continual Learning library to provide a shared and collaborative open-source (MIT licensed) codebase for fast prototyping, training and reproducible evaluation of continual learning algorithms.
- Ignite - Ignite is a high-level library to help with training and evaluating neural networks in PyTorch flexibly and transparently.
- Fairseq - Fairseq(-py) is a sequence modeling toolkit that allows researchers and developers to train custom models for translation, summarization, language modeling and other text generation tasks.
- Hopsworks - Hopsworks is a data-intensive platform for the design and operation of machine learning pipelines.
Training Orchestration
- Open Platform for AI - Platform that provides complete AI model training and resource management capabilities.
- Fabrik - Fabrik is an online collaborative platform to build, visualize and train deep learning models via a simple drag-and-drop interface.
- Nos - Nos is an open-source platform to efficiently run AI workloads on Kubernetes, increasing GPU utilization and reducing infrastructure and operational costs.
Computation and Communication Optimisation
- NVIDIA TensorRT - TensorRT is a C++ library for high-performance inference on NVIDIA GPUs and deep learning accelerators.
- DLRover - DLRover makes the distributed training of large AI models easy, stable, fast and green.
- Adapters - Adapters is a unified library for parameter-efficient and modular transfer learning.
- SetFit - SetFit is an efficient and prompt-free framework for few-shot fine-tuning of Sentence Transformers.
- Colossal-AI - A unified deep learning system for the big model era, which helps users to efficiently and quickly deploy large AI model training and inference.
- DEAP - A novel evolutionary computation framework for rapid prototyping and testing of ideas. It seeks to make algorithms explicit and data structures transparent. It works in perfect harmony with parallelisation mechanisms such as multiprocessing and SCOOP.
- Dask - Distributed parallel processing framework for Pandas and NumPy computations.
- PaddlePaddle - PaddlePaddle is a framework to perform large-scale deep network training, using data sources distributed across hundreds of nodes.
- PyTorch Lightning - PyTorch Lightning pretrains, finetunes and deploys AI models on multiple GPUs and TPUs with zero code changes.
- Ray - Ray is a flexible, high-performance distributed execution framework for machine learning.
- Flashlight - A fast, flexible machine learning library written entirely in C++ from Facebook AI Research and the creators of Torch, TensorFlow, Eigen and Deep Speech.
- einops - Flexible and powerful tensor operations for readable and reliable code.
- Hivemind - Decentralized deep learning in PyTorch.
- Horovod - Uber's distributed training framework for TensorFlow, Keras, and PyTorch.
- LightGBM - LightGBM is a gradient boosting framework that uses tree based learning algorithms.
- veScale - veScale is a PyTorch native LLM training framework.
- Vowpal Wabbit
- Liger Kernel - Liger Kernel is a collection of Triton kernels designed specifically for LLM training.
- Accelerate - Accelerate abstracts exactly and only the boilerplate code related to multi-GPU/TPU/mixed-precision and leaves the rest of your code unchanged.
- DeepEP - DeepEP is a communication library tailored for Mixture-of-Experts (MoE) and expert parallelism (EP). It provides high-throughput and low-latency all-to-all GPU kernels, which are also known as MoE dispatch and combine. The library also supports low-precision operations, including FP8.
- DGL - DGL is an easy-to-use, high performance and scalable Python package for deep learning on graphs.
- FlagGems - FlagGems is a high-performance general operator library implemented in OpenAI Triton. It builds on a collection of backend-neutral kernels that aim to accelerate LLM training and inference across diverse hardware platforms.
- PyG - PyG (PyTorch Geometric) is a library built upon PyTorch to easily write and train Graph Neural Networks (GNNs) for a wide range of applications related to structured data.
- Triton - Triton is a language and compiler for writing highly efficient custom Deep-Learning primitives. The aim of Triton is to provide an open-source environment to write fast code at higher productivity than CUDA, but also with higher flexibility than other existing DSLs.
- torchdistill - torchdistill offers various state-of-the-art knowledge distillation methods and enables you to design (new) experiments simply by editing a declarative yaml config file instead of Python code.
- Kompute - Blazing fast, lightweight and mobile phone-enabled Vulkan compute framework optimized for advanced GPU data processing use cases.
- BitBLAS - BitBLAS is a library to support mixed-precision BLAS operations on GPUs.
- Composer - Composer is a PyTorch library that enables you to train neural networks faster, at lower cost, and to higher accuracy.
- CuDF - Built based on the Apache Arrow columnar memory format, cuDF is a GPU DataFrame library for loading, joining, aggregating, filtering, and otherwise manipulating data.
- CuML - cuML is a suite of libraries that implement machine learning algorithms and mathematical primitive functions that share compatible APIs with other RAPIDS projects.
- CuPy - An implementation of NumPy-compatible multi-dimensional array on CUDA. CuPy consists of the core multi-dimensional array class, cupy.ndarray, and many functions on it.
- DeepSpeed - DeepSpeed is a deep learning optimization library that makes distributed training and inference easy, efficient, and effective.
- Flax - A neural network library and ecosystem for JAX designed for flexibility.
- bitsandbytes - The bitsandbytes library is a lightweight Python wrapper around CUDA custom functions, in particular 8-bit optimizers, matrix multiplication (LLM.int8()), and 8 & 4-bit quantization functions.
- Optimum - Optimum is an extension of Transformers and Diffusers, providing a set of optimization tools enabling maximum efficiency to train and run models on targeted hardware while keeping things easy to use.
- Jax - Composable transformations of Python+NumPy programs: differentiate, vectorize, JIT to GPU/TPU, and more.
- Lava - Lava is an open source framework to develop applications for neuromorphic hardware architectures.
- MLX - MLX is an array framework for machine learning on Apple silicon.
- Modin - Speed up your Pandas workflows by changing a single line of code.
- Nevergrad - Nevergrad is a gradient-free optimisation platform.
- Norse - Norse aims to exploit the advantages of bio-inspired neural components, which are sparse and event-driven - a fundamental difference from artificial neural networks.
- Numba - A compiler for Python array and numerical functions.
- PEFT - Parameter-Efficient Fine-Tuning (PEFT) methods enable efficient adaptation of pre-trained language models (PLMs) to various downstream applications without fine-tuning all the model's parameters.
- PyTorch - PyTorch is a library to develop and train neural network based deep learning models.
- Sonnet - Sonnet is a library built on top of TensorFlow 2 designed to provide simple, composable abstractions for machine learning research.
- scikit-learn - Scikit-learn is a powerful machine learning library that provides a wide variety of modules for data access, data preparation and statistical model building.
- snnTorch - snnTorch is a deep and online learning library with spiking neural networks.
- TensorFlow - TensorFlow is a leading library designed for developing and deploying state-of-the-art machine learning applications.
- ThunderKittens
- TorchOpt - TorchOpt is an efficient library for differentiable optimization built upon PyTorch.
- Vaex - Out-of-Core DataFrames (similar to Pandas), to visualize and explore big tabular datasets. Vaex uses memory mapping, zero memory copy policy and lazy computations for best performance (no memory wasted).
- XGBoost - XGBoost is an optimized distributed gradient boosting library designed to be highly efficient, flexible and portable.
- torchkeras - The torchkeras library is a simple tool for training neural networks in PyTorch, just in a Keras style.
- yellowbrick - yellowbrick provides matplotlib-based model evaluation plots for scikit-learn and other machine learning libraries.
- YDF - YDF (Yggdrasil Decision Forests) is a library to train, evaluate, interpret, and serve Random Forest, Gradient Boosted Decision Trees, CART and Isolation forest models.
- GPUStack - GPUStack is an open-source GPU cluster manager for running AI models.
Model Serving and Monitoring
- ForestFlow - Cloud-native machine learning model server.
- MLWatcher - MLWatcher is a Python agent that records a large variety of time-series metrics of your running ML classification algorithm. It enables you to monitor in real time.
- ONNX Runtime - ONNX Runtime is a cross-platform inference and training machine-learning accelerator.
Evaluation and Monitoring
- Deepchecks - Deepchecks is a holistic open-source solution for all of your AI & ML validation needs, enabling you to test your data and models from research to production thoroughly.
- Evidently - Evidently is an open-source framework to evaluate, test and monitor ML and LLM-powered systems.
- Giskard - Giskard is an open-source Python library that automatically detects performance, bias & security issues in AI applications.
- Langfuse - Langfuse is an observability & analytics solution for LLM-based applications.
- mltrace - mltrace is a lightweight, open-source Python tool to get "bolt-on" observability in ML pipelines.
- Helicone - Helicone is the all-in-one, open-source LLM developer platform.
- NannyML - NannyML is a library that allows you to estimate post-deployment model performance (without access to targets), detect data drift, and intelligently link data drift alerts back to changes in model performance.
- Phoenix - Phoenix is an open-source AI observability platform designed for experimentation, evaluation, and troubleshooting.
- lmms-eval - lmms-eval is an evaluation framework meticulously crafted for consistent and efficient evaluation of large multimodal models (LMMs).
- ARES - ARES is a framework for automatically evaluating Retrieval-Augmented Generation (RAG) models.
- Tonic Validate - Tonic Validate is a high-performance evaluation framework for LLM/RAG outputs.
- TruLens - TruLens provides a set of tools for evaluating and tracking LLM experiments.
- HELM - HELM (Holistic Evaluation of Language Models) provides tools for the holistic evaluation of language models, including standardized datasets, a unified API for various models, diverse metrics, robustness and fairness perturbations, a prompt construction framework, and a proxy server for unified model access.
- RAGChecker - RAGChecker is an advanced automatic evaluation framework designed to assess and diagnose Retrieval-Augmented Generation (RAG) systems.
- AlpacaEval - AlpacaEval is an automatic evaluator for instruction-following language models.
- Banana-lyzer - Banana-lyzer is an open-source AI Agent evaluation framework and dataset for web tasks with Playwright.
- Code Generation LM Evaluation Harness - Code Generation LM Evaluation Harness is a framework for the evaluation of code generation models.
- DeepEval - DeepEval is a simple-to-use, open-source evaluation framework for LLM applications.
- EvalAI - EvalAI is an open-source platform for evaluating and comparing AI algorithms at scale.
- Evals - Evals is a framework for evaluating OpenAI models and an open-source registry of benchmarks.
- EvalScope - EvalScope is a streamlined and customizable framework for efficient large model evaluation and performance benchmarking.
- Evaluate - Evaluate is a library that makes evaluating and comparing models and reporting their performance easier and more standardized.
- Optimum-Benchmark - A unified multi-backend utility for benchmarking Transformers and Diffusers with support for Optimum's arsenal of hardware optimizations/quantization schemes.
- FMBench - FMBench is a tool for running performance benchmarks for any Foundation Model (FM) deployed on any AWS Generative AI service, be it Amazon SageMaker, Amazon Bedrock, Amazon EKS, or Amazon EC2.
- Evalverse - Evalverse is a framework to effortlessly evaluate and report LLMs with no-code requests and comprehensive reports.
- HarmBench - HarmBench is a fast and scalable framework for evaluating automated red teaming methods and LLM attacks/defenses.
- Inspect - Inspect is a framework for large language model evaluations.
- LLM AutoEval - LLM AutoEval simplifies the process of evaluating LLMs using a convenient Colab notebook.
- Language Model Evaluation Harness - Language Model Evaluation Harness is a framework to test generative language models on a large number of different evaluation tasks.
- LightEval - LightEval is a lightweight LLM evaluation suite.
- InterCode - InterCode is a lightweight, flexible, and easy-to-use framework for designing interactive code environments to evaluate language agents that can code.
- MTEB - Massive Text Embedding Benchmark (MTEB) is a comprehensive benchmark of text embeddings.
- LLMPerf - LLMPerf is a tool for evaluating the performance of LLM APIs.
- OLMo-Eval - OLMo-Eval is an evaluation suite for evaluating open language models.
- OpenCompass - OpenCompass is an LLM evaluation platform, supporting a wide range of models (LLaMA, LLaMA 2, ChatGLM2, ChatGPT, Claude, etc.) over 50+ datasets.
- PhaseLLM - PhaseLLM is a large language model evaluation and workflow framework.
- PromptBench - PromptBench is a unified evaluation framework for large language models.
- RewardBench - RewardBench is a benchmark designed to evaluate the capabilities and safety of reward models.
- TrustLLM - TrustLLM is a comprehensive framework to evaluate the trustworthiness of large language models, which includes principles, surveys, and benchmarks.
- UpTrain - UpTrain is an open-source tool for evaluating LLM applications.
- Ragas - Ragas is a framework to evaluate RAG pipelines.
- Rageval - Rageval is a tool to evaluate RAG systems.
- VLMEvalKit - VLMEvalKit is an open-source evaluation toolkit of large vision-language models (LVLMs).
- VBench - VBench is a comprehensive benchmark suite for video generative models.
- LLMonitor - LLMonitor is an observability & analytics platform for AI apps and agents.
- Opik - Opik is an open-source platform for evaluating, testing and monitoring LLM applications.
- RefChecker - RefChecker provides a standardized assessment framework to identify subtle hallucinations present in the outputs of large language models (LLMs).
- TensorFlow Model Analysis - TensorFlow Model Analysis (TFMA) is a library for evaluating TensorFlow models on large amounts of data in a distributed manner, using the same metrics defined in their trainer.
- FlagEval - FlagEval is an open-source evaluation toolkit as well as an open platform for evaluation of large models.
- continuous-eval - continuous-eval is a framework for data-driven evaluation of LLM-powered applications.
- LangTest - LangTest is a comprehensive evaluation toolkit for NLP models.
- AMLB - AMLB is a framework for evaluating and comparing open-source AutoML systems.
- simple-evals - simple-evals is a lightweight library for evaluating language models.
- ANN-Benchmarks - ANN-Benchmarks is a benchmarking environment for approximate nearest neighbor search algorithms.
- BEIR - BEIR is a heterogeneous benchmark containing diverse IR tasks. It also provides a common and easy framework for evaluation of your NLP-based retrieval models within the benchmark.
- COMET - COMET is an open-source framework for machine learning evaluation.
- EvalPlus - EvalPlus is a robust evaluation framework for LLM4Code, featuring expanded HumanEval+ and MBPP+ benchmarks, efficiency assessment (EvalPerf), and a secure, extensible evaluation toolkit.
- GAOKAO-Bench - GAOKAO-Bench is an evaluation framework that uses Chinese National College Entrance Examination (GAOKAO) questions as a dataset to assess large models' language comprehension and logical reasoning abilities.
- HumanEval - HumanEval is a benchmark for evaluating the functional correctness of code generation models using Python programming problems with unit tests.
- JiWER - JiWER is a simple and fast Python package to evaluate an automatic speech recognition system.
- Laminar - Laminar is an open-source platform to trace, evaluate, label, and analyze LLM data for AI products.
- LangWatch - LangWatch is a visual interface for DSPy and a complete LLM Ops platform for monitoring, experimenting, measuring and improving LLM pipelines, with a fair-code distribution model.
- Meta-World - Meta-World is an open-source simulated benchmark for meta-reinforcement learning and multi-task learning consisting of 50 distinct robotic manipulation tasks.
- mir_eval - mir_eval is a Python library which provides a transparent, standardized, and straightforward way to evaluate Music Information Retrieval systems.
- OGB - The Open Graph Benchmark (OGB) is a collection of benchmark datasets, data loaders, and evaluators for graph machine learning.
- Ollama Grid Search - Ollama Grid Search automates the process of selecting the best models, prompts, or inference parameters for a given use-case, allowing you to iterate over their combinations and to visually inspect the results.
- OpenLIT - OpenLIT is an open-source AI engineering platform that simplifies LLM workflows with observability, monitoring, guardrails, evaluations, and seamless integrations.
- Overcooked-AI - Overcooked-AI is a benchmark environment for fully cooperative human-AI task performance, based on the wildly popular video game Overcooked.
- Prometheus-Eval - Prometheus-Eval is a collection of tools for training, evaluating, and using language models specialized in evaluating other language models.
- RagaAI Catalyst - RagaAI Catalyst is a comprehensive platform designed to enhance the management and optimization of LLM projects.
- RLBench - RLBench is an ambitious large-scale benchmark and learning environment designed to facilitate research in a number of vision-guided manipulation research areas, including: reinforcement learning, imitation learning, multi-task learning, geometric computer vision, and in particular, few-shot learning.
- SimplerEnv - SimplerEnv provides simulated manipulation policy evaluation environments for real robot setups.
- SwanLab - SwanLab is an AI training tracking and visualization tool.
- Speech-to-Text Benchmark - Speech-to-Text Benchmark is a minimalist and extensible framework for benchmarking different speech-to-text engines.
- TorchBench - TorchBench is a collection of open source benchmarks used to evaluate PyTorch performance.
- DomainBed - DomainBed is a test suite containing benchmark datasets and algorithms for domain generalization.
- OpenLLMetry - OpenLLMetry provides developers with deep visibility into Large Language Model applications through performance monitoring, execution tracing, and debugging capabilities.
- Massive Text Embedding Benchmark - Massive Text Embedding Benchmark (MTEB) is a comprehensive evaluation framework that assesses the performance of text embedding models across diverse tasks and languages, encompassing 8 embedding tasks, 58 datasets, and 112 languages.
- C-Eval - C-Eval is a comprehensive Chinese evaluation suite for foundation models.
- Evalchemy - Evalchemy is a unified and easy-to-use toolkit for evaluating post-trained language models.
- Promptfoo - Promptfoo is a developer-friendly local tool for testing LLM applications.
- guidellm - guidellm is a benchmarking and performance evaluation tool for large language model inference systems.
- LangTest - LangTest is a comprehensive evaluation toolkit for NLP models.
Privacy and Safety
- NeMo Guardrails - NeMo Guardrails is an open-source toolkit for easily adding programmable guardrails to LLM-based conversational systems.
- DeepTeam - DeepTeam is a simple-to-use, open-source LLM red teaming framework for penetration testing and safeguarding large language model systems.
- Opacus - Opacus is a library that enables training PyTorch models with differential privacy. It requires minimal code changes on the client, has little impact on training performance, and lets the client track the privacy budget expended at any given moment.
- ART - ART (Adversarial Robustness Toolbox) provides tools that enable developers and researchers to defend and evaluate Machine Learning models and applications against the adversarial threats of Evasion, Poisoning, Extraction, and Inference.
- CipherChat - CipherChat is a framework to evaluate the generalization capability of safety alignment for LLMs
- FedML - FedML provides a research and production integrated edge-cloud platform for Federated/Distributed Machine Learning anywhere, at any scale.
- FATE - FATE (Federated AI Technology Enabler) is the world's first industrial grade federated learning open source framework to enable enterprises and institutions to collaborate on data while protecting data security and privacy.
- Flower - Flower is a Federated Learning Framework with a unified approach. It enables the federation of any ML workload, with any ML framework, and any programming language.
- Guardrails - Guardrails is a package that lets a user add structure, type and quality guarantees to the outputs of large language models.
- OpenFL - OpenFL is a Python framework for Federated Learning. OpenFL is designed to be a _flexible_, _extensible_ and _easily learnable_ tool for data scientists. OpenFL is developed by Intel Internet of Things Group (IOTG) and Intel Labs.
- Tensorflow Privacy - A Python library that includes implementations of TensorFlow optimizers for training machine learning models with differential privacy.
- TF Encrypted - A Framework for Confidential Machine Learning on Encrypted Data in TensorFlow.
- Google's Differential Privacy - This is a C++ library of ε-differentially private algorithms, which can be used to produce aggregate statistics over numeric data sets containing private or sensitive information.
- PySyft - A Python library for secure, private Deep Learning. PySyft decouples private data from model training, using Multi-Party Computation (MPC) within PyTorch.
- Awesome Production GenAI - Focuses specifically on generative AI deployment, including LLM operations, prompt engineering, and GenAI-specific monitoring and safety tools.
- Awesome AI Regulation - Covers governance, compliance, and regulatory frameworks essential for responsible ML system deployment across different jurisdictions.
- AI Gateway - The AI Gateway is a blazing fast AI Gateway with integrated guardrails.
Data Pipeline
- Apache Airflow - Data Pipeline framework built in Python, including scheduler, DAG definition and a UI for visualisation.
- Apache Nifi - Apache NiFi was made for dataflow. It supports highly configurable directed graphs of data routing, transformation, and system mediation logic.
- Argo Workflows - Argo Workflows is an open source container-native workflow engine for orchestrating parallel jobs on Kubernetes. Argo Workflows is implemented as a Kubernetes CRD (Custom Resource Definition).
- Azkaban - Azkaban is a batch workflow job scheduler created at LinkedIn to run Hadoop jobs. Azkaban resolves the ordering through job dependencies and provides an easy to use web user interface to maintain and track your workflows.
- Basin - Visual programming editor for building Spark and PySpark pipelines.
- BatchFlow - BatchFlow helps data scientists conveniently work with random or sequential batches of your data and define data processing and machine learning workflows for large datasets.
- Bonobo - ETL framework for Python 3.5+ with focus on simple atomic operations working concurrently on rows of data.
- Chronos - More of a job scheduler for Mesos than ETL pipeline.
- Couler - Unified interface for constructing and managing machine learning workflows on different workflow engines, such as Argo Workflows, Tekton Pipelines, and Apache Airflow.
- DataTrove - DataTrove is a library to process, filter and deduplicate text data at a very large scale.
- D6tflow - A python library that allows for building complex data science workflows on Python.
- Dagster - A data orchestrator for machine learning, analytics, and ETL.
- DBND - DBND is an agile pipeline framework that helps data engineering teams track and orchestrate their data processes.
- DBT - ETL tool for running transformations inside data warehouses.
- Genie - Job orchestration engine to interface and trigger the execution of jobs from Hadoop-based systems.
- Gokart - Wrapper of the data pipeline Luigi.
- Apache Oozie - Workflow scheduler for Hadoop jobs.
- Luigi - Luigi is a Python module that helps you build complex pipelines of batch jobs, handling dependency resolution, workflow management, visualisation, etc.
- Neuraxle - A framework for building neat pipelines, providing the right abstractions to chain your data transformation and prediction steps with data streaming, as well as doing hyperparameter searches (AutoML).
- Pachyderm - Open source distributed processing framework built on Kubernetes, focused mainly on dynamic building of production machine learning pipelines - [(Video)](https://www.youtube.com/watch?v=LamKVhe2RSM).
- PipelineX - Based on Kedro and MLflow. Full comparison is found [here](https://github.com/Minyus/Python_Packages_for_Pipeline_Workflow).
- Ploomber - The fastest way to build data pipelines. Develop iteratively, deploy anywhere.
- Prefect Core - Workflow management system that makes it easy to take your data pipelines and add semantics like retries, logging, dynamic mapping, caching, failure notifications, and more.
- Snakemake - Workflow management system for reproducible and scalable data analyses.
- Towhee - General-purpose machine learning pipeline for generating embedding vectors using one or many ML models.
- ZenML - ZenML is an extensible, open-source MLOps framework to create reproducible ML pipelines with a focus on automated metadata tracking, caching, and many integrations to other tools.
- Sycamore - Sycamore is an open source, AI-powered document processing engine for ETL, RAG, LLM-based applications, and analytics on unstructured data.
- unstructured - unstructured streamlines and optimizes the data processing workflow for LLMs, ingesting and pre-processing images and text documents, such as PDFs, HTML, Word docs, and many more.
- Instill VDP - Instill VDP (Versatile Data Pipeline) aims to streamline the data processing pipelines from inception to completion.
- Kedro - Kedro is a workflow development tool that helps you build data pipelines that are robust, scalable, deployable, reproducible and versioned.
- Metaflow - A framework for data scientists to easily build and manage real-life data science projects.
- DALL·E Flow - DALL·E Flow is an interactive workflow for generating high-definition images from text prompts.
- Instructor - Instructor makes it easy to get structured data like JSON from LLMs like GPT-3.5, GPT-4, GPT-4-Vision, and open-source models.
- SeqIO - SeqIO is a library for processing sequential data to be fed into downstream sequence models.
- Flyte - Lyft’s Cloud Native Machine Learning and Data Processing Platform - [(Demo)](https://youtu.be/KdUJGSP1h9U?t=1451).
- Pixeltable - Open-source Python library providing declarative, incremental data infrastructure for building and managing multimodal AI workloads.
Data Annotation and Synthesis
- cleanlab - Python library for data-centric AI. Can automatically: find mislabeled data, detect outliers, estimate consensus + annotator-quality for multi-annotator datasets, suggest which data is best to (re)label next.
- COCO Annotator - Web-based image segmentation tool for object detection, localization and keypoints.
- refinery - The data scientist's open-source choice to scale, assess and maintain natural language data.
- SDV - Synthetic Data Vault (SDV) is a Synthetic Data Generation ecosystem of libraries that allows users to easily learn single-table, multi-table and timeseries datasets to later on generate new Synthetic Data that has the same format and statistical properties as the original dataset.
- Doccano - Open source text annotation tools for humans, providing functionality for sentiment analysis, named entity recognition, and machine translation.
- Label Studio - Multi-domain data labeling and annotation tool with standardized output format.
- Argilla - Argilla helps domain experts and data teams to build better NLP datasets in less time.
- ViPE - ViPE is a spatial AI tool for annotating camera poses and dense depth maps from raw videos.
- CVAT - CVAT (Computer Vision Annotation Tool) is OpenCV's web-based annotation tool for both videos and images for computer vision algorithms.
- Gretel Synthetics - Gretel Synthetics is a synthetic data generator for structured and unstructured text, featuring differentially private learning.
- NeMo Curator - NeMo Curator is a GPU-accelerated framework for efficient large language model data curation.
- synthcity - synthcity is a library for generating and evaluating synthetic tabular data.
- YData Synthetic - YData Synthetic is a package to generate synthetic tabular and time-series data leveraging the state of the art generative models.
- Semantic Segmentation Editor - Hitachi's open source tool for labelling camera and LIDAR data.
Data Labelling and Synthesis
- Baal - Baal is an active learning library that supports both industrial applications and research usecases.
- brat rapid annotation tool - Web-based text annotation tool for named-entity recognition tasks.
- ImageTagger - Image labelling tool with support for collaboration, supporting bounding box, polygon, line, point labelling, label export, etc.
- ImgLab - Image annotation tool for bounding boxes with auto-suggestion and extensibility for plugins.
- makesense.ai - Free to use online tool for labelling photos. Prepared labels can be downloaded in one of multiple supported formats.
- MedTagger - A collaborative framework for annotating medical datasets using crowdsourcing.
- modAL - modAL is an active learning framework designed with modularity, flexibility and extensibility in mind.
- OpenLabeling - Open source tool for labelling images with support for labels, edges, as well as image resizing and zooming in.
- PixelAnnotationTool - Image annotation tool with ability to "colour" on the images to select labels for segmentation. Process is semi-automated with the [watershed marked algorithm of OpenCV](docs.opencv.org/3.1.0/d7/d1b/group__imgproc__misc.html#ga3267243e4d3f95165d55a618c65ac6e1)
- Snorkel - Snorkel is a system for quickly generating training data with weak supervision.
- Superintendent - superintendent provides an ipywidget-based interactive labelling tool for your data.
- Synthetic Data SDK - Synthetic Data SDK is a Python toolkit for high-fidelity, privacy-safe synthetic data.
Industry Strength Reinforcement Learning
- AReaL - AReaL is a reinforcement learning library.
- OpenRLHF - OpenRLHF is an open-source framework for reinforcement learning from human feedback (RLHF).
- RL2 - RL2 is a reinforcement learning library.
- RLinf - RLinf is a reinforcement learning library.
- ROLL - ROLL is a reinforcement learning library.
- Acme - Acme is a library of reinforcement learning (RL) building blocks that strives to expose simple, efficient, and readable agents.
- CleanRL - CleanRL is a Deep Reinforcement Learning library that provides high-quality single-file implementation with research-friendly features. The implementation is clean and simple, yet we can scale it to run thousands of experiments using AWS Batch.
- CompilerGym - CompilerGym is a library of easy to use and performant reinforcement learning environments for compiler tasks.
- d3rlpy - d3rlpy is an offline deep reinforcement learning library for practitioners and researchers.
- rLLM - rLLM is an open-source framework for post-training language agents via reinforcement learning.
- veRL - veRL (HybridFlow) is a flexible, efficient and industrial-level RL(HF) training framework designed for LLMs.
- MLGym - MLGym is a gym environment enabling research on reinforcement learning (RL) algorithms for training such agents for ML tasks.
- D4RL - D4RL is an open-source benchmark for offline reinforcement learning.
- Dopamine - Dopamine is a research framework for fast prototyping of reinforcement learning algorithms. It aims to fill the need for a small, easily grokked codebase in which users can freely experiment with wild ideas (speculative research).
- EvoTorch - EvoTorch is an open source evolutionary computation library developed at NNAISENSE, built on top of PyTorch.
- FinRL - FinRL is the first open-source framework to demonstrate the great potential of financial reinforcement learning.
- Gymnasium - Gymnasium is an open source Python library for developing and comparing reinforcement learning algorithms by providing a standard API to communicate between learning algorithms and environments, as well as a standard set of environments compliant with that API.
- Gymnasium-Robotics - Gymnasium-Robotics contains a collection of Reinforcement Learning robotic environments that use the Gymnasium API. The environments run with the MuJoCo physics engine and the maintained mujoco python bindings.
- Jumanji - Jumanji is a suite of Reinforcement Learning (RL) environments written in JAX providing clean, hardware-accelerated environments for industry-driven research.
- MARLlib - MARLlib is a comprehensive Multi-Agent Reinforcement Learning algorithm library based on RLlib. It provides MARL research community with a unified platform for building, training, and evaluating MARL algorithms.
- Mava - Mava is a framework for distributed multi-agent reinforcement learning in JAX.
- MetaDrive - MetaDrive is a driving simulator that composes diverse driving scenarios for generalizable RL.
- Minigrid - The Minigrid library contains a collection of discrete grid-world environments to conduct research on Reinforcement Learning. The environments follow the Gymnasium standard API and they are designed to be lightweight, fast, and easily customizable.
- MiniWorld - MiniWorld is a minimalistic 3D interior environment simulator for reinforcement learning & robotics research.
- ML-Agents - ML-Agents is an open-source project that enables games and simulations to serve as environments for training reinforcement learning intelligent agents.
- MushroomRL - MushroomRL is a Python reinforcement learning (RL) library whose modularity makes it easy to use well-known Python libraries for tensor computation (e.g. PyTorch, Tensorflow) and RL benchmarks (e.g. OpenAI Gym, PyBullet, Deepmind Control Suite).
- OmniSafe - OmniSafe is an infrastructural framework designed to accelerate safe reinforcement learning (RL) research.
- PARL - PARL is a flexible and highly efficient reinforcement learning framework.
- PettingZoo - PettingZoo is a Python library for conducting research in multi-agent reinforcement learning, akin to a multi-agent version of Gymnasium.
- ranx - ranx is a library of fast ranking evaluation metrics implemented in Python, leveraging Numba for high-speed vector operations and automatic parallelization.
- RL4CO - RL4CO is a PyTorch library for all things reinforcement learning for combinatorial optimization (CO).
- skrl - skrl is an open-source modular library for Reinforcement Learning written in Python (using PyTorch) and designed with a focus on readability, simplicity, and transparency of algorithm implementation.
- Stable Baselines - A fork of OpenAI Baselines, implementations of reinforcement learning algorithms.
- TF-Agents - A reliable, scalable and easy to use TensorFlow library for contextual bandits and reinforcement learning.
- TRL - Train transformer language models with reinforcement learning.
- slime - slime is an LLM post-training framework for RL Scaling.
- Melting Pot - Melting Pot is a suite of test scenarios for multi-agent reinforcement learning.
- TorchRL - TorchRL is an open-source Reinforcement Learning (RL) library for PyTorch.
Industry-strength Anomaly Detection
- Darts - Darts is a library for user-friendly forecasting and anomaly detection on time series.
- Alibi Detect - alibi-detect is a Python package focused on outlier, adversarial and concept drift detection.
- Deequ - A library built on top of Apache Spark for defining "unit tests for data", which measure data quality in large datasets.
- PyOD - A Python Toolbox for Scalable Outlier Detection (Anomaly Detection).
- TFDV - TFDV (Tensorflow Data Validation) is a library for exploring and validating machine learning data.
Industry Strength Information Retrieval
- fastRAG - fastRAG is a research framework for efficient and optimized retrieval augmented generative pipelines, incorporating state-of-the-art LLMs and Information Retrieval.
- HippoRAG - HippoRAG is a novel retrieval augmented generation (RAG) framework inspired by the neurobiology of human long-term memory that enables LLMs to continuously integrate knowledge across external documents.
- BGE - BGE is a one-stop retrieval toolkit for search and RAG.
- GraphRAG - GraphRAG is a data pipeline and transformation suite that is designed to extract meaningful, structured data from unstructured text using the power of LLMs.
- LightRAG - A simple and fast retrieval-augmented generation framework.
- NMSLIB - Non-Metric Space Library (NMSLIB): An efficient similarity search library and a toolkit for evaluation of k-NN methods for generic non-metric spaces.
- Qdrant - An open source vector similarity search engine with extended filtering support.
- RAGFlow - RAGFlow is a RAG engine based on deep document understanding.
- RAGxplorer - RAGxplorer is a tool to build RAG visualisations.
- Vanna - Vanna is a RAG framework for SQL generation and related functionality.
- R2R - R2R (RAG to Riches) is a comprehensive platform for building, deploying, and scaling RAG applications with hybrid search, multimodal support, and advanced observability.
- RAG-FiT - RAG-FiT is a library designed to improve LLMs' ability to use external information by fine-tuning models on specially created RAG-augmented datasets.
- TextWorld - TextWorld is a text-based game generator and extensible sandbox learning environment for training and testing reinforcement learning (RL) agents.
- AutoRAG - AutoRAG is a RAG AutoML tool that automatically finds an optimal RAG pipeline for your data.
- Cognita - Cognita is a RAG framework for building modular and production-ready applications.
- DocArray - DocArray is a library for nested, unstructured, multimodal data in transit, including text, image, audio, video, 3D mesh, etc. It allows deep-learning engineers to efficiently process, embed, search, recommend, store, and transfer multimodal data with a Pythonic API.
- Faiss - Faiss is a library for efficient similarity search and clustering of dense vectors.
- JamAI Base - JamAI Base is an open-source RAG (Retrieval-Augmented Generation) backend platform that integrates an embedded database (SQLite) and an embedded vector database (LanceDB) with managed memory and RAG capabilities. It features built-in LLM, vector embeddings, and reranker orchestration and management, all accessible through a convenient, intuitive, spreadsheet-like UI and a simple REST API.
- llmware - llmware provides a unified framework for building LLM-based applications (e.g., RAG, Agents), using small, specialized models that can be deployed privately, integrated with enterprise knowledge sources safely and securely, and cost-effectively tuned and adapted for any business process.
- Mem0 - Mem0 enhances AI assistants and agents with an intelligent memory layer, enabling personalized AI interactions.
- NGT - NGT provides commands and a library for performing high-speed approximate nearest neighbor searches against a large volume of data in high dimensional vector data space.
- LangExtract - LangExtract is a Python library that uses LLMs to extract structured information from unstructured text documents based on user-defined instructions. It processes materials such as clinical notes or reports, identifying and organizing key details while ensuring the extracted data corresponds to the source text.
Feature Store
- Feathr - A scalable, unified data and AI engineering platform for enterprise.
- Butterfree - A tool for building feature stores which allows you to transform your raw data into beautiful features.
- FEAST - Feast (Feature Store) is an open source feature store for machine learning. Feast is the fastest path to manage existing infrastructure to productionize analytic data for model training and online inference.
- Featureform - A virtual featurestore. Plug-&-play with your existing infra. Data Scientist approved. Discovery, Governance, Lineage, & Collaboration just a pip install away. Supports pandas, Python, spark, SQL + integrations with major cloud vendors.
Industry-strength AD
- TextAttack - TextAttack is a Python framework for adversarial attacks, data augmentation, and model training in NLP.
- TODS - TODS is a full-stack automated machine learning system for outlier detection on multivariate time-series data.
- adtk - A Python toolkit for rule-based/unsupervised anomaly detection in time series.
- Deep Anomaly Detection with Outlier Exposure - Outlier Exposure (OE) is a method for improving anomaly detection performance in deep learning models. [Paper](https://arxiv.org/pdf/1812.04606.pdf)
- SUOD - SUOD (Scalable Unsupervised Outlier Detection) is an acceleration system for large-scale anomaly/outlier detection.
DS Notebook
- H2O Flow - Jupyter notebook-like interface for H2O to create, save and re-use "flows".
- ML Workspace - All-in-one web IDE for machine learning and data science. Combines Jupyter, VS Code, Tensorflow, and many other tools/libraries into one Docker image.
Industry Strength CV
- JDiffusion - JDiffusion is a diffusion model library for generating images or videos based on Diffusers and Jittor.
- MMDetection - MMDetection is an open source object detection toolbox based on PyTorch.
- SCEPTER - SCEPTER is an open-source code repository dedicated to generative training, fine-tuning, and inference, encompassing a suite of downstream tasks such as image generation, transfer, editing.
- VISSL - VISSL is FAIR's library of extensible, modular and scalable components for SOTA Self-Supervised Learning with images.
- iGibson - iGibson is a simulation environment providing fast visual rendering and physics simulation based on Bullet.
Industry Strength Computer Vision
- Deep Lake - Deep Lake is a data infrastructure optimized for computer vision.
- VideoSys - VideoSys supports many diffusion models with our various acceleration techniques, enabling these models to run faster and consume less memory.
- LightlyTrain - Pretrain computer vision models on unlabeled data for industrial applications.
- libcom - libcom is an image composition toolbox.
- Detectron2 - Detectron2 is Facebook AI Research's next generation library that provides state-of-the-art detection and segmentation algorithms.
- KerasCV - KerasCV is a library of modular computer vision oriented Keras components.
- LAVIS - LAVIS is a deep learning library for LAnguage-and-VISion intelligence research and applications.
- SuperGradients - SuperGradients is an open-source library for training PyTorch-based computer vision models.
- supervision - Supervision is a Python library designed for efficient computer vision pipeline management, providing tools for annotation, visualization, and monitoring of models.
- MMCV - MMCV is a foundational computer vision library from OpenMMLab that provides essential functionalities like image and video processing, data transformation and augmentation, CNN architectures, and optimized CUDA operations.
- Kornia - Kornia is a differentiable computer vision library built on PyTorch that provides a rich set of differentiable image processing and geometric vision algorithms.
Industry Strength RecSys
- Implicit - Implicit provides fast Python implementations of several different popular recommendation algorithms for implicit feedback datasets
- LightFM - LightFM is a Python implementation of a number of popular recommendation algorithms for both implicit and explicit feedback
- NVTabular - NVTabular is a feature engineering and preprocessing library for tabular data that is designed to easily manipulate terabyte scale datasets and train deep learning (DL) based recommender systems.
- Surprise - Surprise is a Python scikit for building and analyzing recommender systems that deal with explicit rating data.
- YouTokenToMe - YouTokenToMe is an unsupervised text tokenizer focused on computational efficiency. It currently implements fast [Byte Pair Encoding](https://arxiv.org/abs/1508.07909) (BPE).
Model Storage Optimisation
- AWQ - Activation-aware Weight Quantization for LLM Compression and Acceleration.
- Quanto - Quanto aims to simplify quantizing deep learning models.
- AutoAWQ - AutoAWQ is an easy-to-use package for 4-bit quantized models.
- neural-compressor - Intel® Neural Compressor aims to provide popular model compression techniques such as quantization, pruning (sparsity), distillation, and neural architecture search on mainstream frameworks.
- GPTQ - Accurate Post-training Quantization of Generative Pretrained Transformers.
- AutoGPTQ - An easy-to-use LLM quantization package with user-friendly APIs, based on the GPTQ algorithm.
- MMdnn - MMdnn is a comprehensive cross-framework tool from Microsoft that facilitates model conversion, visualization, and deployment across various deep learning frameworks.
- NNEF - Neural Network Exchange Format (NNEF) is an open standard for representing neural network models to enable interoperability and portability across different machine learning frameworks and platforms.
- ONNX - ONNX (Open Neural Network Exchange) is an open-source format designed to facilitate interoperability and portability of machine learning models across different frameworks and platforms.
- PFA - PFA (Portable Format for Analytics) format is a standard for representing and exchanging predictive models and analytics workflows in a portable, JSON-based format.
- PMML - PMML (Predictive Model Markup Language) is an XML-based standard for representing and sharing predictive models between different applications.
- GGML - GGML is a high-performance tensor library for machine learning that enables efficient inference on CPUs, particularly optimized for large language models.
Neural Search and Retrieval
- Annoy - Annoy (Approximate Nearest Neighbors Oh Yeah) is a C++ library with Python bindings to search for points in space that are close to a given query point.
- Finetuner - Finetuner provides an effective way to improve performance on neural search tasks.
- BeyondLLM - Beyond LLM offers an all-in-one toolkit for experimentation, evaluation, and deployment of RAG systems, simplifying the process with automated integration, customizable evaluation metrics, and support for various LLMs tailored to specific needs, ultimately aiming to reduce LLM hallucination risks and enhance reliability.
- CLIP-as-service - CLIP-as-service is a low-latency high-scalability service for embedding images and text. It can be easily integrated as a microservice into neural search solutions.
- MindSQL - MindSQL is a Python RAG library to streamline the interaction between users and their databases using just a few lines of code.
- Rule-based Retrieval - Rule-based Retrieval enables users to create and manage RAG applications with advanced filtering capabilities.
Optimized Computation
- BrainCog - BrainCog (Brain-inspired Cognitive Intelligence Engine) is a brain-inspired spiking neural network based platform for Brain-inspired Artificial Intelligence and simulating brains at multiple scales.
- NumpyGroupies - Optimised tools for group-indexing operations: aggregated sum and more.
- OpenFlamingo - OpenFlamingo is an open-source framework for training large multimodal models.
- Tensor2Tensor - Tensor2Tensor is a library of deep learning models and datasets designed to make deep learning more accessible and accelerate ML research.
- Weld - High-performance runtime for data analytics applications. Here is an [interview](https://www.notamonadtutorial.com/weld-accelerating-numpy-scikit-and-pandas-as-much-as-100x-with-rust-and-llvm) with Weld’s main contributor.
Industry Strength Recommender System
- TorchRec - TorchRec is a PyTorch domain library built to provide common sparsity and parallelism primitives needed for large-scale recommender systems (RecSys).
- EasyRec - EasyRec is a framework for large scale recommendation algorithms.
- Gorse - Gorse aims to be a universal open-source recommender system that can be quickly introduced into a wide variety of online services.
- Merlin - NVIDIA Merlin is an open source library providing end-to-end GPU-accelerated recommender systems, from feature engineering and preprocessing to training deep learning models and running inference in production.
- Recommenders - Recommenders contains benchmarks and best practices for building recommendation systems, provided as Jupyter notebooks.
Industry Strength Robotics
- AI2-THOR - AI2-THOR is a near photo-realistic interactable framework for AI agents.
- Habitat-Sim - Habitat-Sim is a flexible, high-performance 3D simulator for Embodied AI research.
- IsaacLab - IsaacLab is a unified and modular framework for robot learning that leverages NVIDIA Isaac Sim.
- robosuite - robosuite is a simulation framework powered by the MuJoCo physics engine for robot learning.
- RoboVerse - RoboVerse is a comprehensive robotics simulation platform with diverse environments.

Programming Languages

Python 466 Jupyter Notebook 58 C++ 40 Java 29 TypeScript 29 Go 23 Rust 12 JavaScript 8 Scala 8 HTML 7
