awesome-mlops
:sunglasses: A curated list of awesome MLOps tools
https://github.com/kelvins/awesome-mlops
-
Other Lists
-
Data Validation
- JSON Schema - A vocabulary that allows you to annotate and validate JSON documents.
- Cerberus - Lightweight, extensible data validation library for Python.
- Cleanlab - Python library for data-centric AI and machine learning with messy, real-world data and labels.
- TFDV - A library for exploring and validating machine learning data.
-
Data Exploration
- Jupyter Notebook - Web-based notebook environment for interactive computing.
- Apache Zeppelin - Enables data-driven, interactive data analytics and collaborative documents.
- Polynote - The polyglot notebook with first-class Scala support.
- Jupytext - Jupyter Notebooks as Markdown Documents, Julia, Python or R scripts.
- DataPrep - Collect, clean and visualize your data in Python.
- BambooLib - An intuitive GUI for Pandas DataFrames.
- Pandas Profiling - Create HTML profiling reports from pandas DataFrame objects.
- Deepnote - Drop-in replacement for Jupyter and an AI-native workspace for modern data teams.
-
Data Processing
- Airflow - Platform to programmatically author, schedule, and monitor workflows.
- Spark - Unified analytics engine for large-scale data processing.
- Hadoop - Framework that allows for the distributed processing of large data sets across clusters.
- OpenRefine - Power tool for working with messy data and improving it.
- Dagster - A data orchestrator for machine learning, analytics, and ETL.
- Azkaban - Batch workflow job scheduler created at LinkedIn to run Hadoop jobs.
-
Model Lifecycle
- Neptune AI - The most lightweight experiment management tool that fits any workflow.
- MLflow - Open source platform for the machine learning lifecycle.
- Guild AI - Open source experiment tracking, pipeline automation, and hyperparameter tuning.
- Comet - Track your datasets, code changes, experimentation history, and models.
- Aim - A super-easy way to record, search and compare 1000s of ML training runs.
- Sacred - A tool to help you configure, organize, log and reproduce experiments.
- Keepsake - Version control for machine learning with support for Amazon S3 and Google Cloud Storage.
- Aeromancy - A framework for performing reproducible AI and ML for Weights and Biases.
- Cascade - Library of ML-Engineering tools for rapid prototyping and experiment management.
- ModelDB - Open source ML model versioning, metadata, and experiment management.
- Weights and Biases - A tool for visualizing and tracking your machine learning experiments.
- Losswise - Makes it easy to track the progress of a machine learning project.
-
Optimization Tools
- MLlib - Apache Spark's scalable machine learning library.
- Mahout - Distributed linear algebra framework and mathematically expressive Scala DSL.
- Dask - Provides advanced parallelism for analytics, enabling performance at scale for the tools you love.
- Rapids - Gives the ability to execute end-to-end data science and analytics pipelines entirely on GPUs.
- Singa - Apache top level project, focusing on distributed training of DL and ML models.
- Horovod - Distributed deep learning training framework for TensorFlow, Keras, PyTorch, and Apache MXNet.
- Petastorm - Enables single machine or distributed training and evaluation of deep learning models.
- Tpot - Automated ML tool that optimizes machine learning pipelines using genetic programming.
- Ray - Fast and simple framework for building and running distributed applications.
- Accelerate - A simple way to train and use PyTorch models with multi-GPU, TPU, mixed-precision.
- Modin - Speed up your Pandas workflows by changing a single line of code.
- Nos - Open-source module for running AI workloads on Kubernetes in an optimized way.
- Fiber - Python distributed computing library for modern computer clusters.
- Nebullvm - Easy-to-use library to boost AI inference.
- DeepSpeed - Deep learning optimization library that makes distributed training easy, efficient, and effective.
-
Machine Learning Platform
- Sagemaker - Fully managed service that provides the ability to build, train, and deploy ML models quickly.
- Kubeflow - Making deployments of ML workflows on Kubernetes simple, portable and scalable.
- DataRobot - AI platform that democratizes data science and automates the end-to-end ML at scale.
- DAGsHub - A platform built on open source tools for data, model and pipeline management.
- Dataiku - Platform democratizing access to data and enabling enterprises to build their own path to AI.
- Iguazio - Data science platform that automates MLOps with end-to-end machine learning pipelines.
- Katonic - Automate your cycle of intelligence with Katonic MLOps Platform.
- SigOpt - A platform that makes it easy to track runs, visualize training, and scale hyperparameter tuning.
- Valohai - MLOps platform for reproducible ML and LLM workflows from experimentation to production.
- aiWARE - aiWARE helps MLOps teams evaluate, deploy, integrate, scale & monitor ML models.
- Algorithmia - Securely govern your machine learning operations with a healthy ML lifecycle.
- Allegro AI - Transform ML/DL research into products. Faster.
- Bodywork - Deploys machine learning projects developed in Python, to Kubernetes.
- CNVRG - An end-to-end machine learning platform to build and deploy AI models at scale.
- Edge Impulse - Platform for creating, optimizing, and deploying AI/ML algorithms for edge devices.
- FedML - Simplifies the workflow of federated learning anywhere at any scale.
- Gradient - Multicloud CI/CD and MLOps platform for machine learning teams.
- Hopsworks - Open-source platform for developing and operating machine learning models at scale.
- Knime - Create and productionize data science using one easy and intuitive environment.
- LynxKite - A complete graph data science platform for very large graphs and other datasets.
- Modzy - Deploy, connect, run, and monitor machine learning (ML) models in the enterprise and at the edge.
- Omnimizer - Simplifies and accelerates MLOps by bridging the gap between ML models and edge hardware.
- Pachyderm - Combines data lineage with end-to-end pipelines on Kubernetes, engineered for the enterprise.
- SAS Viya - Cloud native AI, analytic and data management platform that supports the analytics life cycle.
- Sematic - An open-source end-to-end pipelining tool to go from laptop prototype to cloud in no time.
- TrueFoundry - A Cloud-native MLOps Platform over Kubernetes to simplify training and serving of ML Models.
- Neu.ro - MLOps platform that integrates open-source and proprietary tools into client-oriented systems.
- ML Workspace - All-in-one web-based IDE specialized for machine learning and data science.
- MLReef - Open source MLOps platform that helps you collaborate, reproduce and share your ML work.
- envd - Machine learning development environment for data science and AI/ML engineering teams.
- Neurolink - TypeScript-first multi-provider AI agent framework with workflow orchestration and MCP support.
- Polyaxon - A platform for reproducible and scalable machine learning and deep learning on Kubernetes.
-
Cron Job Monitoring
- HealthchecksIO - Simple and effective cron job monitoring.
- Cronitor - Monitor any cron job or scheduled task.
- Heartbeat.pm - Monitoring aliveness of any sensor/cron job.
-
Data Visualization
- Metabase - The simplest, fastest way to get business intelligence and analytics to everyone.
- Redash - Connect to any data source, easily visualize, dashboard and share your data.
- Data Studio - Reporting solution for power users who want to go beyond the data and dashboards of GA.
- Grafana - Multi-platform open source analytics and interactive visualization web application.
- Dash - Analytical Web Apps for Python, R, Julia, and Jupyter.
- Facets - Visualizations for understanding and analyzing machine learning datasets.
- Lux - Fast and easy data exploration by automating the visualization and data analysis process.
- SolidUI - AI-generated visualization prototyping and editing platform that supports 2D and 3D models.
-
Podcasts
- Practical AI: Machine Learning, Data Science
- This Week in Machine Learning & AI
- Kubernetes Podcast from Google
- Pipeline Conversation
- True ML Talks
- AI Stories Podcast
- Machine Learning – Software Engineering Daily
- MLOps.community
-
Hyperparameter Tuning
- Optuna - Open source hyperparameter optimization framework to automate hyperparameter search.
- Tune - Python library for experiment execution and hyperparameter tuning at any scale.
- Hyperopt - Distributed Asynchronous Hyperparameter Optimization in Python.
- Hyperas - A very simple wrapper for convenient hyperparameter optimization.
- Talos - Hyperparameter Optimization for TensorFlow, Keras and PyTorch.
- KerasTuner - Easy-to-use, scalable hyperparameter optimization framework.
- Advisor - Open-source implementation of Google Vizier for hyperparameter tuning.
- Katib - Kubernetes-based system for hyperparameter tuning and neural architecture search.
- Scikit Optimize - Simple and efficient library to minimize expensive and noisy black-box functions.
-
Workflow Tools
- Metaflow - Human-friendly lib that helps scientists and engineers build and manage data science projects.
- Flyte - Easy to create concurrent, scalable, and maintainable workflows for machine learning.
- Automate Studio - Rapidly build & deploy AI-powered workflows.
- Luigi - Python module that helps you build complex pipelines of batch jobs.
- Kale - Aims at simplifying the Data Science experience of deploying Kubeflow Pipelines workflows.
- Ploomber - Write maintainable, production-ready pipelines. Develop locally, deploy to the cloud.
- Couler - Unified interface for constructing and managing workflows on different workflow engines.
- dstack - An open-core tool to automate data and training workflows.
- MLRun - Generic mechanism for data scientists to build, run, and monitor ML tasks and pipelines.
- Argo - Open source container-native workflow engine for orchestrating parallel jobs on Kubernetes.
- Kedro - Library that implements software engineering best-practice for data and ML pipelines.
- Velda - Run jobs and workflows as if on your local machine.
- VDP - An open-source tool to seamlessly integrate AI for unstructured data into the modern data stack.
- ZenML - An extensible open-source MLOps framework to create reproducible pipelines.
- Cordum - Governance-first control plane for AI agents and external workers.
- Hamilton - A scalable general purpose micro-framework for defining dataflows.
- Orchest - Visual pipeline editor and workflow orchestrator with an easy to use UI and based on Kubernetes.
-
Websites
-
Data Management
- DVC - Management and versioning of datasets and machine learning models.
- Arrikto - Dead simple, ultra fast storage for the hybrid Kubernetes world.
- Quilt - A self-organizing data hub with S3 support.
- Dolt - SQL database that you can fork, clone, branch, merge, push and pull just like a git repository.
- Qdrant - An open source vector similarity search engine with extended filtering support.
- Dud - A lightweight CLI tool for versioning data alongside source code and building data pipelines.
- Delta Lake - Storage layer that brings scalable, ACID transactions to Apache Spark and other engines.
- lakeFS - Repeatable, atomic and versioned data lake on top of object storage.
- BlazingSQL - A lightweight, GPU accelerated, SQL engine for Python. Built on RAPIDS cuDF.
- Marquez - Collect, aggregate, and visualize a data ecosystem's metadata.
- Intake - A lightweight set of tools for loading and sharing data in data science projects.
- Hub - A dataset format for creating, storing, and collaborating on AI datasets of any size.
- Milvus - An open source embedding vector similarity search engine powered by Faiss, NMSLIB and Annoy.
- Git LFS - An open source Git extension for versioning large files.
-
Simplification Tools
- PyCaret - Open source, low-code machine learning library in Python.
- Chassis - Turns models into ML-friendly containers that run just about anywhere.
- TrainGenerator - A web app to generate template code for machine learning.
- Turi Create - Simplifies the development of custom machine learning models.
- Koalas - Pandas API on Apache Spark. Makes data scientists more productive when interacting with big data.
- Hydra - A framework for elegantly configuring complex applications.
- Soopervisor - Export ML projects to Kubernetes (Argo workflows), Airflow, AWS Batch, and SLURM.
- Soorgeon - Convert monolithic Jupyter notebooks into maintainable pipelines.
- MLNotify - No need to keep checking your training, just one import line and you'll know the second it's done.
- Hermione - Helps data scientists set up better-organized code more quickly and simply.
- Sagify - A CLI utility to train and deploy ML/DL models on AWS SageMaker.
- Ludwig - Allows users to train and test deep learning models without the need to write code.
-
Visual Analysis and Debugging
- Fiddler - Monitor, explain, and analyze your AI in production.
- Aporia - Observability with customized monitoring and explainability for ML models.
- Superwise - Fully automated, enterprise-grade model observability in a self-service SaaS platform.
- Netron - Visualizer for neural network, deep learning, and machine learning models.
- Yellowbrick - Visual analysis and diagnostic tools to facilitate machine learning model selection.
- Evidently - Interactive reports to analyze ML models during validation or production monitoring.
- Manifest - Open-source real-time cost observability for AI agents.
- Whylogs - The open source standard for data logging. Enables ML monitoring and observability.
- Manifold - A model-agnostic visual debugging tool for machine learning.
- NannyML - Algorithm capable of fully capturing the impact of data drift on performance.
- Opik - Evaluate, test, and ship LLM applications with a suite of observability tools.
- Radicalbit - The open source solution for monitoring your AI models in production.
- Rhesis - Testing infrastructure for LLM and agentic applications with collaborative evaluation.
- Arize - A free end-to-end ML observability and model monitoring platform.
-
Feature Store
- Tecton - A fully-managed feature platform built to orchestrate the complete lifecycle of features.
- Feast - End-to-end open source feature store for machine learning.
- Featureform - A Virtual Feature Store. Turn your existing data infrastructure into a feature store.
- Butterfree - A tool for building feature stores. Transform your raw data into beautiful features.
- ByteHub - An easy-to-use feature store. Optimized for time-series data.
- Feathr - An enterprise-grade, high performance feature store.
-
Model Serving
- Cortex - Machine learning model serving infrastructure.
- Banana - Host your ML inference code on serverless GPUs and integrate it into your app with one line of code.
- Beam - Develop on serverless GPUs, deploy highly performant APIs, and rapidly prototype ML models.
- Quix - Serverless platform for processing data streams in real-time with machine learning models.
- Seldon - Take your ML projects from POC to production with maximum efficiency and minimal risk.
- TensorFlow Serving - Flexible, high-performance serving system for ML models, designed for production.
- Geniusrise - Host inference APIs, bulk inference and fine tune text, vision, audio and multi-modal models.
- Wallaroo.AI - A platform for deploying, serving, and optimizing ML models in both cloud and edge environments.
- BentoML - Open-source platform for high-performance ML model serving.
- LocalAI - Drop-in replacement REST API that’s compatible with OpenAI API specifications for inferencing.
- Streamlit - Lets you create apps for your ML projects with deceptively simple Python scripts.
- Opyrator - Turns your ML code into microservices with web API, interactive GUI, and more.
- Vespa - Store, search, organize and make machine-learned inferences over big data at serving time.
- Gradio - Create customizable UI components around your models.
- Triton Inference Server - Provides an optimized cloud and edge inferencing solution.
- Hydrosphere - Platform for deploying your Machine Learning to production.
- MLEM - Version and deploy your ML models following GitOps principles.
- PredictionIO - Event collection, deployment of algorithms, evaluation, querying predictive results via APIs.
- Cog - Open-source tool that lets you package ML models in a standard, production-ready container.
- TorchServe - A flexible and easy to use tool for serving PyTorch models.
- BudgetML - Deploy an ML inference service on a budget in less than 10 lines of code.
- Rune - Provides containers to encapsulate and deploy EdgeML pipelines and applications.
- GraphPipe - Machine learning model deployment made simple.
- Merlin - A platform for deploying and serving machine learning models.
- KFServing - Kubernetes custom resource definition for serving ML models on arbitrary frameworks.
-
AutoML
- H2O AutoML - Automates ML workflow, which includes automatic training and tuning of models.
- AutoSKLearn - Automated machine learning toolkit and a drop-in replacement for a scikit-learn estimator.
- FLAML - Finds accurate ML models automatically, efficiently and economically.
- NNI - An open source AutoML toolkit to automate the machine learning lifecycle.
- MindsDB - AI layer for databases that allows you to effortlessly develop, train and deploy ML models.
- AutoKeras - Aims to make machine learning accessible to everyone.
- Model Search - Framework that implements AutoML algorithms for model architecture search at scale.
- AutoPyTorch - Automatic architecture search and hyperparameter optimization for PyTorch.
- EvalML - A library that builds, optimizes, and evaluates ML pipelines using domain-specific functions.
- MLBox - A powerful automated machine learning Python library.
- AutoGluon - Automated machine learning for image, text, tabular, time-series, and multi-modal data.
-
Data Catalog
- Amundsen - Data discovery and metadata engine for improving the productivity when interacting with data.
- Apache Atlas - Provides open metadata management and governance capabilities to build a data catalog.
- OpenMetadata - A Single place to discover, collaborate and get your data right.
- Magda - A federated, open-source data catalog for all your big data and small data.
- CKAN - Open-source DMS (data management system) for powering data hubs and data portals.
- Metacat - Unified metadata exploration API service for Hive, RDS, Teradata, Redshift, S3 and Cassandra.
- DataHub - LinkedIn's generalized metadata search & discovery tool.
-
Articles
- A Tour of End-to-End Machine Learning Platforms
- Continuous Delivery for Machine Learning
- Machine Learning Operations (MLOps): Overview, Definition, and Architecture
- MLOps: Continuous delivery and automation pipelines in machine learning
- Rules of Machine Learning: Best Practices for ML Engineering
- The ML Test Score: A Rubric for ML Production Readiness and Technical Debt Reduction
- What Is MLOps?
- MLOps: Machine Learning as an Engineering Discipline
- MLOps Roadmap: A Complete MLOps Career Guide
- Practitioners guide to MLOps: A framework for continuous delivery and automation of machine learning
-
Books
- Beginning MLOps with MLFlow
- Building Machine Learning Pipelines
- Building Machine Learning Powered Applications
- Deep Learning in Production
- Designing Machine Learning Systems
- Engineering MLOps
- Implementing MLOps in the Enterprise
- Introducing MLOps
- Kubeflow for Machine Learning
- Kubeflow Operations Guide
- Machine Learning Design Patterns
- Machine Learning Engineering in Action
- ML Ops: Operationalizing Data Science
- MLOps Engineering at Scale
- Practical Deep Learning at Scale with MLflow
- Practical MLOps
- Production-Ready Applied Deep Learning
- Reliable Machine Learning
- The Machine Learning Solutions Architect Handbook
- MLOps Lifecycle Toolkit
- AI Governance
- AI Model Evaluation
-
Events
- apply() - The ML data engineering conference
- MLOps Conference - Keynotes and Panels
- MLOps World: Machine Learning in Production Conference
- NormConf - The Normcore Tech Conference
- Stanford MLSys Seminar Series
- AI Conference Deadline
-
Slack
-
Model Interpretability
- InterpretML - A toolkit to help understand models and enable responsible machine learning.
- Lucid - Collection of infrastructure and tools for research in neural network interpretability.
- LIME - Explaining the predictions of any machine learning classifier.
- SAGE - For calculating global feature importance using Shapley values.
- Captum - Model interpretability and understanding library for PyTorch.
- Alibi - Open-source Python library enabling ML model inspection and interpretation.
- ELI5 - Python package which helps to debug machine learning classifiers and explain their predictions.
- SHAP - A game theoretic approach to explain the output of any machine learning model.
-
CI/CD for Machine Learning
-
Data Enrichment
-
Feature Engineering
- TSFresh - Python library for automatic extraction of relevant features from time series.
- Featuretools - Python library for automated feature engineering.
- Feature Engine - Feature engineering package with scikit-learn-like functionality.
-
Knowledge Sharing
- Knowledge Repo - Knowledge sharing platform for data scientists and other technical professions.
- Kyso - One place for data insights so your entire team can learn from your data.
-
Drift Detection
- Alibi Detect - An open source Python library focused on outlier, adversarial and drift detection.
- Frouros - An open source Python library for drift detection in machine learning systems.
- TorchDrift - A data and concept drift library for PyTorch.
- ml3-drift - Drift detection algorithms seamlessly integrated with ML and AI frameworks.
-
Model Fairness and Privacy
- Fairlearn - A Python package to assess and improve fairness of machine learning models.
- AIF360 - A comprehensive set of fairness metrics for datasets and machine learning models.
- Opacus - A library that enables training PyTorch models with differential privacy.
- TensorFlow Privacy - Library for training machine learning models with privacy for training data.
-
Model Testing & Validation
- Deepchecks - Open-source package for validating ML models & data, with various checks and suites.
- Starwhale - An MLOps/LLMOps platform for model building, evaluation, and fine-tuning.
- Trubrics - Validate machine learning with data science and domain expert feedback.