Ecosyste.ms: Awesome

An open API service indexing awesome lists of open source software.

Awesome-LLMOps

🎉 An awesome, curated list of the best LLMOps tools.
https://github.com/InftyAI/Awesome-LLMOps

  • LLMOps

    • BentoML - […]-Grade AI Applications
    • Dify
    • FastChat - An open platform for training, serving, and evaluating large language models. Release repo for Vicuna and Chatbot Arena.
    • Flowise
    • Haystack - 🔍 LLM orchestration framework to build customizable, production-ready LLM applications. Connect components (models, vector DBs, file converters) to pipelines or agents that can interact with your data. With advanced retrieval methods, it's best suited for building RAG, question answering, semantic search or conversational agent chatbots.
    • LangChain - ⚡ Building applications with LLMs through composability ⚡
    • LiteLLM - […] Azure, OpenAI, Cohere, Anthropic, Replicate. Manages input/output translation.
    • LLaMA-Factory - Easy-to-use LLM fine-tuning framework (LLaMA, BLOOM, Mistral, Baichuan, Qwen, ChatGLM)
    • LlamaIndex - LlamaIndex is a data framework for your LLM applications
    • Mem0
    • Open WebUI - User-friendly WebUI for LLMs (formerly Ollama WebUI)
    • PrivateGPT - Interact with your documents using the power of GPT, 100% privately, no data leaks
    • Swift - SWIFT supports training (pre-training, fine-tuning, RLHF), inference, evaluation and deployment of 350+ LLMs and 90+ MLLMs (multimodal large models).
  • MLOps

    • Flyte
    • Kubeflow
    • Metaflow - […] real-life data science projects with ease!
    • MLflow
    • Seldon-Core - An MLOps framework to package, deploy, monitor and manage thousands of production machine learning models. (tag: cloud)
    • ZenML - ZenML 🐙: Build portable, production-ready MLOps pipelines. <https://zenml.io>
  • Inference

    • DeepSpeed-MII - MII makes low-latency and high-throughput inference possible, powered by DeepSpeed.
    • Inference - […] easy-to-use, production-ready inference server for computer vision supporting deployment of many popular model architectures and fine-tuned models. (tag: vision)
    • ipex-llm - Accelerate local LLM inference and finetuning (LLaMA, Mistral, ChatGLM, Qwen, Baichuan, Mixtral, Gemma, Phi, MiniCPM, etc.) on Intel CPU and GPU (e.g., local PC with iGPU, discrete GPU such as Arc, Flex and Max); seamlessly integrate with llama.cpp, Ollama, HuggingFace, LangChain, LlamaIndex, GraphRAG, DeepSpeed, vLLM, FastChat, Axolotl, etc. (tag: device)
    • llmaz - […] state-of-the-art LLMs on Kubernetes.
    • LMDeploy
    • MaxText
    • llama.cpp
    • MInference - […] long-context LLMs' inference, approximate and dynamic sparse calculate the attention, which reduces inference latency by up to 10x for pre-filling on an A100 while maintaining accuracy.
    • MLC LLM - Universal LLM Deployment Engine with ML Compilation
    • Nanoflow - […]-oriented high-performance serving framework for LLMs
    • Ollama
    • OpenLLM
    • Ratchet - […] cross-platform browser ML framework. (tag: browser)
    • RayServe - Ray is a unified framework for scaling AI and Python applications. Ray consists of a core distributed runtime and a set of AI Libraries for accelerating ML workloads.
    • RouteLLM - A framework for serving and evaluating LLM routers - save LLM costs without compromising quality. (tag: cost)
    • SGLang - SGLang is a structured generation language designed for large language models (LLMs). It makes your interaction with models faster and more controllable.
    • transformers.js - State-of-the-art Machine Learning for the web. Run 🤗 Transformers directly in your browser, with no need for a server! (tag: browser)
    • MLServer
    • Triton Inference Server - The Triton Inference Server provides an optimized cloud and edge inferencing solution.
    • OpenVINO - […] open-source toolkit for optimizing and deploying AI inference
    • TensorRT-LLM - TensorRT-LLM provides users with an easy-to-use Python API to define Large Language Models (LLMs) and build TensorRT engines that contain state-of-the-art optimizations to perform inference efficiently on NVIDIA GPUs.
    • vLLM - A high-throughput and memory-efficient inference and serving engine for LLMs
    • web-llm - High-performance in-browser LLM inference engine (tag: browser)
    • zml
    • Text Generation Inference - Large Language Model Text Generation Inference
  • Training

    • ColossalAI
    • Ludwig - Low-code framework for building custom LLMs, neural networks, and other AI models
    • MLX - MLX: An array framework for Apple silicon
  • FineTune

    • Axolotl - Go ahead and axolotl questions
    • torchtune - […]-PyTorch Library for LLM Fine-tuning
    • unsloth - […]-5x faster with 80% less memory
  • Agent

    • AutoGPT - An experimental open-source attempt to make GPT-4 fully autonomous.
    • MetaGPT - […]-Agent Framework: Given one line Requirement, return PRD, Design, Tasks, Repo
    • PydanticAI - Agent Framework / shim to use Pydantic with LLMs
    • Swarm - […]-agent systems. Managed by OpenAI Solutions team. Experimental framework.
    • XAgent
  • Evaluation

    • AgentBench
    • lm-evaluation-harness - A framework for few-shot evaluation of language models.
    • LongBench - […] (tag: long-context)
  • DB Store

    • chroma - The AI-native open-source embedding database (tag: vector)
    • deeplake - […] real-time to PyTorch/TensorFlow. <https://activeloop.ai>
    • Faiss
    • milvus - A cloud-native vector database, storage for next generation AI applications (tags: cloud, vector)
    • weaviate - […] open-source vector database that stores both objects and vectors, allowing for the combination of vector search with structured filtering with the fault tolerance and scalability of a cloud-native database. (tags: cloud, vector)
  • Observation

    • OpenLLMetry - Open-source observability for your LLM application, based on OpenTelemetry
    • Helicone AI - Open-source LangSmith alternative for logging, monitoring, and debugging AI applications.
    • phoenix - ML Observability in a Notebook - Uncover Insights, Surface Problems, Monitor, and Fine Tune your Generative LLM, CV and Tabular Models
    • wandb
  • Alignment

    • OpenRLHF - Easy-to-use, Scalable and High-performance RLHF Framework (70B+ PPO Full Tuning & Iterative DPO & LoRA & Mixtral)
    • Safe-RLHF - Safe RLHF: Constrained Value Alignment via Safe Reinforcement Learning from Human Feedback
  • Outputs

    • Instructor
    • Outlines - Structured Text Generation