llmops

🚀 The Ultimate Curated List of LLMOps Tools, Frameworks, and Resources - A comprehensive collection of the best tools for Large Language Model Operations
https://github.com/pmady/llmops

Last synced: 14 days ago

Acknowledgments
- What to Contribute
Data Management
- Resources
  - DVC
  - LakeFS
  - Pachyderm
  - Delta Lake
Development Tools
- IDEs & Code Assistants
  - GitHub Copilot
  - Continue - Open-source AI code assistant
  - Cody
  - Tabby - Self-hosted AI coding assistant
- Notebooks & Workspaces
  - Jupyter
  - Google Colab
  - Gradient
Inference & Serving
- Inference Engines
  - vLLM - High-throughput and memory-efficient inference engine
  - llama.cpp
  - TensorRT-LLM
  - LMDeploy
  - DeepSpeed-MII - Low-latency inference powered by DeepSpeed
  - CTranslate2
  - Cortex.cpp
  - LoRAX - Multi-LoRA inference server
  - MInference - Long-context LLM inference
  - ipex-llm
- Inference Platforms
  - Ollama
  - LocalAI - OpenAI-compatible API for local models
  - LM Studio
  - GPUStack
  - OpenLLM
  - Ray Serve
- Model Serving Frameworks
  - BentoML
  - Triton Inference Server
  - TorchServe
  - TensorFlow Serving
  - Jina
  - Mosec
  - Infinity
LLMOps Platforms
- Notebooks & Workspaces
  - Agenta
  - Dify
  - Pezzo - Open-source LLMOps platform
  - Humanloop
Models
- Audio Foundation Models
  - Whisper
- Large Language Models
  - LLaMA (license: Research)
  - Mistral - High-performance open models from Mistral AI (license: Apache 2.0)
  - Gemma
  - Qwen (license: Apache 2.0)
  - DeepSeek - Cost-effective open-source LLMs (license: MIT)
  - Phi
  - ChatGLM (license: Apache 2.0)
  - Alpaca - Instruction-following model (license: Apache 2.0)
  - Vicuna - Chat model built by fine-tuning LLaMA (license: Apache 2.0)
  - BELLE (license: Apache 2.0)
  - Falcon - High-performance open models (license: Apache 2.0)
  - Bloom (license: RAIL)
- Multimodal Models
  - LLaVA
  - MiniCPM-V
  - Qwen-VL - Vision-language model from Alibaba
Observability & Monitoring
- Resources
  - Phoenix
  - Helicone - Open-source LLM observability
  - OpenLIT - OpenTelemetry-native LLM observability
  - Evidently
  - DeepEval
  - PostHog
Optimization & Performance
- Resources
  - ONNX Runtime - Cross-platform ML accelerator
  - TVM
  - BitsAndBytes - 8-bit optimizers and quantization
  - AutoGPTQ - Easy-to-use LLM quantization
  - GPTQ-for-LLaMa - 4-bit quantization for LLaMA
Orchestration
- Agent Frameworks
  - AutoGPT
  - CrewAI
  - AutoGen - Multi-agent conversation framework
  - LangGraph - Multi-actor applications
  - AgentMark - Type-safe Markdown-based agents
- Application Frameworks
  - LangChain
  - LlamaIndex
  - Haystack - End-to-end NLP framework
  - Semantic Kernel
  - Langfuse - Open-source LLM engineering platform
  - Neurolink
- Workflow Management
  - Prefect
  - Airflow
  - Flyte - Kubernetes-native workflow automation
  - Flowise
Prompt Engineering
- Resources
  - Prompt Leaking Guide
  - Prefix-Tuning Paper
- Tools & Platforms
Resources & Learning
- Awesome Lists
  - Awesome LLM - Curated list of LLM resources
  - Awesome ChatGPT Prompts - Prompt examples
  - Awesome AI Agents - AI agent resources
  - Awesome LangChain - LangChain resources
- Documentation & Guides
  - OpenAI Cookbook - Examples and guides for OpenAI API
  - LLM University - Cohere's LLM learning resources
  - Full Stack LLM Bootcamp - Comprehensive LLM course
- Papers & Research
Security & Safety
- Resources
  - NeMo Guardrails
  - Guardrails AI
  - LLM Guard
  - Rebuff
  - LangKit
Training & Fine-Tuning
- Experiment Tracking
  - Weights & Biases
  - MLflow - Open-source ML lifecycle platform
  - TensorBoard
  - Aim - Easy-to-use experiment tracker
- Fine-Tuning Tools
  - LLaMA-Factory - Fine-tuning framework
  - PEFT - Parameter-Efficient Fine-Tuning
  - Unsloth - Fine-tuning
  - TRL
  - LitGPT - Fine-tune and deploy LLMs
  - Axolotl - Fine-tuning
- Training Frameworks
  - DeepSpeed
  - Megatron-LM - Large-scale transformer training
  - Colossal-AI
  - Accelerate
Vector Search & RAG
- Resources
  - Chroma - AI-native embedding database
  - Weaviate
  - Qdrant
  - Milvus - Cloud-native vector database
  - Pinecone
  - FAISS
  - pgvector
  - LanceDB - Developer-friendly vector database
What's New
- 🆕 Recently Added (January 2026)
  - Skypilot - Run LLMs on any cloud with one command
  - Modal - Serverless platform for AI/ML workloads
  - Ragas - Evaluation framework for RAG pipelines
  - PromptFoo - Test and evaluate LLM outputs
  - Phidata - Build AI assistants with memory and knowledge
  - Composio - Integration platform for AI agents
  - Traceloop - OpenTelemetry for LLMs
  - LangWatch - LLM monitoring and analytics

Programming Languages

Python 53 TypeScript 13 C++ 7 Go 6 Jupyter Notebook 4 Rust 3 HTML 2 Makefile 1 Java 1 Scala 1
