llmops
🚀 The Ultimate Curated List of LLMOps Tools, Frameworks, and Resources - A comprehensive collection of the best tools for Large Language Model Operations
https://github.com/pmady/llmops
-
Development Tools
-
Notebooks & Workspaces
- Google Colab
- Gradient
- Jupyter
-
IDEs & Code Assistants
- GitHub Copilot
- Cody
- Continue - Open-source AI code assistant
- Tabby - Self-hosted AI coding assistant
- Cursor - AI-first code editor
-
-
LLMOps Platforms
-
Resources & Learning
-
Papers & Research
-
Documentation & Guides
- Full Stack LLM Bootcamp - Comprehensive LLM course
- OpenAI Cookbook - Examples and guides for OpenAI API
- LLM University - Cohere's LLM learning resources
-
Awesome Lists
- Awesome ChatGPT Prompts - Prompt examples
- Awesome AI Agents - AI agent resources
- Awesome LLM - Curated list of LLM resources
- Awesome LangChain - LangChain resources
-
-
Prompt Engineering
-
Tools & Platforms
-
Resources
-
-
Vector Search & RAG
-
Resources
- Pinecone
- pgvector
- Weaviate
- Milvus - Cloud-native vector database
- Qdrant
- Chroma - AI-native embedding database
- FAISS
- LanceDB - Developer-friendly vector database
-
-
Models
-
Large Language Models
- Falcon - High-performance open models (license: Apache 2.0)
- Gemma
- Phi
- Vicuna - Fine-tuned from LLaMA (license: Apache 2.0)
- Alpaca - Instruction-following model (license: Apache 2.0)
- BELLE (license: Apache 2.0)
- ChatGLM (license: Apache 2.0)
- Qwen (license: Apache 2.0)
- DeepSeek - Cost-effective open-source LLMs (license: MIT)
- Bloom (license: RAIL)
- LLaMA (license: Research)
- Mistral - High-performance open models from Mistral AI (license: Apache 2.0)
-
Audio Foundation Models
- Whisper
-
Multimodal Models
-
-
Inference & Serving
-
Inference Platforms
-
Model Serving Frameworks
- BentoML
- TensorFlow Serving
- Infinity
- Triton Inference Server
- TorchServe
- Mosec
- Jina
-
Inference Engines
- vLLM - High-throughput and memory-efficient inference engine
- TensorRT-LLM
- LMDeploy
- LoRAX - Multi-LoRA inference server
- CTranslate2
- Cortex.cpp
- MInference - Long-context LLM inference
- DeepSpeed-MII - Low-latency inference powered by DeepSpeed
- llama.cpp
-
-
What's New
-
🆕 Recently Added (January 2026)
- Modal - Serverless platform for AI/ML workloads
- PromptFoo - Test and evaluate LLM outputs
- Composio - Integration platform for AI agents
- Skypilot - Run LLMs on any cloud with one command
- Traceloop - OpenTelemetry for LLMs
- Ragas - Evaluation framework for RAG pipelines
- LangWatch - LLM monitoring and analytics
- Phidata - Build AI assistants with memory and knowledge
-
-
Acknowledgments
-
What to Contribute
-
-
Training & Fine-Tuning
-
Fine-Tuning Tools
- PEFT - Parameter-Efficient Fine-Tuning
- LLaMA-Factory - Fine-tuning framework
- TRL
- Unsloth - Fine-tuning
- LitGPT - Pretrain, fine-tune, deploy LLMs
-
Experiment Tracking
- MLflow - Open-source ML lifecycle platform
- TensorBoard
- Aim - Easy-to-use experiment tracker
- Weights & Biases
-
Training Frameworks
- Colossal-AI
- Megatron-LM - Large-scale transformer training
- Accelerate
- DeepSpeed
- PyTorch FSDP
-
-
Orchestration
-
Application Frameworks
- LangChain
- LlamaIndex
- Langfuse - Open-source LLM engineering platform
- Semantic Kernel
- Haystack - End-to-end NLP framework
- Neurolink
-
Agent Frameworks
- AutoGPT
- AutoGen - Multi-agent conversation framework
- LangGraph - Multi-actor applications
- AgentMark - Type-safe Markdown-based agents
- CrewAI
-
Workflow Management
-
-
Data Management
-
Resources
- Pachyderm
- Delta Lake
- LakeFS
- DVC
-
-
Optimization & Performance
-
Resources
- ONNX Runtime - Cross-platform ML accelerator
- TVM
- GPTQ-for-LLaMa - 4-bit quantization for LLaMA
- BitsAndBytes - 8-bit optimizers and quantization
- AutoGPTQ - Easy-to-use LLM quantization
-
-
Observability & Monitoring
-
Resources
- PostHog
- Evidently
- DeepEval
- Phoenix
- Helicone - Open-source LLM observability
- OpenLIT - OpenTelemetry-native LLM observability
- Lunary
-
-
Security & Safety
-
Resources
- NeMo Guardrails
- Guardrails AI
- Rebuff
- LangKit
- LLM Guard
-
-