https://github.com/ethicals7s/awesome-local-ai
152 open-source tools to run LLMs 100% locally – no cloud, no API keys, no censorship
https://github.com/ethicals7s/awesome-local-ai
List: awesome-local-ai
awesome-list crewai exllama fine-tuning inference llama-cpp local-ai local-llm machine-learning-ai multi-modal offline-ai ollama private-gpt quantization rag-agents self-hosted vllm voice
Last synced: 6 months ago
JSON representation
152 open-source tools to run LLMs 100% locally – no cloud, no API keys, no censorship
- Host: GitHub
- URL: https://github.com/ethicals7s/awesome-local-ai
- Owner: ethicals7s
- License: cc0-1.0
- Created: 2025-11-30T14:07:16.000Z (7 months ago)
- Default Branch: main
- Last Pushed: 2025-11-30T14:08:32.000Z (7 months ago)
- Last Synced: 2025-12-08T21:23:14.690Z (7 months ago)
- Topics: awesome-list, crewai, exllama, fine-tuning, inference, llama-cpp, local-ai, local-llm, machine-learning-ai, multi-modal, offline-ai, ollama, private-gpt, quantization, rag-agents, self-hosted, vllm, voice
- Homepage:
- Size: 8.79 KB
- Stars: 6
- Watchers: 0
- Forks: 0
- Open Issues: 0
-
Metadata Files:
- Readme: README.md
- License: LICENSE
Awesome Lists containing this project
- ultimate-awesome - awesome-local-ai - 152 open-source tools to run LLMs 100% locally – no cloud, no API keys, no censorship. (Other Lists / TeX Lists)
README
# Awesome Local AI
[](https://awesome.re)
[](https://star-history.com/#ethicals7s/awesome-local-ai&Date)
> Curated list of the best **open-source** tools to run, fine-tune, and build with LLMs **100% locally** in 2025–2026
> No cloud · No API keys · No censorship — **152 tools with descriptions and growing**
**Star this repo to keep the ultimate local-AI toolbox at hand → updated weekly**
## One-Click Runners & Installers (15)
- [Ollama](https://ollama.com) – One-command runner for Llama 3, Gemma, Mistral, etc.
- [LM Studio](https://lmstudio.ai) – Beautiful GUI for discovering and chatting with local models
- [GPT4All](https://gpt4all.io) – Fully offline chat with 100+ quantized models
- [Jan](https://jan.ai) – Open-source ChatGPT alternative that runs locally
- [Llama.cpp](https://github.com/ggerganov/llama.cpp) – High-performance C++ inference engine (GGUF)
- [text-generation-webui](https://github.com/oobabooga/text-generation-webui) – Feature-rich web UI with LoRAs and extensions
- [AnythingLLM](https://anythingllm.com) – Local RAG + document chat workspace
- [PrivateGPT](https://github.com/imartinez/privateGPT) – Offline Q&A over your documents
- [KoboldCpp](https://github.com/LostRuins/koboldcpp) – Single-file GGUF runner with KoboldAI API
- [Pinokio](https://pinokio.computer) – One-click browser installer for AI apps
- [LocalAI](https://localai.io) – OpenAI API drop-in replacement for local models
- [Faraday.dev](https://faraday.dev) – Desktop character chat with local models
- [Tabby](https://github.com/TabbyML/tabby) – Self-hosted GitHub Copilot alternative
- [Cortex](https://github.com/janhq/cortex.cpp) – Embeddable multi-engine runner
- [LMDeploy](https://github.com/InternLM/lmdeploy) – Model compression and deployment toolkit
## Desktop & Web UIs (22)
- [Open WebUI](https://openwebui.com) – Official gorgeous frontend for Ollama
- [LobeChat](https://lobechat.com) – Modern multi-model chat UI with local backends
- [Chainlit](https://chainlit.io) – Build conversational AI apps fast
- [Gradio](https://gradio.app) – Instant web demos for any model
- [LoLLMS WebUI](https://github.com/ParisNeo/lollms) – All-in-one local LLM interface
- [SillyTavern](https://github.com/SillyTavern/SillyTavern) – Advanced roleplay chat UI
- [LibreChat](https://librechat.ai) – Multi-provider chat with local support
- [Continue.dev](https://continue.dev) – Local VSCode Copilot
- [Aider](https://aider.chat) – Terminal pair programmer with git integration
- [Open Interpreter](https://openinterpreter.com) – Run code and control your computer locally
- [ComfyUI](https://github.com/comfyanonymous/ComfyUI) – Node-based Stable Diffusion workflow
- [InvokeAI](https://invoke.ai) – Creative image generation UI
- [Fooocus](https://github.com/lllyasviel/Fooocus) – Simplified high-quality image generation
- [Draw Things](https://drawthings.ai) – macOS/iOS Stable Diffusion app
- [Msty](https://msty.app) – Minimalist local chat app
- [LlamaGPT](https://github.com/getumbrel/llama-gpt) – Self-hosted chat on Umbrel
- [Text Generation UI](https://github.com/oobabooga/text-generation-webui) – Versatile text gen web UI
- [Chatbot UI](https://github.com/mckaywrigley/chatbot-ui) – Clean self-hosted ChatGPT-like interface
- [HuggingChat](https://huggingface.co/chat) – Self-hosted version of HF chat
- [Taskyon](https://github.com/Xyntopia/taskyon) – Vue3-based local-first chat UI
- [QA-Pilot](https://github.com/reid41/QA-Pilot) – Interactive repo/file chat
- [Shell-Pilot](https://github.com/reid41/shell-pilot) – LLM-powered shell scripting
## Agent Frameworks (fully local) (25)
- [CrewAI](https://crewai.com) – Multi-agent orchestration framework
- [AutoGen](https://microsoft.github.io/autogen/) – Microsoft conversational multi-agent system
- [LangGraph](https://github.com/langchain-ai/langgraph) – Stateful multi-actor applications
- [BabyAGI](https://github.com/yoheinakajima/babyagi) – Task-driven autonomous agent
- [Auto-GPT](https://github.com/Significant-Gravitas/Auto-GPT) – Experimental autonomous GPT agent
- [GPT Engineer](https://github.com/AntonOsika/gpt-engineer) – Generate codebases from specifications
- [MetaGPT](https://github.com/geekan/MetaGPT) – Multi-agent software company simulation
- [SuperAGI](https://github.com/TransformerOptimus/SuperAGI) – Infrastructure for autonomous agents
- [Devon](https://github.com/entropy-research/Devon) – Open-source AI software engineer
- [Open Interpreter](https://github.com/KillianLucas/open-interpreter) – Natural language code execution
- [Aider](https://github.com/paul-gauthier/aider) – Git-aware pair programmer
- [Langflow](https://github.com/langflow-ai/langflow) – Visual LLM app builder
- [Flowise](https://flowiseai.com) – Drag-and-drop LLM flows (self-hosted)
- [Dify](https://dify.ai) – Open-source LLM app builder (self-hosted)
- [Haystack](https://haystack.deepset.ai) – End-to-end NLP pipelines
- [LlamaIndex](https://www.llamaindex.ai) – Data framework for LLM applications
- [Bisheng](https://github.com/dataelement/bisheng) – Low-code agent builder
- [Taskweaver](https://github.com/microsoft/TaskWeaver) – Code-first agent framework
- [XAgent](https://github.com/OpenBMB/XAgent) – Autonomous agent with tools
- [ChatDev](https://github.com/OpenBMB/ChatDev) – Collaborative software development agents
- [GodMode](https://github.com/smol-ai/godmode) – Prompt chaining for complex tasks
- [SmolAgents](https://github.com/smol-ai/agents) – Lightweight agent framework
- [Camel-AI](https://github.com/camel-ai/camel) – Communicative agents for role-playing
- [AgentGPT](https://github.com/reworkd/AgentGPT) – Browser-based autonomous agents (local mode)
- [PrivateGPT](https://github.com/imartinez/privateGPT) – Local agent for document querying
## RAG & Vector Databases (14)
- [Chroma](https://www.trychroma.com) – Lightweight embedded vector database
- [Weaviate](https://weaviate.io) – Open-source vector search engine
- [Qdrant](https://qdrant.tech) – High-performance filtered vector search
- [LanceDB](https://lancedb.com) – Serverless vector DB on Parquet
- [Milvus](https://milvus.io) – Scalable open-source vector database
- [Faiss](https://github.com/facebookresearch/faiss) – Facebook similarity search library
- [Pinecone](https://pinecone.io) – Self-hosted vector database
- [Vespa](https://vespa.ai) – Big data serving with vector search
- [Typesense](https://typesense.org) – Typo-tolerant search with vectors
- [Redis Vector Library](https://redis.io) – In-memory vector similarity
- [PGVector](https://github.com/pgvector/pgvector) – Postgres vector extension
- [DuckDB](https://duckdb.org) – In-process OLAP with vector support
- [SurrealDB](https://surrealdb.com) – Multi-model DB with vector indexing
- [Zilliz](https://zilliz.com) – Cloud-native vector platform (open components)
## Fine-tuning & Quantization (18)
- [Axolotl](https://github.com/OpenAccess-AI-Collective/axolotl) – YAML-driven LoRA/QLoRA fine-tuning
- [Unsloth](https://github.com/unslothai/unsloth) – 2× faster fine-tuning on consumer GPUs
- [LLaMA-Factory](https://github.com/hiyouga/LLaMA-Factory) – Web UI for efficient fine-tuning
- [AutoGPTQ](https://github.com/AutoGPTQ/AutoGPTQ) – GPTQ/AWQ quantization toolkit
- [PEFT](https://github.com/huggingface/peft) – Parameter-efficient fine-tuning methods
- [TRL](https://github.com/huggingface/trl) – RLHF, DPO, PPO training
- [Lit-GPT](https://github.com/Lightning-AI/lit-gpt) – Lightweight fine-tuning with PyTorch Lightning
- [OpenRLHF](https://github.com/OpenRLHF/OpenRLHF) – Scalable RLHF framework
- [DeepSpeed](https://github.com/microsoft/DeepSpeed) – Deep learning optimization library
- [Colossal-AI](https://github.com/hpcaitech/ColossalAI) – Large model training system
- [Megatron-LM](https://github.com/NVIDIA/Megatron-LM) – Efficient transformer training
- [BMTrain](https://github.com/OpenBMB/BMTrain) – Communication-efficient training
- [FSDP](https://pytorch.org/docs/stable/fsdp.html) – Fully Sharded Data Parallel
- [LoRAX](https://github.com/predibase/lorax) – Multi-LoRA serving
- [BitsAndBytes](https://github.com/TimDettmers/bitsandbytes) – 8-bit optimizers and quantization
- [GPTQ-for-LLaMa](https://github.com/qwopqwop200/GPTQ-for-LLaMa) – 4-bit LLaMA quantization
- [ExLlama](https://github.com/turboderp/exllama) – Fast LLaMA inference with quantization
- [ExLlamaV2](https://github.com/turboderp/exllamav2) – Optimized quantized inference
## Voice & Multimodal (local) (16)
- [Whisper.cpp](https://github.com/ggerganov/whisper.cpp) – Fast local speech-to-text
- [Coqui TTS](https://github.com/coqui-ai/TTS) – Neural text-to-speech synthesis
- [OpenVoice](https://github.com/myshell-ai/OpenVoice) – Instant voice cloning
- [Silero Models](https://github.com/snakers4/silero-models) – Pre-trained TTS/STT models
- [LLaVA](https://llava-vl.github.io) – Vision + text multimodal chat
- [Moondream2](https://github.com/vikhyat/moondream) – Compact vision-language model
- [Bark](https://github.com/suno-ai/bark) – Text-to-audio with voice cloning
- [Audiocraft](https://github.com/facebookresearch/audiocraft) – Music and audio generation
- [RVC WebUI](https://github.com/RVC-Project/Retrieval-based-Voice-Conversion-WebUI) – Voice conversion
- [Tortoise TTS](https://github.com/neonbjb/tortoise-tts) – High-quality multi-voice TTS
- [VALL-E X](https://github.com/Plachtaa/VALL-E-X) – Zero-shot TTS from short audio
- [Piper TTS](https://github.com/rhasspy/piper) – Fast neural TTS
- [OpenTTS](https://github.com/synesthesiam/opentts) – Multi-speaker TTS
- [Kosmos-2](https://github.com/microsoft/unilm/tree/master/kosmos-2) – Grounded image-text model
- [ImageBind](https://github.com/facebookresearch/ImageBind) – Multimodal embedding across 6 modalities
- [CLIP](https://github.com/openai/CLIP) – Contrastive language-image pretraining
## Inference Engines & Backends (22)
- [vLLM](https://github.com/vllm-project/vllm) – High-throughput serving with PagedAttention
- [TensorRT-LLM](https://github.com/NVIDIA/TensorRT-LLM) – NVIDIA-optimized low-latency inference
- [ExLlamaV2](https://github.com/turboderp/exllamav2) – Blazing-fast quantized inference
- [SGLang](https://github.com/sgl-project/sglang) – Structured generation language
- [MLX](https://github.com/ml-explore/mlx) – Apple Silicon-native framework
- [MLC LLM](https://github.com/mlc-ai/mlc-llm) – Universal deployment engine
- [llama.cpp](https://github.com/ggerganov/llama.cpp) – Lightweight C++ inference
- [ONNX Runtime](https://onnxruntime.ai) – Cross-platform ML accelerator
- [OpenVINO](https://github.com/openvinotoolkit/openvino) – Intel-optimized inference
- [TVM](https://tvm.apache.org) – End-to-end optimizing compiler
- [GGML](https://github.com/ggerganov/ggml) – Tensor library for ML
- [CTranslate2](https://github.com/OpenNMT/CTranslate2) – Fast inference engine
- [FasterTransformer](https://github.com/NVIDIA/FasterTransformer) – NVIDIA transformer decoder
- [TurboTransformers](https://github.com/Tencent/TurboTransformers) – Kernel fusion inference
- [LightLLM](https://github.com/ModelTC/lightllm) – Unified inference framework
- [DeepSpeed-Inference](https://github.com/microsoft/DeepSpeed) – Optimized transformer kernels
- [FlexFlow](https://github.com/flexflow/FlexFlow) – Distributed deep learning
- [Ray Serve](https://docs.ray.io/en/latest/serve) – Scalable model serving
- [BentoML](https://bentoml.com) – ML model serving framework
- [Triton Inference Server](https://github.com/triton-inference-server/server) – Multi-framework serving
- [OpenPPL](https://github.com/openppl-public/ppl.nn) – Neural network inference engine
- [llama.rs](https://github.com/rustformers/llama-rs) – Rust bindings for llama.cpp
## Contribute
Found something missing? → Open a PR! Let’s get to 200+ together
Last updated: December 1, 2025
Made with ❤️ by [@ethicals7s](https://github.com/ethicals7s)