https://github.com/ethicals7s/awesome-local-ai

152 open-source tools to run LLMs 100% locally – no cloud, no API keys, no censorship
https://github.com/ethicals7s/awesome-local-ai

List: awesome-local-ai

awesome-list crewai exllama fine-tuning inference llama-cpp local-ai local-llm machine-learning-ai multi-modal offline-ai ollama private-gpt quantization rag-agents self-hosted vllm voice

Last synced: 6 months ago
JSON representation

152 open-source tools to run LLMs 100% locally – no cloud, no API keys, no censorship

Host: GitHub
URL: https://github.com/ethicals7s/awesome-local-ai
Owner: ethicals7s
License: cc0-1.0
Created: 2025-11-30T14:07:16.000Z (7 months ago)
Default Branch: main
Last Pushed: 2025-11-30T14:08:32.000Z (7 months ago)
Last Synced: 2025-12-08T21:23:14.690Z (7 months ago)
Topics: awesome-list, crewai, exllama, fine-tuning, inference, llama-cpp, local-ai, local-llm, machine-learning-ai, multi-modal, offline-ai, ollama, private-gpt, quantization, rag-agents, self-hosted, vllm, voice
Homepage:
Size: 8.79 KB
Stars: 6
Watchers: 0
Forks: 0
Open Issues: 0
Metadata Files:
- Readme: README.md
- License: LICENSE

Awesome Lists containing this project

ultimate-awesome - awesome-local-ai - 152 open-source tools to run LLMs 100% locally – no cloud, no API keys, no censorship. (Other Lists / TeX Lists)

README

          # Awesome Local AI

[![Awesome](https://awesome.re/badge.svg)](https://awesome.re)

[![Star History](https://api.star-history.com/svg?repos=ethicals7s/awesome-local-ai&type=Date)](https://star-history.com/#ethicals7s/awesome-local-ai&Date)

> Curated list of the best **open-source** tools to run, fine-tune, and build with LLMs **100% locally** in 2025–2026  

> No cloud · No API keys · No censorship — **152 tools with descriptions and growing**

**Star this repo to keep the ultimate local-AI toolbox at hand → updated weekly**

## One-Click Runners & Installers (15)

- [Ollama](https://ollama.com) – One-command runner for Llama 3, Gemma, Mistral, etc.

- [LM Studio](https://lmstudio.ai) – Beautiful GUI for discovering and chatting with local models

- [GPT4All](https://gpt4all.io) – Fully offline chat with 100+ quantized models

- [Jan](https://jan.ai) – Open-source ChatGPT alternative that runs locally

- [Llama.cpp](https://github.com/ggerganov/llama.cpp) – High-performance C++ inference engine (GGUF)

- [text-generation-webui](https://github.com/oobabooga/text-generation-webui) – Feature-rich web UI with LoRAs and extensions

- [AnythingLLM](https://anythingllm.com) – Local RAG + document chat workspace

- [PrivateGPT](https://github.com/imartinez/privateGPT) – Offline Q&A over your documents

- [KoboldCpp](https://github.com/LostRuins/koboldcpp) – Single-file GGUF runner with KoboldAI API

- [Pinokio](https://pinokio.computer) – One-click browser installer for AI apps

- [LocalAI](https://localai.io) – OpenAI API drop-in replacement for local models

- [Faraday.dev](https://faraday.dev) – Desktop character chat with local models

- [Tabby](https://github.com/TabbyML/tabby) – Self-hosted GitHub Copilot alternative

- [Cortex](https://github.com/janhq/cortex.cpp) – Embeddable multi-engine runner

- [LMDeploy](https://github.com/InternLM/lmdeploy) – Model compression and deployment toolkit

## Desktop & Web UIs (22)

- [Open WebUI](https://openwebui.com) – Official gorgeous frontend for Ollama

- [LobeChat](https://lobechat.com) – Modern multi-model chat UI with local backends

- [Chainlit](https://chainlit.io) – Build conversational AI apps fast

- [Gradio](https://gradio.app) – Instant web demos for any model

- [LoLLMS WebUI](https://github.com/ParisNeo/lollms) – All-in-one local LLM interface

- [SillyTavern](https://github.com/SillyTavern/SillyTavern) – Advanced roleplay chat UI

- [LibreChat](https://librechat.ai) – Multi-provider chat with local support

- [Continue.dev](https://continue.dev) – Local VSCode Copilot

- [Aider](https://aider.chat) – Terminal pair programmer with git integration

- [Open Interpreter](https://openinterpreter.com) – Run code and control your computer locally

- [ComfyUI](https://github.com/comfyanonymous/ComfyUI) – Node-based Stable Diffusion workflow

- [InvokeAI](https://invoke.ai) – Creative image generation UI

- [Fooocus](https://github.com/lllyasviel/Fooocus) – Simplified high-quality image generation

- [Draw Things](https://drawthings.ai) – macOS/iOS Stable Diffusion app

- [Msty](https://msty.app) – Minimalist local chat app

- [LlamaGPT](https://github.com/getumbrel/llama-gpt) – Self-hosted chat on Umbrel

- [Text Generation UI](https://github.com/oobabooga/text-generation-webui) – Versatile text gen web UI

- [Chatbot UI](https://github.com/mckaywrigley/chatbot-ui) – Clean self-hosted ChatGPT-like interface

- [HuggingChat](https://huggingface.co/chat) – Self-hosted version of HF chat

- [Taskyon](https://github.com/Xyntopia/taskyon) – Vue3-based local-first chat UI

- [QA-Pilot](https://github.com/reid41/QA-Pilot) – Interactive repo/file chat

- [Shell-Pilot](https://github.com/reid41/shell-pilot) – LLM-powered shell scripting

## Agent Frameworks (fully local) (25)

- [CrewAI](https://crewai.com) – Multi-agent orchestration framework

- [AutoGen](https://microsoft.github.io/autogen/) – Microsoft conversational multi-agent system

- [LangGraph](https://github.com/langchain-ai/langgraph) – Stateful multi-actor applications

- [BabyAGI](https://github.com/yoheinakajima/babyagi) – Task-driven autonomous agent

- [Auto-GPT](https://github.com/Significant-Gravitas/Auto-GPT) – Experimental autonomous GPT agent

- [GPT Engineer](https://github.com/AntonOsika/gpt-engineer) – Generate codebases from specifications

- [MetaGPT](https://github.com/geekan/MetaGPT) – Multi-agent software company simulation

- [SuperAGI](https://github.com/TransformerOptimus/SuperAGI) – Infrastructure for autonomous agents

- [Devon](https://github.com/entropy-research/Devon) – Open-source AI software engineer

- [Open Interpreter](https://github.com/KillianLucas/open-interpreter) – Natural language code execution

- [Aider](https://github.com/paul-gauthier/aider) – Git-aware pair programmer

- [Langflow](https://github.com/langflow-ai/langflow) – Visual LLM app builder

- [Flowise](https://flowiseai.com) – Drag-and-drop LLM flows (self-hosted)

- [Dify](https://dify.ai) – Open-source LLM app builder (self-hosted)

- [Haystack](https://haystack.deepset.ai) – End-to-end NLP pipelines

- [LlamaIndex](https://www.llamaindex.ai) – Data framework for LLM applications

- [Bisheng](https://github.com/dataelement/bisheng) – Low-code agent builder

- [Taskweaver](https://github.com/microsoft/TaskWeaver) – Code-first agent framework

- [XAgent](https://github.com/OpenBMB/XAgent) – Autonomous agent with tools

- [ChatDev](https://github.com/OpenBMB/ChatDev) – Collaborative software development agents

- [GodMode](https://github.com/smol-ai/godmode) – Prompt chaining for complex tasks

- [SmolAgents](https://github.com/smol-ai/agents) – Lightweight agent framework

- [Camel-AI](https://github.com/camel-ai/camel) – Communicative agents for role-playing

- [AgentGPT](https://github.com/reworkd/AgentGPT) – Browser-based autonomous agents (local mode)

- [PrivateGPT](https://github.com/imartinez/privateGPT) – Local agent for document querying

## RAG & Vector Databases (14)

- [Chroma](https://www.trychroma.com) – Lightweight embedded vector database

- [Weaviate](https://weaviate.io) – Open-source vector search engine

- [Qdrant](https://qdrant.tech) – High-performance filtered vector search

- [LanceDB](https://lancedb.com) – Serverless vector DB on Parquet

- [Milvus](https://milvus.io) – Scalable open-source vector database

- [Faiss](https://github.com/facebookresearch/faiss) – Facebook similarity search library

- [Pinecone](https://pinecone.io) – Self-hosted vector database

- [Vespa](https://vespa.ai) – Big data serving with vector search

- [Typesense](https://typesense.org) – Typo-tolerant search with vectors

- [Redis Vector Library](https://redis.io) – In-memory vector similarity

- [PGVector](https://github.com/pgvector/pgvector) – Postgres vector extension

- [DuckDB](https://duckdb.org) – In-process OLAP with vector support

- [SurrealDB](https://surrealdb.com) – Multi-model DB with vector indexing

- [Zilliz](https://zilliz.com) – Cloud-native vector platform (open components)

## Fine-tuning & Quantization (18)

- [Axolotl](https://github.com/OpenAccess-AI-Collective/axolotl) – YAML-driven LoRA/QLoRA fine-tuning

- [Unsloth](https://github.com/unslothai/unsloth) – 2× faster fine-tuning on consumer GPUs

- [LLaMA-Factory](https://github.com/hiyouga/LLaMA-Factory) – Web UI for efficient fine-tuning

- [AutoGPTQ](https://github.com/AutoGPTQ/AutoGPTQ) – GPTQ/AWQ quantization toolkit

- [PEFT](https://github.com/huggingface/peft) – Parameter-efficient fine-tuning methods

- [TRL](https://github.com/huggingface/trl) – RLHF, DPO, PPO training

- [Lit-GPT](https://github.com/Lightning-AI/lit-gpt) – Lightweight fine-tuning with PyTorch Lightning

- [OpenRLHF](https://github.com/OpenRLHF/OpenRLHF) – Scalable RLHF framework

- [DeepSpeed](https://github.com/microsoft/DeepSpeed) – Deep learning optimization library

- [Colossal-AI](https://github.com/hpcaitech/ColossalAI) – Large model training system

- [Megatron-LM](https://github.com/NVIDIA/Megatron-LM) – Efficient transformer training

- [BMTrain](https://github.com/OpenBMB/BMTrain) – Communication-efficient training

- [FSDP](https://pytorch.org/docs/stable/fsdp.html) – Fully Sharded Data Parallel

- [LoRAX](https://github.com/predibase/lorax) – Multi-LoRA serving

- [BitsAndBytes](https://github.com/TimDettmers/bitsandbytes) – 8-bit optimizers and quantization

- [GPTQ-for-LLaMa](https://github.com/qwopqwop200/GPTQ-for-LLaMa) – 4-bit LLaMA quantization

- [ExLlama](https://github.com/turboderp/exllama) – Fast LLaMA inference with quantization

- [ExLlamaV2](https://github.com/turboderp/exllamav2) – Optimized quantized inference

## Voice & Multimodal (local) (16)

- [Whisper.cpp](https://github.com/ggerganov/whisper.cpp) – Fast local speech-to-text

- [Coqui TTS](https://github.com/coqui-ai/TTS) – Neural text-to-speech synthesis

- [OpenVoice](https://github.com/myshell-ai/OpenVoice) – Instant voice cloning

- [Silero Models](https://github.com/snakers4/silero-models) – Pre-trained TTS/STT models

- [LLaVA](https://llava-vl.github.io) – Vision + text multimodal chat

- [Moondream2](https://github.com/vikhyat/moondream) – Compact vision-language model

- [Bark](https://github.com/suno-ai/bark) – Text-to-audio with voice cloning

- [Audiocraft](https://github.com/facebookresearch/audiocraft) – Music and audio generation

- [RVC WebUI](https://github.com/RVC-Project/Retrieval-based-Voice-Conversion-WebUI) – Voice conversion

- [Tortoise TTS](https://github.com/neonbjb/tortoise-tts) – High-quality multi-voice TTS

- [VALL-E X](https://github.com/Plachtaa/VALL-E-X) – Zero-shot TTS from short audio

- [Piper TTS](https://github.com/rhasspy/piper) – Fast neural TTS

- [OpenTTS](https://github.com/synesthesiam/opentts) – Multi-speaker TTS

- [Kosmos-2](https://github.com/microsoft/unilm/tree/master/kosmos-2) – Grounded image-text model

- [ImageBind](https://github.com/facebookresearch/ImageBind) – Multimodal embedding across 6 modalities

- [CLIP](https://github.com/openai/CLIP) – Contrastive language-image pretraining

## Inference Engines & Backends (22)

- [vLLM](https://github.com/vllm-project/vllm) – High-throughput serving with PagedAttention

- [TensorRT-LLM](https://github.com/NVIDIA/TensorRT-LLM) – NVIDIA-optimized low-latency inference

- [ExLlamaV2](https://github.com/turboderp/exllamav2) – Blazing-fast quantized inference

- [SGLang](https://github.com/sgl-project/sglang) – Structured generation language

- [MLX](https://github.com/ml-explore/mlx) – Apple Silicon-native framework

- [MLC LLM](https://github.com/mlc-ai/mlc-llm) – Universal deployment engine

- [llama.cpp](https://github.com/ggerganov/llama.cpp) – Lightweight C++ inference

- [ONNX Runtime](https://onnxruntime.ai) – Cross-platform ML accelerator

- [OpenVINO](https://github.com/openvinotoolkit/openvino) – Intel-optimized inference

- [TVM](https://tvm.apache.org) – End-to-end optimizing compiler

- [GGML](https://github.com/ggerganov/ggml) – Tensor library for ML

- [CTranslate2](https://github.com/OpenNMT/CTranslate2) – Fast inference engine

- [FasterTransformer](https://github.com/NVIDIA/FasterTransformer) – NVIDIA transformer decoder

- [TurboTransformers](https://github.com/Tencent/TurboTransformers) – Kernel fusion inference

- [LightLLM](https://github.com/ModelTC/lightllm) – Unified inference framework

- [DeepSpeed-Inference](https://github.com/microsoft/DeepSpeed) – Optimized transformer kernels

- [FlexFlow](https://github.com/flexflow/FlexFlow) – Distributed deep learning

- [Ray Serve](https://docs.ray.io/en/latest/serve) – Scalable model serving

- [BentoML](https://bentoml.com) – ML model serving framework

- [Triton Inference Server](https://github.com/triton-inference-server/server) – Multi-framework serving

- [OpenPPL](https://github.com/openppl-public/ppl.nn) – Neural network inference engine

- [llama.rs](https://github.com/rustformers/llama-rs) – Rust bindings for llama.cpp

## Contribute

Found something missing? → Open a PR! Let’s get to 200+ together

Last updated: December 1, 2025  

Made with ❤️ by [@ethicals7s](https://github.com/ethicals7s)

ecosyste.ms

Data

Tools

Indexes

Applications

Experiments

Awesome

https://github.com/ethicals7s/awesome-local-ai

Awesome Lists containing this project

README