Ecosyste.ms: Awesome

An open API service indexing awesome lists of open source software.


awesome-local-ai

An awesome repository of local AI tools
https://github.com/homebrewltd/awesome-local-ai

Last synced: 1 day ago

  • Lists

    • awesome-local-llms - Table of open-source local LLM inference projects with their GitHub metrics.
    • llama-police - A list of Open Source LLM Tools from [Chip Huyen](https://huyenchip.com)
  • Inference Engine

    | Name | Description | Model formats | CPU/GPU | UI | Language | Type |
    |---|---|---|---|---|---|---|
    | llama.cpp | Inference of the LLaMA model in pure C/C++ | GGML/GGUF | Both | ❌ | C/C++ | Text-Gen |
    | Nitro | 3MB inference engine embeddable in your apps; uses llama.cpp and more | Both | Both | ❌ | | Text-Gen |
    | koboldcpp | A simple one-file way to run various GGML models with KoboldAI's UI | GGML | Both | ✅ | C/C++ | Text-Gen |
    | LoLLMS | Lord of Large Language Models web user interface | Nearly ALL | Both | ✅ | Python | Text-Gen |
    | ExLlama | A more memory-efficient rewrite of the HF Transformers implementation of Llama | AutoGPTQ/GPTQ | GPU | ✅ | Python/C++ | Text-Gen |
    | vLLM | A fast and easy-to-use library for LLM inference and serving | GGML/GGUF | Both | ❌ | Python | Text-Gen |
    | SGLang | 3-5x higher throughput than vLLM (control flow, RadixAttention, KV-cache reuse) | Safetensor/AWQ/GPTQ | GPU | ❌ | Python | Text-Gen |
    | LMDeploy | A toolkit for compressing, deploying, and serving LLMs | PyTorch/TurboMind | Both | ❌ | Python/C++ | Text-Gen |
    | TensorRT-LLM | Efficient inference on NVIDIA GPUs | Python/C++ runtimes | Both | ❌ | Python/C++ | Text-Gen |
    | CTransformers | Python bindings for Transformer models implemented in C/C++ using the GGML library | GGML/GPTQ | Both | ❌ | C/C++ | Text-Gen |
    | llama-cpp-python | Python bindings for llama.cpp | GGUF | Both | ❌ | Python | Text-Gen |
    | llama2.rs | A fast llama2 decoder in pure Rust | GPTQ | CPU | ❌ | Rust | Text-Gen |
    | ExLlamaV2 | A fast inference library for running LLMs locally on modern consumer-class GPUs | GPTQ/EXL2 | GPU | ❌ | Python/C++ | Text-Gen |
    | LoRAX | Multi-LoRA inference server that scales to thousands of fine-tuned LLMs | Safetensor/AWQ/GPTQ | GPU | ❌ | Python/Rust | Text-Gen |
    | text-generation-inference | Inference serving toolbox with optimized kernels for each LLM architecture | Safetensors/AWQ/GPTQ | Both | ❌ | Python/Rust | Text-Gen |
    | ollama | CLI and local server; uses llama.cpp | Both | Both | ❌ | | Text-Gen |
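Several of the engines above (for example vLLM, LocalAI, and ollama) can expose an OpenAI-compatible HTTP endpoint, so one client works against any of them. A minimal sketch of building and posting a chat request with only the standard library; the base URL `http://localhost:8000/v1` and the model name `local-model` are assumptions, not values any specific engine guarantees:

```python
import json
import urllib.request

def build_chat_request(model: str, prompt: str, temperature: float = 0.7) -> dict:
    """Build an OpenAI-style chat-completions payload."""
    return {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
        "temperature": temperature,
    }

def send_chat_request(base_url: str, payload: dict) -> dict:
    """POST the payload to an OpenAI-compatible /chat/completions endpoint."""
    req = urllib.request.Request(
        f"{base_url}/chat/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        return json.loads(resp.read())

payload = build_chat_request("local-model", "Say hello in one word.")
print(json.dumps(payload, indent=2))
# send_chat_request("http://localhost:8000/v1", payload)  # requires a running server
```

Because the request shape is shared, switching engines usually means changing only the base URL and model name.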
  • Inference UI

    • oobabooga - A Gradio web UI for Large Language Models.
    • LLMFarm - LLaMA and other large language models on iOS and macOS, offline, using the GGML library.
    • LLM as a Chatbot Service - Serve LLMs as a chatbot service.
    • Automatic1111 - Stable Diffusion web UI.
    • ComfyUI - A powerful and modular stable diffusion GUI with a graph/nodes interface.
    • Wordflow - Run, share, and discover AI prompts in your browser.
    • petals - Run LLMs at home, BitTorrent-style. Fine-tuning and inference up to 10x faster than offloading.
    • ChatUI - Open source codebase powering the HuggingChat app.
    • AI-Mask - Browser extension to provide model inference to web apps. Backed by web-llm and transformers.js
    • everything-rag - Interact with (virtually) any LLM on the Hugging Face Hub with an easy-to-use, 100% local Gradio chatbot.
    • LmScript - UI for SGLang and Outlines
    • LM Studio - Discover, download, and run local LLMs.
    • LlamaChat - Chat with LLaMA, Alpaca, and GPT4All models, all running locally on your Mac.
    • FuLLMetalAi - Fullmetal.Ai is a distributed network of self-hosted Large Language Models (LLMs).
    • LocalAI - LocalAI is a drop-in replacement REST API that’s compatible with OpenAI API specifications for local inferencing.
    • faradav - Chat with AI Characters Offline, Runs locally, Zero-configuration.
  • Platforms / full solutions

    • BentoML - BentoML is a framework for building reliable, scalable, and cost-efficient AI applications.
    • H2OAI - H2OGPT, the fastest, most accurate AI cloud platform.
    • Predibase - Serverless LoRA Fine-Tuning and Serving for LLMs.
  • Developer tools

    • gpt4all - A chatbot trained on a massive collection of clean assistant data including code, stories and dialogue.
    • LiteLLM - Call all LLM APIs using the OpenAI format.
    • PoplarML - PoplarML enables the deployment of production-ready, scalable ML systems with minimal engineering effort.
    • Datature - The All-in-One Platform to Build and Deploy Vision AI
    • Gooey.AI - Create Your Own No Code AI Workflows.
    • Mixo.io - AI website builder
    • GitFluence - The AI-driven solution that helps you quickly find the right command. Get started with Git Command Generator today and save time.
    • Haystack - A framework for building NLP applications (e.g. agents, semantic search, question-answering) with language models.
    • LMQL - LMQL is a query language for large language models.
    • LlamaIndex - A data framework for building LLM applications over external data.
    • Phoenix - Open-source tool for ML observability that runs in your notebook environment, by Arize. Monitor and fine tune LLM, CV and tabular models.
    • trypromptly - Create AI Apps & Chatbots in Minutes
    • BentoML - BentoML is the platform for software engineers to build AI products.
    • Jan Framework - At its core, Jan is a **cross-platform, local-first and AI native** application framework that can be used to build anything.
  • User Tools

    • llmcord.py - Discord LLM Chatbot - Talk to LLMs with your friends!
  • Agents

    • BabyAGI - Baby AGI is an autonomous AI agent developed using Python that operates through OpenAI and Pinecone APIs.
    • GPT Prompt Engineer - Automated prompt engineering. It generates, tests, and ranks prompts to find the best ones.
    • MetaGPT - The Multi-Agent Framework: Given one line requirement, return PRD, design, tasks, repo.
    • SuperAGI - Open-source AGI infrastructure.
    • AgentGPT - Assemble, configure, and deploy autonomous AI Agents in your browser.
    • HyperWrite - HyperWrite helps you work smarter, faster, and with ease.
    • AgentRunner.ai - Leverage the power of GPT-4 to create and train fully autonomous AI agents.
    • Auto-GPT - An experimental open-source attempt to make GPT-4 fully autonomous.
    • GPT Engineer - Specify what you want it to build, the AI asks for clarification, and then builds it.
    • Open Interpreter - Let language models run code. Have your agent write and execute code.
    • CrewAI - Cutting-edge framework for orchestrating role-playing, autonomous AI agents.
    • AI Agents - AI agents that power up your productivity.
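The task-driven pattern behind agents like BabyAGI can be sketched as a loop: pop a task from a queue, "execute" it with an LLM call, then ask the model to propose follow-up tasks. In this illustration `fake_llm` is a stub standing in for a real model API, and its canned replies are an assumption made so the loop terminates:

```python
from collections import deque

def fake_llm(prompt: str) -> str:
    """Stand-in for a real LLM call; returns canned text."""
    if prompt.startswith("next task after:"):
        # Only propose a follow-up task for the initial objective.
        return "summarize findings" if "research topic" in prompt else ""
    return f"done: {prompt}"

def run_agent(objective: str, max_steps: int = 5) -> list:
    """Minimal task-queue agent loop: execute, then enqueue model-proposed tasks."""
    tasks = deque([objective])
    log = []
    while tasks and len(log) < max_steps:
        task = tasks.popleft()
        log.append(fake_llm(task))                      # execute the current task
        follow_up = fake_llm(f"next task after: {task}")
        if follow_up:
            tasks.append(follow_up)                     # enqueue the proposed task
    return log

print(run_agent("research topic"))
# -> ['done: research topic', 'done: summarize findings']
```

Real systems add memory (e.g. a vector store) and task prioritization on top of this same queue-driven skeleton.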
  • Training

    • FastChat - An open platform for training, serving, and evaluating large language models.
    • DeepSpeed - DeepSpeed is a deep learning optimization library that makes distributed training and inference easy, efficient, and effective.
    • BMTrain - Efficient Training for Big Models.
    • Alpa - Alpa is a system for training and serving large-scale neural networks.
    • Megatron-LM - Ongoing research training transformer models at scale.
    • Ludwig - Low-code framework for building custom LLMs, neural networks, and other AI models.
    • Nanotron - Minimalistic large language model 3D-parallelism training.
    • TRL - Language model alignment with reinforcement learning.
    • PEFT - Parameter-efficient fine-tuning (LoRA, DoRA, model merging, and more).
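The LoRA technique implemented by PEFT freezes a base weight matrix W and learns a low-rank update ΔW = B·A, where B is d×r and A is r×k with r much smaller than d and k, so only B and A are trained. A pure-Python sketch of merging the update into the weights (the tiny example matrices are illustrative, not from any real model):

```python
def matmul(X, Y):
    """Plain-Python matrix multiply over lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*Y)] for row in X]

def lora_merge(W, B, A, alpha=1.0):
    """Return W' = W + alpha * (B @ A), the LoRA-merged weights."""
    delta = matmul(B, A)  # rank-r update, shape d x k
    return [[w + alpha * d for w, d in zip(w_row, d_row)]
            for w_row, d_row in zip(W, delta)]

W = [[1.0, 0.0], [0.0, 1.0]]   # frozen 2x2 base weights
B = [[1.0], [0.0]]             # 2x1 factor, rank r = 1
A = [[0.5, 0.5]]               # 1x2 factor
print(lora_merge(W, B, A))
# -> [[1.5, 0.5], [0.0, 1.0]]
```

Because only B and A carry trainable parameters, storage per fine-tune drops from d·k to r·(d + k) values, which is why servers like LoRAX can host many adapters over one base model.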
  • LLM Leaderboard

  • Research

  • Community