Ecosyste.ms: Awesome

An open API service indexing awesome lists of open source software.

Awesome-LLM-Productization

Awesome-LLM-Productization: a curated list of tools/tricks/news/regulations about AI and Large Language Model (LLM) productization
https://github.com/oscinis-com/Awesome-LLM-Productization

  • Anti-hype LLM reading list
  • ChatGLM-6B - an open bilingual language model based on the General Language Model (GLM) framework, with 6.2 billion parameters. (Note from the repo: a small LM to start with so that you can get a taste of prompting and fine-tuning. You can fine-tune it on a commercial-grade graphics card with only 8GB, without any other financial commitment. You can use it as you would BERT.)
  • OpenLLM Leaderboard - https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard. (Note from the repo: a good place to find a list of available open LLMs; be careful about their commercial terms.)
  • MiniGPT-4 - Enhancing Vision-language Understanding with Advanced Large Language Models
  • LLaVA - Visual instruction tuning towards large language and vision models with GPT-4 level capabilities
  • VisualGLM-6B - VisualGLM-6B is an open-source, multi-modal dialog language model that supports images, Chinese, and English.
  • EasyLM - EasyLM is a one-stop solution for pre-training, finetuning, evaluating and serving LLMs in JAX/Flax. (Note from the repo: see [Jax](https://github.com/google/jax) and [Flax](https://github.com/google/flax) for details.)
  • Jina - Jina lets you build multimodal AI services and pipelines that communicate via gRPC, HTTP and WebSockets, then scale them up and deploy to production
  • Pezzo - Open-source, developer-first LLMOps platform designed to streamline prompt design, version management, instant delivery, collaboration, troubleshooting, observability and more.
  • trl - a full-stack library providing a set of tools to train transformer language models and stable diffusion models with Reinforcement Learning
  • P-tuning v2 - An optimized prompt tuning strategy achieving comparable performance to fine-tuning on small/medium-sized models and sequence tagging challenges
  • QLoRA - An efficient finetuning approach that reduces memory usage (Note from the repo: good for smaller dataset finetuning)
  • LLM QLoRA - Fine-tuning LLMs using QLoRA
  • Prompt2Model - Generate Deployable Models from Instructions
  • ElasticSearch - a distributed, RESTful search engine optimized for speed and relevance on production-scale workloads (Java based)
  • pgvector - Open-source vector similarity search for Postgres (C based)
  • Weaviate - an open source vector database that stores both objects and vectors (Go based)
  • Milvus - an open-source vector database built to power embedding similarity search and AI applications (Go based)
  • gensim - a Python library for topic modelling, document indexing and similarity retrieval with large corpora (Python based)
  • txtai - All-in-one open-source embeddings database for semantic search, LLM orchestration and language model workflows (Python based)
  • Qdrant - High-performance, massive-scale vector database for the next generation of AI (Rust based)
  • Marqo - Vector search for humans based on OpenSearch (Python based)
  • Vald - A Highly Scalable Distributed Vector Search Engine (Go based)
  • - search, recommendation and personalization need to select a subset of data in a large corpus (Java based)
  • OpenSearch - Open source distributed and RESTful search engine (Java based)
  • ChromaDB - open-source embedding database (Python based - in-memory only at the moment)
  • Ray Serve - Ray Serve is a scalable model serving library for building online inference APIs (Note from the repo: from the Ray project)
  • OpenLLM from BentoML - an open-source platform designed to facilitate the deployment and operation of large language models (LLMs) in real-world applications.
  • Langfuse - Open source observability and analytics for LLM applications
  • text-generation-inference - A Rust, Python and gRPC server for text generation inference. Used in production at HuggingFace to power Hugging Chat, the Inference API and Inference Endpoint
  • vLLM - A high-throughput and memory-efficient inference and serving engine for LLMs
  • mlc-llm - Enable everyone to develop, optimize and deploy AI models natively on everyone's devices.
  • llm-awq - Efficient and accurate low-bit weight quantization (INT3/4) for LLMs, supporting instruction-tuned models and multi-modal LMs.
  • streaming-llm - deploy LLMs for infinite-length inputs without sacrificing efficiency and performance.
  • llama2.c - run LLMs on minimal hardware
  • Zep - fast, scalable building blocks for production LLM apps
  • LlamaGPT - A self-hosted, offline, ChatGPT-like chatbot.
  • Ollama - Get up and running with Llama 2 and other large language models locally
  • OpenObserve - OpenObserve is a cloud native observability platform built specifically for logs, metrics, traces and analytics designed to work at petabyte scale.
  • AuditNLG - an open-source library that can help reduce the risks associated with using generative AI systems for language. The library supports three aspects of trust detection and improvement: Factualness, Safety, and Constraint.
  • MetaGPT - The Multi-Agent Framework: given a one-line requirement, return PRD, Design, Tasks, Repo
  • Doctor Dignity - a Large Language Model that can pass the US Medical Licensing Exam
  • Awesome MLOps - A curated list of awesome MLOps tools
  • MLflow - A Machine Learning Lifecycle Platform
  • dbt - dbt enables data analysts and engineers to transform their data using the same practices that software engineers use to build applications.
  • dvc - data and model versioning tool
  • ml-ops - Some good articles on machine learning operations
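
Several entries in the list above (pgvector, Weaviate, Milvus, Qdrant, Vald, ChromaDB) are described as vector similarity search engines. As a rough illustration of the core operation they perform, here is a minimal brute-force sketch in plain Python. It is not the API or algorithm of any listed project; the names `cosine_similarity`, `top_k`, and the toy `corpus` are hypothetical, and production systems typically use approximate nearest-neighbor indexes rather than a full scan.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def top_k(query, corpus, k=2):
    """Score every stored vector against the query and return the k best (id, score) pairs."""
    scored = [(doc_id, cosine_similarity(query, vec)) for doc_id, vec in corpus.items()]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:k]


# Toy embeddings, made up for illustration only.
corpus = {
    "doc_a": [1.0, 0.0, 0.0],
    "doc_b": [0.9, 0.1, 0.0],
    "doc_c": [0.0, 1.0, 0.0],
}

results = top_k([1.0, 0.0, 0.0], corpus, k=2)
for doc_id, score in results:
    print(doc_id, round(score, 4))
```

The full scan costs one similarity computation per stored vector, which is the cost that dedicated vector databases aim to avoid at scale.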