Ecosyste.ms: Awesome
An open API service indexing awesome lists of open source software.
Awesome-LLM-Productization
Awesome-LLM-Productization: a curated list of tools/tricks/news/regulations about AI and Large Language Model (LLM) productization
https://github.com/oscinis-com/Awesome-LLM-Productization
-
Models and Tools
-
Open LLM Models
- ChatGLM-6B - an open bilingual language model based on the General Language Model (GLM) framework, with 6.2 billion parameters. (Note from the repo: a small LM to start with, so that you can get a taste of prompting and finetuning. You can fine-tune it on a commercial-grade graphics card with only 8GB, without any other financial commitment. You can use it as you would use BERT.)
- MiniGPT-4 - Enhancing Vision-language Understanding with Advanced Large Language Models
- LLaVA - Visual instruction tuning towards large language and vision models with GPT-4 level capabilities
- VisualGLM-6B - VisualGLM-6B is an open-source, multi-modal dialog language model that supports images, Chinese, and English.
- OpenLLM Leaderboard - https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard. (Note from the repo: a good place to find a list of available open LLMs; be careful about their commercial terms)
-
Full LLM Lifecycle
- EasyLM - EasyLM is a one-stop solution for pre-training, finetuning, evaluating and serving LLMs in JAX/Flax. (Note from the repo: see the details of [Jax](https://github.com/google/jax) and [Flax](https://github.com/google/flax))
- Jina - Jina lets you build multimodal AI services and pipelines that communicate via gRPC, HTTP and WebSockets, then scale them up and deploy to production
-
LLM Prompt Management
- Pezzo - Open-source, developer-first LLMOps platform designed to streamline prompt design, version management, instant delivery, collaboration, troubleshooting, observability and more.
-
LLM Finetuning
- trl - a full-stack library providing a set of tools to train transformer language models and stable diffusion models with reinforcement learning;
- P-tuning v2 - An optimized prompt tuning strategy achieving comparable performance to fine-tuning on small/medium-sized models and sequence tagging challenges;
- QLoRA - An efficient finetuning approach that reduces memory usage (Note from the repo: good for smaller dataset finetuning);
- LLM QLoRA - Fine-tuning LLMs using QLoRA
- Prompt2Model - Generate Deployable Models from Instructions
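
As a rough illustration of why the QLoRA entries above describe the approach as memory-efficient, the following sketch works through the arithmetic. It assumes the usual description of QLoRA (a frozen base model stored in 4-bit precision, plus small trainable low-rank adapters); the model size, layer shape and rank are hypothetical values chosen for illustration, and the estimate ignores activations, optimizer state and quantization metadata.

```python
def weight_memory_gb(num_params, bits_per_param):
    """Approximate memory needed to store the weights alone, in gigabytes."""
    return num_params * bits_per_param / 8 / 1e9


def full_matrix_params(d_out, d_in):
    """Trainable parameters when a d_out x d_in weight matrix is updated directly."""
    return d_out * d_in


def low_rank_adapter_params(d_out, d_in, rank):
    """Trainable parameters for a low-rank update B @ A,
    where B is d_out x rank and A is rank x d_in."""
    return rank * (d_out + d_in)


# Hypothetical 7-billion-parameter model (not a figure from the list above).
num_params = 7_000_000_000
fp16_gb = weight_memory_gb(num_params, 16)  # 16-bit weights
int4_gb = weight_memory_gb(num_params, 4)   # 4-bit quantized weights

# Hypothetical 4096 x 4096 projection layer with adapter rank 8.
full = full_matrix_params(4096, 4096)
adapter = low_rank_adapter_params(4096, 4096, rank=8)

print(f"fp16 weights: {fp16_gb:.1f} GB, 4-bit weights: {int4_gb:.1f} GB")
print(f"trainable per layer: {adapter} of {full} ({adapter / full:.2%})")
```

Under these assumptions the frozen weights shrink by a factor of four and each adapted layer trains well under one percent of its original parameters.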
-
Embeddings
- clip-as-service - a low-latency high-scalability service for embedding images and text. It can be easily integrated as a microservice into neural search solutions (Python based, Apache 2);
- text-embeddings-inference - a toolkit for deploying and serving open source text embeddings and sequence classification models, enabling high-performance extraction for the most popular models, including FlagEmbedding, Ember, GTE and E5 (Rust based; Apache 2);
- infinity - a high-throughput, low-latency REST API for serving vector embeddings, supporting a wide range of text-embedding models and frameworks (Python based, MIT);
-
Vector Store
- ElasticSearch - a distributed, RESTful search engine optimized for speed and relevance on production-scale workloads (Java based)
- pgvector - Open-source vector similarity search for Postgres (C based)
- Weaviate - an open source vector database that stores both objects and vectors (Go based)
- Milvus - an open-source vector database built to power embedding similarity search and AI applications (Go based)
- gensim - a Python library for topic modelling, document indexing and similarity retrieval with large corpora (Python based)
- txtai - All-in-one open-source embeddings database for semantic search, LLM orchestration and language model workflows (Python based)
- Qdrant - High-performance, massive-scale vector database for the next generation of AI (Rust based)
- Marqo - Vector search for humans, based on OpenSearch (Python based)
- Vald - A Highly Scalable Distributed Vector Search Engine (Go based)
- - search, recommendation and personalization need to select a subset of data in a large corpus (Java based)
- OpenSearch - Open source distributed and RESTful search engine (Java based)
- ChromaDB - open-source embedding database (Python based - in-memory only at the moment)
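
The stores listed above all answer the same basic question: given a query embedding, which stored embeddings are closest to it? A minimal sketch of that operation, using an exhaustive cosine-similarity scan in plain Python, is shown below. It is not the API of any tool in the list; production systems typically replace the scan with an approximate index such as HNSW. The function names, document IDs and 3-dimensional vectors are hypothetical.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def top_k(query, corpus, k=2):
    """Return the k (doc_id, score) pairs most similar to the query, by exhaustive scan."""
    scored = [(doc_id, cosine_similarity(query, vec)) for doc_id, vec in corpus.items()]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:k]


# Hypothetical toy embeddings; real embedding models emit hundreds of dimensions.
corpus = {
    "doc-a": [1.0, 0.0, 0.0],
    "doc-b": [0.9, 0.1, 0.0],
    "doc-c": [0.0, 1.0, 0.0],
}

results = top_k([1.0, 0.0, 0.0], corpus, k=2)
print(results)
```

The scan costs one similarity computation per stored vector, which is why dedicated vector stores matter once a corpus grows beyond what fits comfortably in a single pass.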
-
LLM Deployment
- Ray Serve - Ray Serve is a scalable model serving library for building online inference APIs (Note from the repo: from the Ray project)
- OpenLLM from BentoML - an open-source platform designed to facilitate the deployment and operation of large language models (LLMs) in real-world applications.
- Langfuse - Open source observability and analytics for LLM applications
- vLLM - A high-throughput and memory-efficient inference and serving engine for LLMs
- mlc-llm - Enable everyone to develop, optimize and deploy AI models natively on everyone's devices.
- llm-awq - Efficient and accurate low-bit weight quantization (INT3/4) for LLMs, supporting instruction-tuned models and multi-modal LMs.
- streaming-llm - deploy LLMs for infinite-length inputs without sacrificing efficiency and performance.
- llama2.c - run LLMs on minimum hardware
- TensorRT-LLM - an easy-to-use Python API to define Large Language Models (LLMs) and build TensorRT engines that contain state-of-the-art optimizations to perform inference efficiently on NVIDIA GPUs.
- text-generation-inference - Large Language Model Text Generation Inference
-
LLM Boilerplate
-
LLM Monitoring
- OpenObserve - OpenObserve is a cloud native observability platform built specifically for logs, metrics, traces and analytics designed to work at petabyte scale.
- AuditNLG - an open-source library that can help reduce the risks associated with using generative AI systems for language. The library supports three aspects of trust detection and improvement: Factualness, Safety, and Constraint.
-
Use Cases
- MetaGPT - The Multi-Agent Framework: Given one line Requirement, return PRD, Design, Tasks, Repo;
- Doctor Dignity - a Large Language Model that can pass the US Medical Licensing Exam
-
General MLOps Tools
- Awesome MLOps - A curated list of awesome MLOps tools
- MLflow - A Machine Learning Lifecycle Platform
- dvc - data and model versioning tool
- dbt - dbt enables data analysts and engineers to transform their data using the same practices that software engineers use to build applications.
- ml-ops - Some good articles on machine learning operations
-
-
The Survey Paper
Keywords
- llm (13)
- machine-learning (11)
- ai (9)
- search-engine (7)
- gpt (7)
- deep-learning (6)
- mlops (6)
- python (6)
- vector-search (6)
- data-science (5)
- information-retrieval (5)
- natural-language-processing (5)
- neural-search (5)
- llama (5)
- nearest-neighbor-search (5)
- llmops (5)
- openai (4)
- vector-search-engine (4)
- vector-database (4)
- pytorch (4)
- ml (4)
- image-search (4)
- java (4)
- approximate-nearest-neighbor-search (4)
- hnsw (4)
- nlp (4)
- language-model (4)
- llama2 (4)
- large-language-models (4)
- observability (3)
- analytics (3)
- monitoring (3)
- similarity-search (3)
- langchain (3)
- semantic-search (3)
- embeddings (3)
- search (3)
- neural-network (3)
- chatgpt (3)
- gpt-4 (3)
- cloud-native (3)
- transformer (3)
- llm-serving (3)
- word2vec (2)
- distributed (2)
- serving (2)
- inference (2)
- tensorflow (2)
- self-hosted (2)
- chatbot (2)