Ecosyste.ms: Awesome
An open API service indexing awesome lists of open source software.
Projects in Awesome Lists tagged with vllm
A curated list of projects in awesome lists tagged with vllm.
https://github.com/meta-llama/llama-recipes
Scripts for fine-tuning Meta Llama3 with composable FSDP & PEFT methods to cover single/multi-node GPUs. Supports default & custom datasets for applications such as summarization and Q&A. Supports a number of candidate inference solutions, such as HF TGI and vLLM, for local or cloud deployment. Includes demo apps that showcase Meta Llama3 for WhatsApp & Messenger.
ai finetuning langchain llama llama2 llm machine-learning python pytorch vllm
Last synced: 26 Sep 2024
https://github.com/facebookresearch/llama-recipes
Scripts for fine-tuning Meta Llama3 with composable FSDP & PEFT methods to cover single/multi-node GPUs. Supports default & custom datasets for applications such as summarization and Q&A. Supports a number of candidate inference solutions, such as HF TGI and vLLM, for local or cloud deployment. Includes demo apps that showcase Meta Llama3 for WhatsApp & Messenger.
ai finetuning langchain llama llama2 llm machine-learning python pytorch vllm
Last synced: 05 Sep 2024
https://github.com/xorbitsai/inference
Replace OpenAI GPT with another LLM in your app by changing a single line of code. Xinference gives you the freedom to use any LLM you need. With Xinference, you're empowered to run inference with any open-source language models, speech recognition models, and multimodal models, whether in the cloud, on-premises, or even on your laptop.
artificial-intelligence chatglm deployment flan-t5 gemma ggml glm4 inference llama llama3 llamacpp llm machine-learning mistral openai-api pytorch qwen vllm whisper wizardlm
Last synced: 27 Sep 2024
https://github.com/openrlhf/openrlhf
An Easy-to-use, Scalable and High-performance RLHF Framework (70B+ PPO Full Tuning & Iterative DPO & LoRA & Mixtral)
deepspeed large-language-models raylib reinforcement-learning reinforcement-learning-from-human-feedback transformers vllm
Last synced: 30 Sep 2024
https://github.com/OpenRLHF/OpenRLHF
An Easy-to-use, Scalable and High-performance RLHF Framework (70B+ PPO Full Tuning & Iterative DPO & LoRA & Mixtral)
deepspeed large-language-models raylib reinforcement-learning reinforcement-learning-from-human-feedback transformers vllm
Last synced: 01 Aug 2024
https://github.com/bricks-cloud/BricksLLM
🔒 Enterprise-grade API gateway that helps you monitor and impose cost or rate limits per API key. Get fine-grained access control and monitoring per user, application, or environment. Supports OpenAI, Azure OpenAI, Anthropic, vLLM, and open-source LLMs.
ai anthropic api artificial-intelligence azure docker generative-ai golang gpt llm open-source openai postgresql privacy rest-api security self-hosted vllm ycombinator
Last synced: 01 Aug 2024
https://github.com/prometheus-eval/prometheus-eval
Evaluate your LLM's response with Prometheus and GPT4 💯
evaluation gpt4 litellm llm llm-as-a-judge llm-as-evaluator llmops python vllm
Last synced: 01 Aug 2024
https://github.com/varunshenoy/super-json-mode
Low latency JSON generation using LLMs ⚡️
huggingface-transformers llm openai vllm
Last synced: 03 Aug 2024
https://github.com/substratusai/kubeai
Private Open AI on Kubernetes
ai autoscaler faster-whisper inference-operator k8s kubernetes llm ollama ollama-operator openai-api vllm vllm-operator whisper
Last synced: 27 Sep 2024
https://github.com/av/harbor
Effortlessly run LLM backends, APIs, frontends, and services with one command.
ai cli cmdh docker docker-compose exllamav2 huggingface llama llamacpp llm mistral ollama open-interpreter self-hosted text-generation-inference tool tools vllm
Last synced: 27 Sep 2024
https://github.com/runpod-workers/worker-vllm
The RunPod worker template for serving our large language model endpoints. Powered by vLLM.
language-model llm runpod vllm
Last synced: 01 Aug 2024
https://github.com/containers/ramalama
The goal of ramalama is to make working with AI boring.
ai containers inference-server llamacpp llms local podman vllm
Last synced: 01 Oct 2024
https://github.com/Trainy-ai/llm-atc
Fine-tuning and serving LLMs on any cloud
Last synced: 02 Aug 2024
https://github.com/microsoft/vidur
A large-scale simulation framework for LLM inference
inference llm simulation transformer vllm
Last synced: 03 Aug 2024
https://github.com/navinkumarmnk/ai-learning-platform
AI-Learning-Platform is an LLM-RAG pipeline that acts as a guide and answers questions. Deployed on-premises on IBM ppc64le architecture. Uses vLLM for model inference and Qdrant with LangChain for the RAG pipeline. The server is written in Django, with PostgreSQL and Cassandra as the SQL and NoSQL databases.
cassandra django langchain llm postgresql ppc64le qdrant ray-distributed vllm
Last synced: 29 Sep 2024
https://github.com/argonne-lcf/llm-inference-bench
LLM-Inference-Bench
benchmark deepspeed inference llamacpp llm tensorrt-llm vllm
Last synced: 01 Oct 2024
https://github.com/TimeSurgeLabs/promptproxy
Call many AIs from a single API.
ai docker huggingface llama llama2 llm openai openai-api openai-api-proxy vllm
Last synced: 01 Aug 2024
https://github.com/evilpsycho/open-llm-benchmark
Evaluate open-source language models on agent, formatted output, instruction following, long text, multilingual, coding, and custom task capabilities.
evaluation-framework huggingface large-language-models llamacpp llm-agent llms-benchmarking openai vllm
Last synced: 27 Sep 2024
https://github.com/hemanthpai/hass-ai-assistant
A Home Assistant integration to control your smart home using a local, self-hosted LLM
ai functionary hacs hacs-integration hass hassio-integration home-assistant vllm
Last synced: 27 Sep 2024