Ecosyste.ms: Awesome

An open API service indexing awesome lists of open source software.

Projects in Awesome Lists tagged with tensorrt-llm

A curated list of projects in awesome lists tagged with tensorrt-llm.

https://github.com/janhq/cortex.cpp

Local AI API Platform

gguf llamacpp onnx onnxruntime tensorrt-llm

Last synced: 18 Dec 2024

https://github.com/shashikg/whispers2t

An Optimized Speech-to-Text Pipeline for the Whisper Model Supporting Multiple Inference Engines

asr deep-learning speech-recognition speech-to-text tensorrt tensorrt-llm vad voice-activity-detection whisper

Last synced: 20 Dec 2024

https://github.com/shashikg/WhisperS2T

An Optimized Speech-to-Text Pipeline for the Whisper Model Supporting Multiple Inference Engines

asr deep-learning speech-recognition speech-to-text tensorrt tensorrt-llm vad voice-activity-detection whisper

Last synced: 14 Nov 2024

https://github.com/huggingface/optimum-benchmark

🏋️ A unified multi-backend utility for benchmarking Transformers, Timm, PEFT, Diffusers and Sentence-Transformers with full support of Optimum's hardware optimizations & quantization schemes.

benchmark neural-compressor onnxruntime openvino pytorch tensorrt-llm text-generation-inference

Last synced: 04 Dec 2024

https://github.com/netease-media/grps

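[Deep learning model deployment framework] Supports tf/torch/trt/trtllm/vllm and other NN frameworks, dynamic batching and streaming modes, and both Python and C++. It is limitable, extensible, and high-performance. It helps users quickly deploy models online and serve them through HTTP/RPC interfaces.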

dynamic-batching serving tensorflow tensorrt tensorrt-llm torch triton-inference-server vllm

Last synced: 22 Dec 2024

https://github.com/netease-media/grps_trtllm

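[grps integration with trtllm] A pure C++ high-performance OpenAI LLM service built with GRPS + TensorRT-LLM + Tokenizers.cpp. Supports chat and function-call modes, AI agents, distributed multi-GPU inference, multimodal models, and a Gradio chat interface.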

ai-agent chatglm function-call internvl2 llama-index llama3 llm multi-modal openai qwen-vl qwen2 tensorrt-llm

Last synced: 17 Dec 2024

https://github.com/guidance-ai/llgtrt

TensorRT-LLM server with Structured Outputs (JSON) built with Rust

cfg guidance json openai-api regex structured-generation tensorrt-llm

Last synced: 17 Nov 2024

https://github.com/zrzrzrzrzrzrzr/lm-fly

Accelerating large-model inference frameworks to make LLMs fly

llm llm-inference mlx openvino tensorrt-llm tgi vllm

Last synced: 29 Oct 2024

https://github.com/j3soon/llm-tutorial

LLM tutorial materials, including but not limited to NVIDIA NeMo, TensorRT-LLM, Triton Inference Server, and NeMo Guardrails.

llm nemo nemo-guardrails nvidia-nemo tensorrt-llm

Last synced: 07 Dec 2024