Ecosyste.ms: Awesome
An open API service indexing awesome lists of open source software.
https://github.com/NVIDIA/TensorRT-LLM
TensorRT-LLM provides users with an easy-to-use Python API to define Large Language Models (LLMs) and build TensorRT engines that contain state-of-the-art optimizations to perform inference efficiently on NVIDIA GPUs. TensorRT-LLM also contains components to create Python and C++ runtimes that execute those TensorRT engines.
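The description refers to TensorRT-LLM's high-level Python `LLM` API. A minimal sketch of that workflow, assuming the `tensorrt_llm` package is installed on a machine with a supported NVIDIA GPU; the model checkpoint name and sampling values are illustrative assumptions, not recommendations:

```python
# Sketch of TensorRT-LLM's high-level Python `LLM` API.
# Requires the `tensorrt_llm` package and a supported NVIDIA GPU,
# so the import is deferred to call time.

def generate_with_trt_llm(prompts):
    """Compile a model into a TensorRT engine and run batched inference."""
    from tensorrt_llm import LLM, SamplingParams  # GPU-only dependency

    # Any supported Hugging Face checkpoint; this name is an illustrative
    # assumption. TensorRT-LLM compiles it into an optimized TensorRT
    # engine before serving requests.
    llm = LLM(model="TinyLlama/TinyLlama-1.1B-Chat-v1.0")

    # Decoding settings are placeholders for demonstration.
    sampling = SamplingParams(temperature=0.8, top_p=0.95, max_tokens=64)

    # generate() takes a list of prompt strings and returns one result
    # per prompt, each holding the generated completions.
    outputs = llm.generate(prompts, sampling)
    return [out.outputs[0].text for out in outputs]
```

The same engine can then be served through the project's C++ runtime; the helper above only covers the Python path.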
- Host: GitHub
- URL: https://github.com/NVIDIA/TensorRT-LLM
- Owner: NVIDIA
- License: apache-2.0
- Created: 2023-08-16T17:14:27.000Z (over 1 year ago)
- Default Branch: main
- Last Pushed: 2024-10-15T07:28:53.000Z (3 months ago)
- Last Synced: 2024-10-16T08:38:06.934Z (3 months ago)
- Language: C++
- Homepage: https://nvidia.github.io/TensorRT-LLM
- Size: 372 MB
- Stars: 8,422
- Watchers: 92
- Forks: 950
- Open Issues: 767
Metadata Files:
- Readme: README.md
- License: LICENSE
Awesome Lists containing this project
- Awesome-LLM-Productization - TensorRT-LLM - an easy-to-use Python API to define Large Language Models (LLMs) and build TensorRT engines that contain state-of-the-art optimizations to perform inference efficiently on NVIDIA GPUs. (Models and Tools / LLM Deployment)
- awesome-local-ai - TensorRT-LLM - Performs inference efficiently on NVIDIA GPUs | Python / C++ runtimes | Both | ❌ | Python/C++ | Text-Gen | (Inference Engine)
- awesome-llmops - TensorRT-LLM (Serving / Large Model Serving)
- awesome-LLM-resourses - TensorRT-LLM - TensorRT-LLM provides users with an easy-to-use Python API to define Large Language Models (LLMs) and build TensorRT engines that contain state-of-the-art optimizations to perform inference efficiently on NVIDIA GPUs. (Inference)
- StarryDivineSky - NVIDIA/TensorRT-LLM - TensorRT-LLM provides users with an easy-to-use Python API for defining Large Language Models (LLMs) and building TensorRT engines that contain state-of-the-art optimizations for efficient inference on NVIDIA GPUs. TensorRT-LLM also contains components for creating Python runtimes, as well as C++ runtimes that execute those TensorRT engines. (A01_Text Generation_Text Dialogue / Large-language dialogue models and data)
- Awesome-LLM - TensorRT-LLM - Nvidia Framework for LLM Inference (LLM Deployment)
- Awesome-LLM-Compression - TensorRT-LLM - Code
- awesome-local-llms - TensorRT-LLM - TensorRT-LLM provides users with an easy-to-use Python API to define Large Language Models (LLMs) and build TensorRT engines that contain state-of-the-art optimizations to perform inference efficiently on NVIDIA GPUs. TensorRT-LLM also contains components to create Python and C++ runtimes that execute those TensorRT engines. | 8,543 | 969 | 790 | 16 | 9 | Apache License 2.0 | 2 days, 23 hrs, 33 mins | (Open-Source Local LLM Projects)
- awesome-llm-and-aigc - TensorRT-LLM - TensorRT-LLM provides users with an easy-to-use Python API to define Large Language Models (LLMs) and build TensorRT engines that contain state-of-the-art optimizations to perform inference efficiently on NVIDIA GPUs. TensorRT-LLM also contains components to create Python and C++ runtimes that execute those TensorRT engines. [nvidia.github.io/TensorRT-LLM](https://nvidia.github.io/TensorRT-LLM) (Summary)
- awesome-yolo-object-detection - NVIDIA/TensorRT-LLM - TensorRT-LLM provides users with an easy-to-use Python API to define Large Language Models (LLMs) and build TensorRT engines that contain state-of-the-art optimizations to perform inference efficiently on NVIDIA GPUs. TensorRT-LLM also contains components to create Python and C++ runtimes that execute those TensorRT engines. [nvidia.github.io/TensorRT-LLM](https://nvidia.github.io/TensorRT-LLM) (Lighter and Deployment Frameworks)
- awesome-ai-papers - \[[TensorRT-LLM](https://github.com/NVIDIA/TensorRT-LLM)\]\[[triton-inference-server/server](https://github.com/triton-inference-server/server)\]\[[GenerativeAIExamples](https://github.com/NVIDIA/GenerativeAIExamples)\]\[[TensorRT-Model-Optimizer](https://github.com/NVIDIA/TensorRT-Model-Optimizer)\]\[[TensorRT](https://github.com/NVIDIA/TensorRT)\]\[[OpenVINO](https://github.com/openvinotoolkit/openvino)\] (NLP / 3. Pretraining)
- awesome-cuda-triton-hpc - TensorRT-LLM - TensorRT-LLM provides users with an easy-to-use Python API to define Large Language Models (LLMs) and build TensorRT engines that contain state-of-the-art optimizations to perform inference efficiently on NVIDIA GPUs. TensorRT-LLM also contains components to create Python and C++ runtimes that execute those TensorRT engines. [nvidia.github.io/TensorRT-LLM](https://nvidia.github.io/TensorRT-LLM) (Frameworks)
- alan_awesome_llm - TensorRT-LLM - TensorRT-LLM provides users with an easy-to-use Python API to define Large Language Models (LLMs) and build TensorRT engines that contain state-of-the-art optimizations to perform inference efficiently on NVIDIA GPUs. (Inference)
- AiTreasureBox - NVIDIA/TensorRT-LLM - TensorRT-LLM provides users with an easy-to-use Python API to define Large Language Models (LLMs) and build TensorRT engines that contain state-of-the-art optimizations to perform inference efficiently on NVIDIA GPUs. TensorRT-LLM also contains components to create Python and C++ runtimes that execute those TensorRT engines. (Repos)
- Awesome-LLMOps - TensorRT-LLM - TensorRT-LLM provides users with an easy-to-use Python API to define Large Language Models (LLMs) and build TensorRT engines that contain state-of-the-art optimizations to perform inference efficiently on NVIDIA GPUs. (Inference)
- awesome-ai-papers - \[[TensorRT-LLM](https://github.com/NVIDIA/TensorRT-LLM)\]\[[triton-inference-server/server](https://github.com/triton-inference-server/server)\]\[[GenerativeAIExamples](https://github.com/NVIDIA/GenerativeAIExamples)\]\[[TensorRT-Model-Optimizer](https://github.com/NVIDIA/TensorRT-Model-Optimizer)\]\[[TensorRT](https://github.com/NVIDIA/TensorRT)\]\[[TransformerEngine](https://github.com/NVIDIA/TransformerEngine)\]\[[OpenVINO](https://github.com/openvinotoolkit/openvino)\] (NLP / 3. Pretraining)