Ecosyste.ms: Awesome
An open API service indexing awesome lists of open source software.
https://github.com/NVIDIA/TensorRT-LLM
TensorRT-LLM provides users with an easy-to-use Python API to define Large Language Models (LLMs) and build TensorRT engines that contain state-of-the-art optimizations to perform inference efficiently on NVIDIA GPUs. TensorRT-LLM also contains components to create Python and C++ runtimes that execute those TensorRT engines.
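As a rough illustration of the Python API described above, a minimal generation script using TensorRT-LLM's high-level `LLM` API might look like the sketch below. This is an assumption-laden example, not a verbatim excerpt from the project: the model id is a placeholder, and running it requires `tensorrt_llm` installed on a machine with a supported NVIDIA GPU.

```python
# Minimal sketch of TensorRT-LLM's high-level Python API (assumes
# tensorrt_llm is installed and a supported NVIDIA GPU is available).
from tensorrt_llm import LLM, SamplingParams

# Engine building happens when the LLM object is created from a
# Hugging Face model directory or hub id (placeholder model name).
llm = LLM(model="TinyLlama/TinyLlama-1.1B-Chat-v1.0")

prompts = ["TensorRT-LLM is"]
sampling = SamplingParams(temperature=0.8, top_p=0.95, max_tokens=32)

# generate() runs inference through the compiled TensorRT engine.
for output in llm.generate(prompts, sampling):
    print(output.outputs[0].text)
```

The same engine can then be served through the project's C++ runtime components; the Python API above is only the model-definition and inference entry point.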
- Host: GitHub
- URL: https://github.com/NVIDIA/TensorRT-LLM
- Owner: NVIDIA
- License: apache-2.0
- Created: 2023-08-16T17:14:27.000Z (about 1 year ago)
- Default Branch: main
- Last Pushed: 2024-07-30T15:43:32.000Z (about 2 months ago)
- Last Synced: 2024-07-30T16:55:25.025Z (about 2 months ago)
- Language: C++
- Homepage: https://nvidia.github.io/TensorRT-LLM
- Size: 298 MB
- Stars: 7,727
- Watchers: 87
- Forks: 841
- Open Issues: 724
Metadata Files:
- Readme: README.md
- License: LICENSE
Awesome Lists containing this project
- awesome-local-llms - TensorRT-LLM - TensorRT-LLM provides users with an easy-to-use Python API to define Large Language Models (LLMs) and build TensorRT engines that contain state-of-the-art optimizations to perform inference efficiently on NVIDIA GPUs. TensorRT-LLM also contains components to create Python and C++ runtimes that execute those TensorRT engines. (Open-Source Local LLM Projects)
- Awesome-LLM-Productization - TensorRT-LLM - an easy-to-use Python API to define Large Language Models (LLMs) and build TensorRT engines that contain state-of-the-art optimizations to perform inference efficiently on NVIDIA GPUs. (Models and Tools / LLM Deployment)
- awesome-local-ai - TensorRT-LLM - Efficient inference on NVIDIA GPUs; Python/C++ runtimes; Text-Gen. (Inference Engine)
- awesome-llmops - TensorRT-LLM (Serving / Large Model Serving)
- awesome-llm-and-aigc - TensorRT-LLM - TensorRT-LLM provides users with an easy-to-use Python API to define Large Language Models (LLMs) and build TensorRT engines that contain state-of-the-art optimizations to perform inference efficiently on NVIDIA GPUs. TensorRT-LLM also contains components to create Python and C++ runtimes that execute those TensorRT engines. [nvidia.github.io/TensorRT-LLM](https://nvidia.github.io/TensorRT-LLM) (Summary)
- awesome-yolo-object-detection - NVIDIA/TensorRT-LLM - TensorRT-LLM provides users with an easy-to-use Python API to define Large Language Models (LLMs) and build TensorRT engines that contain state-of-the-art optimizations to perform inference efficiently on NVIDIA GPUs. TensorRT-LLM also contains components to create Python and C++ runtimes that execute those TensorRT engines. [nvidia.github.io/TensorRT-LLM](https://nvidia.github.io/TensorRT-LLM) (Lighter and Deployment Frameworks)
- awesome-LLM-resourses - TensorRT-LLM - TensorRT-LLM provides users with an easy-to-use Python API to define Large Language Models (LLMs) and build TensorRT engines that contain state-of-the-art optimizations to perform inference efficiently on NVIDIA GPUs. (Inference)
- StarryDivineSky - NVIDIA/TensorRT-LLM - TensorRT-LLM provides users with an easy-to-use Python API to define Large Language Models (LLMs) and build TensorRT engines containing state-of-the-art optimizations for efficient inference on NVIDIA GPUs. TensorRT-LLM also contains components to create Python runtimes, as well as C++ runtimes that execute those TensorRT engines. (Text generation, text dialogue / large language dialogue models and data)
- Awesome-LLM - TensorRT-LLM - Nvidia Framework for LLM Inference (LLM Deployment)
- Awesome-LLM-Compression - TensorRT-LLM (Code)
- awesome-cuda-and-hpc - TensorRT-LLM - TensorRT-LLM provides users with an easy-to-use Python API to define Large Language Models (LLMs) and build TensorRT engines that contain state-of-the-art optimizations to perform inference efficiently on NVIDIA GPUs. TensorRT-LLM also contains components to create Python and C++ runtimes that execute those TensorRT engines. [nvidia.github.io/TensorRT-LLM](https://nvidia.github.io/TensorRT-LLM) (Frameworks)
- AiTreasureBox - NVIDIA/TensorRT-LLM - TensorRT-LLM provides users with an easy-to-use Python API to define Large Language Models (LLMs) and build TensorRT engines that contain state-of-the-art optimizations to perform inference efficiently on NVIDIA GPUs. TensorRT-LLM also contains components to create Python and C++ runtimes that execute those TensorRT engines. (Repos)