{"id":14958079,"url":"https://github.com/pytorch/tensorrt","last_synced_at":"2026-04-16T20:01:41.848Z","repository":{"id":37017915,"uuid":"246634306","full_name":"pytorch/TensorRT","owner":"pytorch","description":"PyTorch/TorchScript/FX compiler for NVIDIA GPUs using TensorRT","archived":false,"fork":false,"pushed_at":"2025-05-11T01:00:13.000Z","size":168836,"stargazers_count":2744,"open_issues_count":231,"forks_count":364,"subscribers_count":71,"default_branch":"main","last_synced_at":"2025-05-11T03:39:38.357Z","etag":null,"topics":["cuda","deep-learning","jetson","libtorch","machine-learning","nvidia","pytorch","tensorrt"],"latest_commit_sha":null,"homepage":"https://pytorch.org/TensorRT","language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"bsd-3-clause","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/pytorch.png","metadata":{"files":{"readme":"README.md","changelog":"CHANGELOG.md","contributing":"CONTRIBUTING.md","funding":null,"license":"LICENSE","code_of_conduct":"CODE_OF_CONDUCT.md","threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null,"zenodo":null}},"created_at":"2020-03-11T17:17:43.000Z","updated_at":"2025-05-10T21:57:22.000Z","dependencies_parsed_at":"2022-07-12T16:13:19.422Z","dependency_job_id":"1c5182b0-d9b7-48e3-ba64-7b94822b8c53","html_url":"https://github.com/pytorch/TensorRT","commit_stats":{"total_commits":3439,"total_committers":127,"mean_commits":"27.078740157480315","dds":0.7871474265774935,"last_synced_commit":"43eb5605349b09ea5aa25a1c010345df012a25ad"},"previous_names":["nvidia/torch-tensorrt","nvidia/trtorch"],"tags_count":61,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/pytorch%2FTensorRT","tags_url":"https://repos.ecosyste.ms/api/v
1/hosts/GitHub/repositories/pytorch%2FTensorRT/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/pytorch%2FTensorRT/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/pytorch%2FTensorRT/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/pytorch","download_url":"https://codeload.github.com/pytorch/TensorRT/tar.gz/refs/heads/main","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":253514555,"owners_count":21920334,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["cuda","deep-learning","jetson","libtorch","machine-learning","nvidia","pytorch","tensorrt"],"created_at":"2024-09-24T13:16:11.682Z","updated_at":"2026-01-17T17:37:29.186Z","avatar_url":"https://github.com/pytorch.png","language":"Python","funding_links":[],"categories":[],"sub_categories":[],"readme":"\u003cdiv align=\"center\"\u003e\n\nTorch-TensorRT\n===========================\n\u003ch4\u003e Easily achieve the best inference performance for any PyTorch model on the NVIDIA platform. 
\u003c/h4\u003e\n\n[![Documentation](https://img.shields.io/badge/docs-master-brightgreen)](https://nvidia.github.io/Torch-TensorRT/)\n[![pytorch](https://img.shields.io/badge/PyTorch-2.11-green)](https://download.pytorch.org/whl/nightly/cu130)\n[![cuda](https://img.shields.io/badge/CUDA-13.0-green)](https://developer.nvidia.com/cuda-downloads)\n[![trt](https://img.shields.io/badge/TensorRT-10.14.1-green)](https://github.com/nvidia/tensorrt)\n[![license](https://img.shields.io/badge/license-BSD--3--Clause-blue)](./LICENSE)\n[![Linux x86-64 Nightly Wheels](https://github.com/pytorch/TensorRT/actions/workflows/build-test-linux-x86_64.yml/badge.svg?branch=nightly)](https://github.com/pytorch/TensorRT/actions/workflows/build-test-linux-x86_64.yml)\n[![Linux SBSA Nightly Wheels](https://github.com/pytorch/TensorRT/actions/workflows/build-test-linux-aarch64.yml/badge.svg?branch=nightly)](https://github.com/pytorch/TensorRT/actions/workflows/build-test-linux-aarch64.yml)\n[![Windows Nightly Wheels](https://github.com/pytorch/TensorRT/actions/workflows/build-test-windows.yml/badge.svg?branch=nightly)](https://github.com/pytorch/TensorRT/actions/workflows/build-test-windows.yml)\n\n---\n\u003cdiv align=\"left\"\u003e\n\nTorch-TensorRT brings the power of TensorRT to PyTorch. 
Reduce inference latency by up to 5x compared to eager execution in just one line of code.\n\u003c/div\u003e\u003c/div\u003e\n\n## Installation\nStable versions of Torch-TensorRT are published on PyPI:\n```bash\npip install torch-tensorrt\n```\n\nNightly versions of Torch-TensorRT are published on the PyTorch package index:\n```bash\npip install --pre torch-tensorrt --index-url https://download.pytorch.org/whl/nightly/cu130\n```\n\nTorch-TensorRT is also distributed in the ready-to-run [NVIDIA NGC PyTorch Container](https://catalog.ngc.nvidia.com/orgs/nvidia/containers/pytorch), which includes all dependencies at the proper versions along with example notebooks.\n\nFor more advanced installation methods, see the [installation guide](https://pytorch.org/TensorRT/getting_started/installation.html).\n\n## Quickstart\n\n### Option 1: torch.compile\nYou can use Torch-TensorRT anywhere you use `torch.compile`:\n\n```python\nimport torch\nimport torch_tensorrt\n\nmodel = MyModel().eval().cuda() # define your model here\nx = torch.randn((1, 3, 224, 224)).cuda() # define what the inputs to the model will look like\n\noptimized_model = torch.compile(model, backend=\"tensorrt\")\noptimized_model(x) # compiled on first run\n\noptimized_model(x) # this will be fast!\n```\n\n### Option 2: Export\nIf you want to optimize your model ahead-of-time and/or deploy in a C++ environment, Torch-TensorRT provides an export-style workflow that serializes an optimized module. This module can be deployed in PyTorch or with libtorch (i.e. without a Python dependency).\n\n#### Step 1: Optimize + serialize\n```python\nimport torch\nimport torch_tensorrt\n\nmodel = MyModel().eval().cuda() # define your model here\ninputs = [torch.randn((1, 3, 224, 224)).cuda()] # define a list of representative inputs here\n\ntrt_gm = torch_tensorrt.compile(model, ir=\"dynamo\", inputs=inputs)\ntorch_tensorrt.save(trt_gm, \"trt.ep\", inputs=inputs) # PyTorch only supports Python runtime for an ExportedProgram. 
For C++ deployment, use a TorchScript file\ntorch_tensorrt.save(trt_gm, \"trt.ts\", output_format=\"torchscript\", inputs=inputs)\n```\n\n#### Step 2: Deploy\n##### Deployment in PyTorch:\n```python\nimport torch\nimport torch_tensorrt\n\ninputs = [torch.randn((1, 3, 224, 224)).cuda()] # your inputs go here\n\n# You can run this in a new python session!\nmodel = torch.export.load(\"trt.ep\").module()\n# model = torch_tensorrt.load(\"trt.ep\").module() # this also works\nmodel(*inputs)\n```\n\n##### Deployment in C++:\n```cpp\n#include \"torch/script.h\"\n#include \"torch_tensorrt/torch_tensorrt.h\"\n\nauto trt_mod = torch::jit::load(\"trt.ts\");\nauto input_tensor = [...]; // fill this with your inputs\nauto results = trt_mod.forward({input_tensor});\n```\n\n## Further resources\n- [Double PyTorch Inference Speed for Diffusion Models Using Torch-TensorRT](https://developer.nvidia.com/blog/double-pytorch-inference-speed-for-diffusion-models-using-torch-tensorrt/)\n- [Up to 50% faster Stable Diffusion inference with one line of code](https://pytorch.org/TensorRT/tutorials/_rendered_examples/dynamo/torch_compile_stable_diffusion.html#sphx-glr-tutorials-rendered-examples-dynamo-torch-compile-stable-diffusion-py)\n- [Optimize LLMs from Hugging Face with Torch-TensorRT](https://docs.pytorch.org/TensorRT/tutorials/compile_hf_models.html#compile-hf-models)\n- [Run your model in FP8 with Torch-TensorRT](https://pytorch.org/TensorRT/tutorials/_rendered_examples/dynamo/vgg16_fp8_ptq.html)\n- [Accelerated Inference in PyTorch 2.X with Torch-TensorRT](https://www.youtube.com/watch?v=eGDMJ3MY4zk\u0026t=1s)\n- [Tools to resolve graph breaks and boost performance]() \\[coming soon\\]\n- [Tech Talk (GTC '23)](https://www.nvidia.com/en-us/on-demand/session/gtcspring23-s51714/)\n- [Documentation](https://nvidia.github.io/Torch-TensorRT/)\n\n\n## Platform Support\n\n| Platform            | Support                                          |\n| ------------------- | 
------------------------------------------------ |\n| Linux AMD64 / GPU   | **Supported**                                    |\n| Linux SBSA / GPU    | **Supported**                                    |\n| Windows / GPU       | **Supported (Dynamo only)**                      |\n| Linux Jetson / GPU | **Source Compilation Supported on JetPack-4.4+**  |\n| Linux Jetson / DLA | **Source Compilation Supported on JetPack-4.4+**  |\n| Linux ppc64le / GPU | Not supported                                    |\n\n\u003e Note: Refer to the [NVIDIA L4T PyTorch NGC container](https://ngc.nvidia.com/catalog/containers/nvidia:l4t-pytorch) for PyTorch libraries on JetPack.\n\n### Dependencies\n\nThe test cases are verified against the following dependencies. Torch-TensorRT can work with other versions, but the tests are not guaranteed to pass.\n\n- Bazel 8.1.1\n- Libtorch 2.11.0.dev (latest nightly)\n- CUDA 13.0 (CUDA 12.6 on Jetson)\n- TensorRT 10.14.1.48 (TensorRT 10.3 on Jetson)\n\n## Deprecation Policy\n\nDeprecation is used to inform developers that some APIs and tools are no longer recommended for use. Beginning with version 2.3, Torch-TensorRT has the following deprecation policy:\n\nDeprecation notices are communicated in the Release Notes. Deprecated API functions will have a statement in the source documenting when they were deprecated. Deprecated methods and classes will issue deprecation warnings at runtime if they are used. Torch-TensorRT provides a 6-month migration period after the deprecation. APIs and tools continue to work during the migration period. After the migration period ends, APIs and tools are removed in a manner consistent with semantic versioning.\n\n## Contributing\n\nTake a look at [CONTRIBUTING.md](CONTRIBUTING.md).\n\n\n## License\n\nThe Torch-TensorRT license can be found in the [LICENSE](./LICENSE) file. 
It is licensed under the BSD 3-Clause license.\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fpytorch%2Ftensorrt","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fpytorch%2Ftensorrt","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fpytorch%2Ftensorrt/lists"}