{"id":13443587,"url":"https://github.com/NVIDIA-AI-IOT/torch2trt","last_synced_at":"2025-03-20T16:32:07.997Z","repository":{"id":37493498,"uuid":"183790380","full_name":"NVIDIA-AI-IOT/torch2trt","owner":"NVIDIA-AI-IOT","description":"An easy to use PyTorch to TensorRT converter","archived":false,"fork":false,"pushed_at":"2024-08-17T08:59:30.000Z","size":7897,"stargazers_count":4579,"open_issues_count":338,"forks_count":675,"subscribers_count":74,"default_branch":"master","last_synced_at":"2024-10-10T17:41:24.740Z","etag":null,"topics":["classification","inference","jetson-nano","jetson-tx2","jetson-xavier","pytorch","tensorrt"],"latest_commit_sha":null,"homepage":"","language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/NVIDIA-AI-IOT.png","metadata":{"files":{"readme":"README.md","changelog":"CHANGELOG.md","contributing":"CONTRIBUTING.md","funding":null,"license":"LICENSE.md","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2019-04-27T15:30:56.000Z","updated_at":"2024-10-10T01:16:18.000Z","dependencies_parsed_at":"2024-06-19T02:57:25.405Z","dependency_job_id":"aac45bc2-93d8-4b40-a941-8aeb5c637a5c","html_url":"https://github.com/NVIDIA-AI-IOT/torch2trt","commit_stats":null,"previous_names":[],"tags_count":5,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/NVIDIA-AI-IOT%2Ftorch2trt","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/NVIDIA-AI-IOT%2Ftorch2trt/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/NVIDIA-AI-IOT%2Ftorch2trt/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/N
VIDIA-AI-IOT%2Ftorch2trt/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/NVIDIA-AI-IOT","download_url":"https://codeload.github.com/NVIDIA-AI-IOT/torch2trt/tar.gz/refs/heads/master","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":221780076,"owners_count":16879040,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["classification","inference","jetson-nano","jetson-tx2","jetson-xavier","pytorch","tensorrt"],"created_at":"2024-07-31T03:02:04.443Z","updated_at":"2024-10-28T04:31:22.954Z","avatar_url":"https://github.com/NVIDIA-AI-IOT.png","language":"Python","funding_links":[],"categories":["Toolbox","Python","Frameworks \u0026 libraries","Deep Learning Framework","其他_机器学习与深度学习","Lighter and Deployment Frameworks","Applications","Tools","Toolkits \u0026 Libraries"],"sub_categories":["TensorFlow","Machine learning","High-Level DL APIs","Hard-ware Integration","Deployment \u0026 Inference"],"readme":"# torch2trt\n\n\u003e What models are you using, or hoping to use, with TensorRT?  Feel free to join the discussion [here](https://github.com/NVIDIA-AI-IOT/torch2trt/discussions/531).\n \n\u003ca href=\"https://nvidia-ai-iot.github.io/torch2trt\"\u003e\u003cimg src=\"https://img.shields.io/badge/-Documentation-brightgreen\"/\u003e\u003c/a\u003e\n\ntorch2trt is a PyTorch to TensorRT converter which utilizes the \nTensorRT Python API.  
The converter is\n\n* Easy to use - Convert modules with a single function call ``torch2trt``\n\n* Easy to extend - Write your own layer converter in Python and register it with ``@tensorrt_converter``\n\nIf you find an issue, please [let us know](../..//issues)!\n\n\u003e Please note, this converter has limited coverage of TensorRT / PyTorch.  We created it primarily\n\u003e to easily optimize the models used in the [JetBot](https://github.com/NVIDIA-AI-IOT/jetbot) project.  If you find the converter helpful with other models, please [let us know](../..//issues).\n\n## Usage\n\nBelow are some usage examples; for more, check out the [notebooks](notebooks).\n\n### Convert\n\n```python\nimport torch\nfrom torch2trt import torch2trt\nfrom torchvision.models.alexnet import alexnet\n\n# create some regular pytorch model...\nmodel = alexnet(pretrained=True).eval().cuda()\n\n# create example data\nx = torch.ones((1, 3, 224, 224)).cuda()\n\n# convert to TensorRT feeding sample data as input\nmodel_trt = torch2trt(model, [x])\n```\n\n### Execute\n\nWe can execute the returned ``TRTModule`` just like the original PyTorch model.\n\n```python\ny = model(x)\ny_trt = model_trt(x)\n\n# check the output against PyTorch\nprint(torch.max(torch.abs(y - y_trt)))\n```\n\n### Save and load\n\nWe can save the model as a ``state_dict``.\n\n```python\ntorch.save(model_trt.state_dict(), 'alexnet_trt.pth')\n```\n\nWe can load the saved model into a ``TRTModule``.\n\n```python\nfrom torch2trt import TRTModule\n\nmodel_trt = TRTModule()\n\nmodel_trt.load_state_dict(torch.load('alexnet_trt.pth'))\n```\n\n## Models\n\nWe tested the converter against these models using the [test.sh](test.sh) script.  You can generate the results by calling\n\n```bash\n./test.sh TEST_OUTPUT.md\n```\n\n\u003e The results below show the throughput in FPS.  
You can find the raw output, which includes latency, in the [benchmarks folder](benchmarks).\n\n| Model | Nano (PyTorch) | Nano (TensorRT) | Xavier (PyTorch) | Xavier (TensorRT) |\n|-------|:--------------:|:---------------:|:----------------:|:-----------------:|\n| alexnet | 46.4 | 69.9 | 250 | 580 |\n| squeezenet1_0 | 44 | 137 | 130 | 890 |\n| squeezenet1_1 | 76.6 | 248 | 132 | 1390 |\n| resnet18 | 29.4 | 90.2 | 140 | 712 |\n| resnet34 | 15.5 | 50.7 | 79.2 | 393 |\n| resnet50 | 12.4 | 34.2 | 55.5 | 312 |\n| resnet101 | 7.18 | 19.9 | 28.5 | 170 |\n| resnet152 | 4.96 | 14.1 | 18.9 | 121 |\n| densenet121 | 11.5 | 41.9 | 23.0 | 168 |\n| densenet169 | 8.25 | 33.2 | 16.3 | 118 |\n| densenet201 | 6.84 | 25.4 | 13.3 | 90.9 |\n| densenet161 | 4.71 | 15.6 | 17.2 | 82.4 |\n| vgg11 | 8.9 | 18.3 | 85.2 | 201 |\n| vgg13 | 6.53 | 14.7 | 71.9 | 166 |\n| vgg16 | 5.09 | 11.9 | 61.7 | 139 |\n| vgg19 |  |  | 54.1 | 121 |\n| vgg11_bn | 8.74 | 18.4 | 81.8 | 201 |\n| vgg13_bn | 6.31 | 14.8 | 68.0 | 166 |\n| vgg16_bn | 4.96 | 12.0 | 58.5 | 140 |\n| vgg19_bn |  |  | 51.4 | 121 |\n\n\n## Setup\n\n\u003e Note: torch2trt depends on the TensorRT Python API.  On Jetson, this is included with the latest JetPack.  For desktop, please follow the [TensorRT Installation Guide](https://docs.nvidia.com/deeplearning/tensorrt/install-guide/index.html).  You may also try installing torch2trt inside one of the NGC PyTorch docker containers for [Desktop](https://ngc.nvidia.com/catalog/containers/nvidia:pytorch) or [Jetson](https://ngc.nvidia.com/catalog/containers/nvidia:l4t-pytorch).\n\n### Step 1 - Install the torch2trt Python library\n\nTo install the torch2trt Python library, call the following\n\n```bash\ngit clone https://github.com/NVIDIA-AI-IOT/torch2trt\ncd torch2trt\npython setup.py install\n```\n\n### Step 2 (optional) - Install the torch2trt plugins library\n\nTo install the torch2trt plugins library, call the following\n\n```bash\ncmake -B build . 
\u0026\u0026 cmake --build build --target install \u0026\u0026 ldconfig\n```\n\nThis includes support for some layers that may not be supported natively by TensorRT.  Once this library is found in the system, the associated layer converters in torch2trt are implicitly enabled.\n\n\u003e Note: torch2trt now maintains plugins as an independent library compiled with CMake.  This makes compiled TensorRT engines more portable.  If needed, the deprecated plugins (which depend on PyTorch) may still be installed by calling ``python setup.py install --plugins``.\n\n### Step 3 (optional) - Install experimental community contributed features\n\nTo install torch2trt with the experimental community-contributed features under ``torch2trt.contrib``, such as Quantization Aware Training (QAT, requires TensorRT\u003e=7.0), call the following:\n\n```bash\ngit clone https://github.com/NVIDIA-AI-IOT/torch2trt\ncd torch2trt/scripts\nbash build_contrib.sh\n```\n\nThis enables you to run the QAT example located [here](examples/contrib/quantization_aware_training).\n\n## How does it work?\n\nThis converter works by attaching conversion functions (like ``convert_ReLU``) to the original\nPyTorch functional calls (like ``torch.nn.ReLU.forward``).  The sample input data is passed\nthrough the network, just as before, except now whenever a registered function (``torch.nn.ReLU.forward``)\nis encountered, the corresponding converter (``convert_ReLU``) is also called afterwards.  The converter\nis passed the arguments and return value of the original PyTorch function, as well as the TensorRT\nnetwork that is being constructed.  The input tensors to the original PyTorch function are modified to\nhave an attribute ``_trt``, which is the TensorRT counterpart to the PyTorch tensor.  The conversion function\nuses this ``_trt`` to add layers to the TensorRT network, and then sets the ``_trt`` attribute for\nrelevant output tensors.  
Once the model is fully executed, the final returned tensors are marked as outputs\nof the TensorRT network, and the optimized TensorRT engine is built.\n\n## How to add (or override) a converter\n\nHere we show how to add a converter for the ``ReLU`` module using the TensorRT\nPython API.\n\n```python\nimport tensorrt as trt\nfrom torch2trt import tensorrt_converter\n\n@tensorrt_converter('torch.nn.ReLU.forward')\ndef convert_ReLU(ctx):\n    # method_args[0] is the module instance (self); the input tensor is at index 1\n    input = ctx.method_args[1]\n    output = ctx.method_return\n    layer = ctx.network.add_activation(input=input._trt, type=trt.ActivationType.RELU)\n    output._trt = layer.get_output(0)\n```\n\nThe converter takes one argument, a ``ConversionContext``, which contains\nthe following:\n\n* ``ctx.network`` - The TensorRT network that is being constructed.\n* ``ctx.method_args`` - Positional arguments that were passed to the specified PyTorch function.  The ``_trt`` attribute is set for relevant input tensors.\n* ``ctx.method_kwargs`` - Keyword arguments that were passed to the specified PyTorch function.\n* ``ctx.method_return`` - The value returned by the specified PyTorch function.  
The converter must set the ``_trt`` attribute where relevant.\n\nPlease see [this folder](torch2trt/converters) for more examples.\n\n## See also\n\n- [JetBot](http://github.com/NVIDIA-AI-IOT/jetbot) - An educational AI robot based on NVIDIA Jetson Nano\n\n- [JetRacer](http://github.com/NVIDIA-AI-IOT/jetracer) - An educational AI racecar using NVIDIA Jetson Nano\n- [JetCam](http://github.com/NVIDIA-AI-IOT/jetcam) - An easy to use Python camera interface for NVIDIA Jetson\n- [JetCard](http://github.com/NVIDIA-AI-IOT/jetcard) - An SD card image for web programming AI projects with NVIDIA Jetson Nano\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2FNVIDIA-AI-IOT%2Ftorch2trt","html_url":"https://awesome.ecosyste.ms/projects/github.com%2FNVIDIA-AI-IOT%2Ftorch2trt","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2FNVIDIA-AI-IOT%2Ftorch2trt/lists"}