{"id":13576263,"url":"https://github.com/parlaynu/inference-tensorrt","last_synced_at":"2025-06-12T11:33:49.422Z","repository":{"id":225717872,"uuid":"726695082","full_name":"parlaynu/inference-tensorrt","owner":"parlaynu","description":"Convert ONNX models to TensorRT engines and run inference in containerized environments","archived":false,"fork":false,"pushed_at":"2024-03-03T21:49:12.000Z","size":10,"stargazers_count":10,"open_issues_count":0,"forks_count":0,"subscribers_count":2,"default_branch":"main","last_synced_at":"2025-05-09T02:24:41.307Z","etag":null,"topics":["docker","jetson-nano","nvidia-gpu","onnx","python","pyzmq","tensorrt-inference","zeromq"],"latest_commit_sha":null,"homepage":"","language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/parlaynu.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null}},"created_at":"2023-12-03T05:13:00.000Z","updated_at":"2025-02-28T11:08:17.000Z","dependencies_parsed_at":"2024-03-03T22:49:22.740Z","dependency_job_id":null,"html_url":"https://github.com/parlaynu/inference-tensorrt","commit_stats":null,"previous_names":["parlaynu/inference-tensorrt"],"tags_count":0,"template":false,"template_full_name":null,"purl":"pkg:github/parlaynu/inference-tensorrt","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/parlaynu%2Finference-tensorrt","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/parlaynu%2Finference-tensorrt/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/parlaynu%2Finference-tensorrt/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/
hosts/GitHub/repositories/parlaynu%2Finference-tensorrt/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/parlaynu","download_url":"https://codeload.github.com/parlaynu/inference-tensorrt/tar.gz/refs/heads/main","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/parlaynu%2Finference-tensorrt/sbom","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":259456091,"owners_count":22860484,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["docker","jetson-nano","nvidia-gpu","onnx","python","pyzmq","tensorrt-inference","zeromq"],"created_at":"2024-08-01T15:01:08.589Z","updated_at":"2025-06-12T11:33:49.343Z","avatar_url":"https://github.com/parlaynu.png","language":"Python","funding_links":[],"categories":["Python"],"sub_categories":[],"readme":"# Inference Using Nvidia TensorRT\n\nThis repository has tools and guidelines for converting ONNX models to [TensorRT](https://developer.nvidia.com/tensorrt)\nengines and running classification inference using the exported model. \n\nThe tools include:\n\n* bash script wrapper for trtexec \n* running inference using the exported engine\n\nThe `tools` directory contains the source code in Python for the onnx2trt conversion and the inference. It builds \non the tools in [inference-onnx](https://github.com/parlaynu/inference-onnx). 
Models converted to ONNX using the \n`inference-onnx` project can be used as input to the tools here.\n\nThe `platforms` directory contains the tooling to build docker images with the tools and packages to\nrun the conversion and inference.\n\nEach platform needs to do its own conversion as the TensorRT engine is a binary format matched to the GPU\non the system.\n\n## The Tools\n\n### Convert ONNX to TensorRT\n\nThis tool converts an ONNX model to a TensorRT engine. It is a wrapper script around the Nvidia tool `trtexec`.\n\nThe full usage is:\n\n    $  ./onnx2trt.sh \n    Usage: onnx2trt.sh model.onnx\n\nThe containers built by the platform tools mount a directory called `models` from the host file system which\ncan be used as the source for ONNX model files.\n\n### Running Inference\n\nThe tool `classify-trt.py` runs inference on the exported TensorRT engine. The full usage is:\n\n    $ ./classify-trt.py -h\n    usage: classify-trt.py [-h] [-l LIMIT] [-r RATE] engine dataspec\n    \n    positional arguments:\n      engine                path to the tensorrt engine file\n      dataspec              the data source specification\n    \n    options:\n      -h, --help            show this help message and exit\n      -l LIMIT, --limit LIMIT\n                            maximum number of images to process\n      -r RATE, --rate RATE  requests per second\n\nA simple run using a camera server from the `inference-onnx` project looks like this:\n\n    $ ./classify-trt.py -l 10 ../models/resnet18-1x3x224x224.trt tcp://192.168.24.31:8089\n    loading engine...\n    - input shape: [1, 3, 224, 224]\n    - output shape: [1, 1000]\n    00 image_0000 640x480x3\n       315 @ 51.36\n    01 image_0001 640x480x3\n       315 @ 25.14\n    02 image_0002 640x480x3\n       315 @ 50.38\n    03 image_0003 640x480x3\n       315 @ 37.04\n    04 image_0004 640x480x3\n       315 @ 27.28\n    05 image_0005 640x480x3\n       315 @ 46.95\n    06 image_0006 640x480x3\n       315 @ 41.22\n    07 
image_0007 640x480x3\n       315 @ 52.79\n    08 image_0008 640x480x3\n       315 @ 53.13\n    09 image_0009 640x480x3\n       315 @ 47.75\n    runtime: 0 seconds\n        fps: 13.13\n\nSee the [inference-onnx](https://github.com/parlaynu/inference-onnx) project for details on the camera server.\n\n## The Platforms\n\nUnder the `platforms` directory, there is a directory for each platform supported. This project builds a single \ncontainer that can be used by all the tools. \n\nIn the platform directory are the tools to build the conversion container and launch it.\n\nUse the `build.sh` script to build the container. This does everything automatically including downloading the \nTVM source code and compiling it and building and installing the python package. This takes some time on the\nJetsonNano and RaspberryPi4 platforms.\n\n    ./build.sh\n\nUse the `run-latest.sh` script to launch the container with the correct parameters:\n\n    $ ./run-latest.sh\n    \n    root@eximius:/workspace# ls\n    inference  models  onnx2trt\n\nThe `models` directory is mounted from the host system from ${HOME}/Workspace/models. Place any models you want to convert\ninto this directory so they are accessible from this container.\n\n## References\n\n* https://docs.nvidia.com/deeplearning/tensorrt/api/python_api/index.html\n* https://documen.tician.de/pycuda/\n\n\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fparlaynu%2Finference-tensorrt","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fparlaynu%2Finference-tensorrt","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fparlaynu%2Finference-tensorrt/lists"}