# TensorRT C++
A simple program that uses the **[NVIDIA TensorRT](https://developer.nvidia.com/tensorrt-getting-started)** SDK for high-performance deep learning inference, written in C++

## Features
- **Caption**
    - Generate a caption of the image using Booru tags
- **Upscale**
    - Upscale the image using a super-resolution model
- *more coming soon...?*

## Getting Started
*(for Windows)*

#### Requirements
0. Nvidia **RTX** GPU
1. [TensorRT 10.0 SDK](https://developer.nvidia.com/tensorrt/download)
    > An Nvidia Developer account is needed
2. [CUDA Toolkit](https://developer.nvidia.com/cuda-toolkit-archive)
    > Be sure to download the release specified by your TensorRT version
3. [OpenCV 4.10.0](https://github.com/opencv/opencv/releases/tag/4.10.0)
    > It needs to be exactly this version, unless you're planning to build from source

> It is recommended to add the OpenCV `bin` folder to your system **PATH**; otherwise, you have to manually place `opencv_world4100.dll` next to the `.exe`. The TensorRT and CUDA Toolkit `bin` folders should already be added to **PATH** during installation.

#### Models
> For optional arguments during engine conversion, refer to the [trtexec](#trtexec) section

- **Caption**:
    1. Go to [SmilingWolf](https://huggingface.co/SmilingWolf)'s HuggingFace
    2. Select a tagger model of choice
        > This program was built and tested on [WD SwinV2 Tagger v3](https://huggingface.co/SmilingWolf/wd-swinv2-tagger-v3)
    3. Download **both** the `.onnx` and the `.csv` files
    4. Convert the `.onnx` model to a `.trt` engine
        - <ins><b>Example</b></ins>
            ```bash
            trtexec --onnx=model.onnx --saveEngine=model.trt --fp16
            ```
    5. Modify the `config.json` file accordingly *(see below)*

- **Upscale**:
    1. Go to [OpenModelDB](https://openmodeldb.info/)
    2. Expand the `Advanced tag selector`, and filter the **Platform** to `ONNX` format
    3. Download a model of choice
        > This program was built and tested on [4x-Nomos8kDAT](https://openmodeldb.info/models/4x-Nomos8kDAT)
    4. Convert the `.onnx` model to a `.trt` engine
        - <ins><b>Example</b></ins>
            ```bash
            trtexec --onnx=4xNomos8kDAT.onnx --saveEngine=4xNomos8kDAT.trt --shapes=input:1x3x128x128 --inputIOFormats=fp32:chw --outputIOFormats=fp32:chw
            ```
    5. Modify the `config.json` file accordingly *(see below)*

#### Configs
> Inside the `config.json` file, you need to have the following fields:

- <ins><b>Required</b></ins>
    - **deviceID:** The ID of the CUDA device
        > Should be `0` if you only have one GPU
    - **mode:** `"caption"` or `"upscale"`
    - **modelPath:** The path to the `.trt` engine
        > Use an **absolute** path so it supports drag & drop
    - **inputResolution:** Should be `448` for most tagger models; `64` or `128` for most upscale models
    - **fp16:** Enable to use half-precision **I/O**

- <ins><b>Caption</b></ins>
    - **tagsPath:** The path to the `.csv` tags spreadsheet
        > Use an **absolute** path so it supports drag & drop
    - **threshold:** The score needed for a tag to be included

- <ins><b>Upscale</b></ins>
    - **overlap:** The overlap between each tile
        > This is to prevent seams
    - **upscaleRatio:** The `Scale` of your upscale model

## Deployment
If you simply want to run the program:

1. Download the built `.exe` from [Releases](https://github.com/Haoming02/TensorRT-Cpp/releases)
2. Place the `config.json` next to the `.exe`
3. Launch the `.exe`

## Development
If you want to build from source:

0. Install [Visual Studio](https://visualstudio.microsoft.com/downloads/) with the **C++** module
1. `git clone` this repo
2. Open the `.vcxproj` project
3. Modify the `CUDA.props` to point to the correct paths
    - TensorRT
    - CUDA Toolkit
    - OpenCV
4. Download the [Json for C++](https://github.com/nlohmann/json/releases) package, and add the single-file `json.hpp`
5. Download the [CSV for C++](https://github.com/d99kris/rapidcsv/releases) package, and add the single-file `rapidcsv.h`
6. Configure the solution to `Release` *(instead of `Debug`)*
7. Build

> For other OSes, you will need to modify `path_util.cpp` to use a platform-specific implementation

## Command-Line Arguments
The program can take two arguments:

- The first is the path to an image, or the path to a folder of images, which means you can simply drag and drop onto the `.exe` to process. If empty, it will ask for a path instead.

- The second is the path to the config, allowing you to easily switch between different models and modes. If empty, it defaults to `config.json` in the same folder as the `.exe`.

## Benchmark
Running `4xNomos8kDAT` at `fp32`, with an input size of `128` and an overlap of `16`, on an **RTX 3060**:

- Upscaling a `512x512` image:
    - Using [ComfyUI](https://github.com/comfyanonymous/ComfyUI): ~11.6s
    - Using [Forge](https://github.com/lllyasviel/stable-diffusion-webui-forge): ~12.8s
    - Using **TensorRT**: ~6.2s

- Upscaling a `1024x1024` image:
    - Using [ComfyUI](https://github.com/comfyanonymous/ComfyUI): ~36.5s
    - Using [Forge](https://github.com/lllyasviel/stable-diffusion-webui-forge): ~36.9s
    - Using **TensorRT**: ~19.24s

## Roadmap
- [X] Upgrade to TensorRT 10
- [X] Upgrade to OpenCV 4.10.0
- [X] Seamless Tiling
- [X] Support Folder Processing
- [X] Support Half Precision I/O
- [ ] Support Batch Size

<hr>

## trtexec

> Extract the `trtexec.exe` from the downloaded TensorRT `.zip`

<details>
<summary>Parameters</summary>

- **--onnx**: Path to the model to convert
- **--saveEngine**: Path to save the converted engine

<ins>Optional</ins>

- **--shapes**: The shape of the model's input
    > This is only needed for models with dynamic inputs *(**i.e.** the upscale models)*
    - The first number is the batch size
        > This program currently only supports `1`
    - The second number is the channel count
        > This program currently only supports `3` (RGB)
    - The third and fourth numbers are the input dimensions of your model
        > Refer to the model page

- **--inputIOFormats:** Specify the precision of the inputs and the channel order

    > **upscale** mode supports `fp32` and `fp16` I/O; **caption** mode only supports `fp32` I/O

    > Most upscale models are `chw`; the tagger models are `hwc`

- **--outputIOFormats:** Same as above

<ins>Precision</ins>

> Specify the precision to store the engine weights in

- **(default):** When omitted, defaults to `fp32` full precision
    > Largest in size; slowest in performance

- **--bf16:** More advanced half precision
    > Second largest in size; similar performance to `fp32`

    > Requires an RTX **30** series or newer GPU

- **--fp16:** Half precision
    > Almost half the size; almost double the performance

    > Some models may not work properly *(**e.g.** the `DAT` upscale models do not work in `fp16`)*

- **--best:** Let `trtexec` determine the precision to use for each layer, including `fp8`
    > May cause inaccuracy *(**e.g.** generating artifacts for upscale models)*

> **I/O** precision and **Weight** precision are independent

</details>
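Putting the Configs fields together, a complete `config.json` for **upscale** mode might look like the following sketch; the path and values here are placeholders, not shipped defaults, so adjust them to your own engine and model:

```json
{
    "deviceID": 0,
    "mode": "upscale",
    "modelPath": "C:\\Models\\4xNomos8kDAT.trt",
    "inputResolution": 128,
    "fp16": false,
    "overlap": 16,
    "upscaleRatio": 4
}
```

For **caption** mode, set `mode` to `"caption"` and replace the upscale-specific fields (`overlap`, `upscaleRatio`) with `tagsPath` and `threshold`.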