# TPAT - TensorRT Plugin Autogen Tool

## Introduction
1. Automatically generates high-performance TensorRT plugins for unsupported operators, or to replace inefficient kernels.
2. End-to-end command-line tool. No CUDA programming knowledge is required: users only provide the ONNX model and the node names or types, and the TensorRT plugin is generated automatically.
3. The performance of auto-generated TensorRT plugins in real cases:
    * [Performance comparison with hand-written kernels](/docs/Compare_handwritten.md)
    * [Optimization of TensorRT's original kernels](/docs/Optimize_TensorRT.md)

## Support Matrix
* [ONNX operators supported by TPAT-1.0](/docs/Operators.md)

## Runtime Env: Dockerfile
### 1. Build the image
```
nvidia-docker build .
```
### 2. Run a container
```
nvidia-docker run -itd --gpus all -v <TPAT path dir>:/root <Image_ID> /bin/bash
```
### 3. Enter the container
```
nvidia-docker exec -it <Container_ID> /bin/bash
```
### 4. Modify CUDA_PATH and TRT_LIB_PATH in **python/trt_plugin/Makefile**
```
CUDA_PATH: local CUDA installation path
TRT_LIB_PATH: local TensorRT installation path
```
### 5. Generate a plugin
```
cd examples
python test_onehot_dynamic_direct.py
```
* The generated tpat_onehot.so is stored in **python/trt_plugin/lib/**


## Runtime Env: Build
### 1. Prerequisites
#### System packages
* LLVM >= 9.0.1 (LLVM 9.0.1 recommended)
* GCC >= 7.3.0 (GCC 7.4.0 recommended)
* [TensorRT](https://docs.nvidia.com/deeplearning/tensorrt/install-guide/index.html)

#### PyPI packages
* numpy pycuda onnx onnxruntime onnx_graphsurgeon xgboost jinja2 ctypes tornado cloudpickle psutil
> NOTE: these required packages are listed in requirements.txt

#### Optional packages
* tensorflow-gpu==1.15
* tf2onnx
* torch
* pytest
> NOTE: these optional packages are required only by the examples and unit tests

### 2. Clone the TPAT repository
```
git clone https://github.com/Tencent/TPAT.git
cd TPAT
git submodule update --init --recursive
```
### 3. Build BlazerML-TVM
```
mkdir build && cp cmake/config.cmake build
# Edit build/config.cmake to customize the compilation options
set(USE_LLVM /usr/local/llvm/bin/llvm-config)
set(USE_CUDA ON)
# A gcc compiler with C++14 support is required
cd build && cmake ..
make -j
# TVM Python package
export TVM_HOME=/path/to/tvm
export PYTHONPATH=$TVM_HOME/python:${PYTHONPATH}
```
### 4. Plugin compiler env
Modify python/trt_plugin/Makefile according to your environment setup.
```
CUDA_PATH: local CUDA installation path
TRT_LIB_PATH: local TensorRT installation path
```

## Usage
TPAT can be used via a Python function or the command line.

### Python function
```
onnx2plugin(
	input_model_path,
	output_model_path,
	node_names=None,
	node_types=None,
	plugin_name_dict=None,
	dynamic_bs=False,  # if True, the generated operator supports dynamic batch size
	min_bs=1,
	max_bs=256,
	opt_bs=128
	)
```

* input_model_path[*required*]: input ONNX model containing the nodes that require a TRT plugin
* output_model_path[*required*]: output ONNX model in which the corresponding node types are replaced by plugin names. The output model can be converted to TRT directly with the ONNX parser and the built plugin dynamic library.
* node_names: list of node names for autogen
* node_types: list of node types for autogen
* plugin_name_dict: dict of {plugin_name: node_name} for autogen
* dynamic_bs: if True, TPAT generates a plugin that supports dynamic batch size; if False, the generated plugin only supports fixed shapes but has better performance
* min_bs: the minimum batch size in the dynamic-batch range
* max_bs: the maximum batch size in the dynamic-batch range
* opt_bs: the optimal batch size in the dynamic-batch range
> NOTE: at least one of node_names, node_types, plugin_name_dict must be provided

### Command line
```
# Separate different ops with spaces
python3 Onnx2Plugin.py -i input.onnx -o output.onnx -n op_name1 op_name2 -dynamic=true -min=1 -max=512 -opt=256
python3 Onnx2Plugin.py -i input.onnx -o output.onnx -t op_type1 op_type2 -dynamic=false
python3 Onnx2Plugin.py -i input.onnx -o output.onnx -p '{"op_name1": "plugin_name1", "op_name2": "plugin_name2"}'
```
* -i[*required*]: input_model_path
* -o[*required*]: output_model_path
* -n: node_names
* -t: node_types
* -p: plugin_name_dict
* -dynamic: dynamic_bs
* -min: min_bs
* -max: max_bs
* -opt: opt_bs

### Output
#### 1. Nodes and plugin names assigned through plugin_name_dict
* trt_plugin/src contains {plugin_name}.cu and {plugin_name}.h
* trt_plugin/lib contains {plugin_name}.so

#### 2. Node names or node types assigned
* trt_plugin/src contains tpat_{node_name}.cu and tpat_{node_name}.h
* trt_plugin/lib contains tpat_{node_name}.so

## Example && UnitTest
* Example: [example_tensorflow.py](/examples/gpu/example_tensorflow.py)
* UnitTest: [test_tpat.py](/tests/python/unittests/gpu/test_tpat.py)

## Release notes
### Changelog
* Support multiple nodes for autogen
* Support boolean inputs/outputs
* Generated plugins can be reused

### Known issues
* **Only dynamic batch size is supported** (no other dynamic dimensions)
* Operators with int8/float16/double inputs/outputs are not supported

### TODO
* Support ONNX subgraphs for autogen
* Support direct conversion from TensorFlow and PyTorch
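The NOTE under the Python function describes a precondition (at least one of node_names, node_types, plugin_name_dict must be given), and the dynamic-batch parameters imply the ordering min_bs <= opt_bs <= max_bs. A minimal sketch of that contract; `validate_autogen_args` is a hypothetical helper for illustration, not part of TPAT's API:

```python
def validate_autogen_args(node_names=None, node_types=None, plugin_name_dict=None,
                          dynamic_bs=False, min_bs=1, max_bs=256, opt_bs=128):
    """Check the documented preconditions before an onnx2plugin-style call."""
    # At least one way of selecting target nodes must be provided.
    if not (node_names or node_types or plugin_name_dict):
        raise ValueError("provide at least one of node_names, node_types, plugin_name_dict")
    # The dynamic-batch range only matters when dynamic_bs is enabled.
    if dynamic_bs and not (1 <= min_bs <= opt_bs <= max_bs):
        raise ValueError("dynamic batch range must satisfy 1 <= min_bs <= opt_bs <= max_bs")
```

Calling it with no node selection raises immediately, which mirrors the behavior a user should expect from the real entry point.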
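The command-line flags map one-to-one onto the Python parameters. As a hedged illustration, a small helper (hypothetical, not shipped with TPAT) that assembles an Onnx2Plugin.py invocation, including the JSON argument for -p:

```python
import json

def build_onnx2plugin_cmd(input_path, output_path, node_names=None, node_types=None,
                          plugin_name_dict=None, dynamic_bs=None,
                          min_bs=None, max_bs=None, opt_bs=None):
    """Assemble an Onnx2Plugin.py command line matching the documented flags."""
    cmd = ["python3", "Onnx2Plugin.py", "-i", input_path, "-o", output_path]
    if node_names:
        cmd += ["-n", *node_names]          # separate different ops with spaces
    if node_types:
        cmd += ["-t", *node_types]
    if plugin_name_dict:
        cmd += ["-p", json.dumps(plugin_name_dict)]  # -p takes a JSON mapping
    if dynamic_bs is not None:
        cmd.append(f"-dynamic={'true' if dynamic_bs else 'false'}")
    for flag, val in (("-min", min_bs), ("-max", max_bs), ("-opt", opt_bs)):
        if val is not None:
            cmd.append(f"{flag}={val}")
    return cmd
```

The returned list can be handed directly to `subprocess.run` so that node names and the JSON mapping need no shell quoting.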
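The Output section's naming convention can be expressed as a lookup. `expected_artifacts` below is a hypothetical helper returning the documented file locations; note the README's CLI example passes plugin_name_dict as {node_name: plugin_name}, and this sketch follows that orientation, so the generated files are named after the dict's values:

```python
def expected_artifacts(node_names=None, plugin_name_dict=None, root="python/trt_plugin"):
    """Map each autogen target to the files TPAT is documented to emit."""
    if plugin_name_dict:
        # Assumption: {node_name: plugin_name}, per the CLI example; files use the plugin name.
        stems = list(plugin_name_dict.values())
    else:
        # Without explicit plugin names, files are named tpat_{node_name}.
        stems = [f"tpat_{name}" for name in (node_names or [])]
    return {
        stem: {
            "src": [f"{root}/src/{stem}.cu", f"{root}/src/{stem}.h"],
            "lib": f"{root}/lib/{stem}.so",
        }
        for stem in stems
    }
```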