{"id":14958999,"url":"https://github.com/nvidia-ai-iot/tf_to_trt_image_classification","last_synced_at":"2025-04-05T15:08:56.015Z","repository":{"id":96517844,"uuid":"123506383","full_name":"NVIDIA-AI-IOT/tf_to_trt_image_classification","owner":"NVIDIA-AI-IOT","description":"Image classification with NVIDIA TensorRT from TensorFlow models.","archived":false,"fork":false,"pushed_at":"2020-11-10T02:45:21.000Z","size":61,"stargazers_count":455,"open_issues_count":34,"forks_count":154,"subscribers_count":35,"default_branch":"master","last_synced_at":"2025-04-05T15:08:49.513Z","etag":null,"topics":["benchmark","jetson-tx2","tensorflow","tensorflow-models","tensorrt"],"latest_commit_sha":null,"homepage":null,"language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"bsd-3-clause","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/NVIDIA-AI-IOT.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE.md","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2018-03-01T23:52:57.000Z","updated_at":"2025-03-18T02:17:57.000Z","dependencies_parsed_at":null,"dependency_job_id":"37b9be32-333d-4f63-9981-8287898e0626","html_url":"https://github.com/NVIDIA-AI-IOT/tf_to_trt_image_classification","commit_stats":null,"previous_names":[],"tags_count":0,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/NVIDIA-AI-IOT%2Ftf_to_trt_image_classification","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/NVIDIA-AI-IOT%2Ftf_to_trt_image_classification/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/NVIDIA-AI-IOT%2Ftf_to_trt_image_classification/re
leases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/NVIDIA-AI-IOT%2Ftf_to_trt_image_classification/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/NVIDIA-AI-IOT","download_url":"https://codeload.github.com/NVIDIA-AI-IOT/tf_to_trt_image_classification/tar.gz/refs/heads/master","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":247353746,"owners_count":20925329,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["benchmark","jetson-tx2","tensorflow","tensorflow-models","tensorrt"],"created_at":"2024-09-24T13:18:40.339Z","updated_at":"2025-04-05T15:08:55.975Z","avatar_url":"https://github.com/NVIDIA-AI-IOT.png","language":"Python","funding_links":[],"categories":[],"sub_categories":[],"readme":"TensorFlow-\u003eTensorRT Image Classification\n===\n\n\u003cp align=\"center\"\u003e\n\u003cimg src=\"data/landing_graphic.jpg\" alt=\"landing graphic\" height=\"150px\"/\u003e\n\u003c/p\u003e\n\nThis contains examples, scripts and code related to image classification using TensorFlow models\n(from [here](https://github.com/tensorflow/models/tree/master/research/slim#Pretrained))\nconverted to TensorRT.  
Converting TensorFlow models to TensorRT offers significant performance\ngains on the Jetson TX2 as seen [below](#models).\n\n* [Models](#models)\n* [Setup](#install)\n* [Download models and create frozen graphs](#download)\n* [Convert frozen graph to TensorRT engine](#convert)\n* [Execute TensorRT engine](#execute)\n* [Benchmark all models](#benchmark)\n\n\u003ca name=\"models\"\u003e\u003c/a\u003e\n## Models\n\nThe table below shows various details related to pretrained models ported from the TensorFlow \nslim model zoo.  \n\n| \u003csub\u003eModel\u003c/sub\u003e | \u003csub\u003eInput Size\u003c/sub\u003e | \u003csub\u003eTensorRT (TX2 / Half)\u003c/sub\u003e | \u003csub\u003eTensorRT (TX2 / Float)\u003c/sub\u003e | \u003csub\u003eTensorFlow (TX2 / Float)\u003c/sub\u003e | \u003csub\u003eInput Name\u003c/sub\u003e | \u003csub\u003eOutput Name\u003c/sub\u003e | \u003csub\u003ePreprocessing Fn.\u003c/sub\u003e |\n|--- |:---:|:---:|:---:|:---:|---|---|---|\n| \u003csub\u003einception_v1\u003c/sub\u003e | \u003csub\u003e224x224\u003c/sub\u003e | \u003csub\u003e7.98ms\u003c/sub\u003e | \u003csub\u003e12.8ms\u003c/sub\u003e | \u003csub\u003e27.6ms\u003c/sub\u003e | \u003csub\u003einput\u003c/sub\u003e | \u003csub\u003eInceptionV1/Logits/SpatialSqueeze\u003c/sub\u003e | \u003csub\u003einception\u003c/sub\u003e |\n| \u003csub\u003einception_v3\u003c/sub\u003e | \u003csub\u003e299x299\u003c/sub\u003e | \u003csub\u003e26.3ms\u003c/sub\u003e | \u003csub\u003e46.1ms\u003c/sub\u003e | \u003csub\u003e98.4ms\u003c/sub\u003e | \u003csub\u003einput\u003c/sub\u003e | \u003csub\u003eInceptionV3/Logits/SpatialSqueeze\u003c/sub\u003e | \u003csub\u003einception\u003c/sub\u003e |\n| \u003csub\u003einception_v4\u003c/sub\u003e | \u003csub\u003e299x299\u003c/sub\u003e | \u003csub\u003e52.1ms\u003c/sub\u003e | \u003csub\u003e88.2ms\u003c/sub\u003e | \u003csub\u003e176ms\u003c/sub\u003e | \u003csub\u003einput\u003c/sub\u003e | 
\u003csub\u003eInceptionV4/Logits/Logits/BiasAdd\u003c/sub\u003e | \u003csub\u003einception\u003c/sub\u003e |\n| \u003csub\u003einception_resnet_v2\u003c/sub\u003e | \u003csub\u003e299x299\u003c/sub\u003e | \u003csub\u003e53.0ms\u003c/sub\u003e | \u003csub\u003e98.7ms\u003c/sub\u003e | \u003csub\u003e168ms\u003c/sub\u003e | \u003csub\u003einput\u003c/sub\u003e | \u003csub\u003eInceptionResnetV2/Logits/Logits/BiasAdd\u003c/sub\u003e | \u003csub\u003einception\u003c/sub\u003e |\n| \u003csub\u003eresnet_v1_50\u003c/sub\u003e | \u003csub\u003e224x224\u003c/sub\u003e | \u003csub\u003e15.7ms\u003c/sub\u003e | \u003csub\u003e27.1ms\u003c/sub\u003e | \u003csub\u003e63.9ms\u003c/sub\u003e | \u003csub\u003einput\u003c/sub\u003e | \u003csub\u003eresnet_v1_50/SpatialSqueeze\u003c/sub\u003e | \u003csub\u003evgg\u003c/sub\u003e |\n| \u003csub\u003eresnet_v1_101\u003c/sub\u003e | \u003csub\u003e224x224\u003c/sub\u003e | \u003csub\u003e29.9ms\u003c/sub\u003e | \u003csub\u003e51.8ms\u003c/sub\u003e | \u003csub\u003e107ms\u003c/sub\u003e | \u003csub\u003einput\u003c/sub\u003e | \u003csub\u003eresnet_v1_101/SpatialSqueeze\u003c/sub\u003e | \u003csub\u003evgg\u003c/sub\u003e |\n| \u003csub\u003eresnet_v1_152\u003c/sub\u003e | \u003csub\u003e224x224\u003c/sub\u003e | \u003csub\u003e42.6ms\u003c/sub\u003e | \u003csub\u003e78.2ms\u003c/sub\u003e | \u003csub\u003e157ms\u003c/sub\u003e | \u003csub\u003einput\u003c/sub\u003e | \u003csub\u003eresnet_v1_152/SpatialSqueeze\u003c/sub\u003e | \u003csub\u003evgg\u003c/sub\u003e |\n| \u003csub\u003eresnet_v2_50\u003c/sub\u003e | \u003csub\u003e299x299\u003c/sub\u003e | \u003csub\u003e27.5ms\u003c/sub\u003e | \u003csub\u003e44.4ms\u003c/sub\u003e | \u003csub\u003e92.2ms\u003c/sub\u003e | \u003csub\u003einput\u003c/sub\u003e | \u003csub\u003eresnet_v2_50/SpatialSqueeze\u003c/sub\u003e | \u003csub\u003einception\u003c/sub\u003e |\n| \u003csub\u003eresnet_v2_101\u003c/sub\u003e | \u003csub\u003e299x299\u003c/sub\u003e | 
\u003csub\u003e49.2ms\u003c/sub\u003e | \u003csub\u003e83.1ms\u003c/sub\u003e | \u003csub\u003e160ms\u003c/sub\u003e | \u003csub\u003einput\u003c/sub\u003e | \u003csub\u003eresnet_v2_101/SpatialSqueeze\u003c/sub\u003e | \u003csub\u003einception\u003c/sub\u003e |\n| \u003csub\u003eresnet_v2_152\u003c/sub\u003e | \u003csub\u003e299x299\u003c/sub\u003e | \u003csub\u003e74.6ms\u003c/sub\u003e | \u003csub\u003e124ms\u003c/sub\u003e | \u003csub\u003e230ms\u003c/sub\u003e | \u003csub\u003einput\u003c/sub\u003e | \u003csub\u003eresnet_v2_152/SpatialSqueeze\u003c/sub\u003e | \u003csub\u003einception\u003c/sub\u003e |\n| \u003csub\u003emobilenet_v1_0p25_128\u003c/sub\u003e | \u003csub\u003e128x128\u003c/sub\u003e | \u003csub\u003e2.67ms\u003c/sub\u003e | \u003csub\u003e2.65ms\u003c/sub\u003e | \u003csub\u003e15.7ms\u003c/sub\u003e | \u003csub\u003einput\u003c/sub\u003e | \u003csub\u003eMobilenetV1/Logits/SpatialSqueeze\u003c/sub\u003e | \u003csub\u003einception\u003c/sub\u003e |\n| \u003csub\u003emobilenet_v1_0p5_160\u003c/sub\u003e | \u003csub\u003e160x160\u003c/sub\u003e | \u003csub\u003e3.95ms\u003c/sub\u003e | \u003csub\u003e4.00ms\u003c/sub\u003e | \u003csub\u003e16.9ms\u003c/sub\u003e | \u003csub\u003einput\u003c/sub\u003e | \u003csub\u003eMobilenetV1/Logits/SpatialSqueeze\u003c/sub\u003e | \u003csub\u003einception\u003c/sub\u003e |\n| \u003csub\u003emobilenet_v1_1p0_224\u003c/sub\u003e | \u003csub\u003e224x224\u003c/sub\u003e | \u003csub\u003e12.9ms\u003c/sub\u003e | \u003csub\u003e12.9ms\u003c/sub\u003e | \u003csub\u003e24.4ms\u003c/sub\u003e | \u003csub\u003einput\u003c/sub\u003e | \u003csub\u003eMobilenetV1/Logits/SpatialSqueeze\u003c/sub\u003e | \u003csub\u003einception\u003c/sub\u003e |\n| \u003csub\u003evgg_16\u003c/sub\u003e | \u003csub\u003e224x224\u003c/sub\u003e | \u003csub\u003e38.2ms\u003c/sub\u003e | \u003csub\u003e79.2ms\u003c/sub\u003e | \u003csub\u003e171ms\u003c/sub\u003e | \u003csub\u003einput\u003c/sub\u003e | 
\u003csub\u003evgg_16/fc8/BiasAdd\u003c/sub\u003e | \u003csub\u003evgg\u003c/sub\u003e |\n\n\u003c!--| inception_v2 | 224x224 | 10.3ms | 16.9ms | 38.3ms | input | InceptionV2/Logits/SpatialSqueeze | inception |--\u003e\n\u003c!--| vgg_19 | 224x224 | 97.3ms | OOM | input | vgg_19/fc8/BiasAdd | vgg |--\u003e\n\n\nThe times recorded include data transfer to GPU, network execution, and\ndata transfer back from GPU.  Time does not include preprocessing. \nSee [scripts/test_tf.py](scripts/test_tf.py), [scripts/test_trt.py](scripts/test_trt.py), and [src/test/test_trt.cu](src/test/test_trt.cu) \nfor implementation details. \n\n\u003ca name=\"install\"\u003e\u003c/a\u003e\n## Setup\n\n1. Flash the Jetson TX2 using JetPack 3.2.  Be sure to install\n   * CUDA 9.0\n   * OpenCV4Tegra\n   * cuDNN\n   * TensorRT 3.0\n\n2. Install pip on Jetson TX2.\n    ```\n    sudo apt-get install python-pip\n    ```\n\n3. Install TensorFlow on Jetson TX2.\n   1. Download the TensorFlow 1.5.0 pip wheel from [here](https://drive.google.com/open?id=1ZYUJqcFdJytdMCQ5bVDtb3KoTqc_cugG).  This build of TensorFlow is provided as a convenience for the purposes of this project.\n   2. Install TensorFlow using pip\n  \n            sudo pip install tensorflow-1.5.0rc0-cp27-cp27mu-linux_aarch64.whl\n\n4. Install uff exporter on Jetson TX2.\n   1. Download TensorRT 3.0.4 for Ubuntu 16.04 and CUDA 9.0 tar package from https://developer.nvidia.com/nvidia-tensorrt-download.\n   2. Extract archive \n\n            tar -xzf TensorRT-3.0.4.Ubuntu-16.04.3.x86_64.cuda-9.0.cudnn7.0.tar.gz\n\n   3. Install uff python package using pip \n\n            sudo pip install TensorRT-3.0.4/uff/uff-0.2.0-py2.py3-none-any.whl\n\n5. 
Clone and build this project\n\n    ```\n    git clone --recursive https://github.com/NVIDIA-Jetson/tf_to_trt_image_classification.git\n    cd tf_to_trt_image_classification\n    mkdir build\n    cd build\n    cmake ..\n    make \n    cd ..\n    ```\n\n\u003ca name=\"download\"\u003e\u003c/a\u003e\n## Download models and create frozen graphs\n\nRun the following bash script to download all of the pretrained models. \n\n```\nsource scripts/download_models.sh\n``` \n\nIf there are any models you don't want to use, simply remove the URL from the model list in [scripts/download_models.sh](scripts/download_models.sh).  \nNext, because the TensorFlow models are provided in checkpoint format, we must convert them to frozen graphs for optimization with TensorRT.  Run the [scripts/models_to_frozen_graphs.py](scripts/models_to_frozen_graphs.py) script.  \n\n```\npython scripts/models_to_frozen_graphs.py\n```\n\nIf you removed any models in the previous step, you must add ``'exclude': True`` to the corresponding item in the [NETS](scripts/model_meta.py#L67) dictionary located in [scripts/model_meta.py](scripts/model_meta.py).  If you are following the instructions for executing engines below, you will also need some sample images.  Run the following script to download a few images from ImageNet.\n\n```\nsource scripts/download_images.sh\n```\n\n\u003ca name=\"convert\"\u003e\u003c/a\u003e\n## Convert frozen graph to TensorRT engine\n\nRun the [scripts/convert_plan.py](scripts/convert_plan.py) script from the root directory of the project, referencing the [models table](#models) for relevant parameters.  For example, to convert the Inception V1 model, run the following:\n\n```\npython scripts/convert_plan.py data/frozen_graphs/inception_v1.pb data/plans/inception_v1.plan input 224 224 InceptionV1/Logits/SpatialSqueeze 1 0 float\n```\n\nThe inputs to the convert_plan.py script are:\n\n1. frozen graph path\n2. output plan path\n3. input node name\n4. input height\n5. input width\n6. 
output node name\n7. max batch size\n8. max workspace size\n9. data type (float or half)\n\nThis script assumes models with a single image input and a single output, and may not work out of the box for models other than those in the table above.\n\n\u003ca name=\"execute\"\u003e\u003c/a\u003e\n## Execute TensorRT engine\n\nCall the [examples/classify_image](examples/classify_image) program from the root directory of the project, referencing the [models table](#models) for relevant parameters.  For example, to run the Inception V1 model converted as above:\n\n```\n./build/examples/classify_image/classify_image data/images/gordon_setter.jpg data/plans/inception_v1.plan data/imagenet_labels_1001.txt input InceptionV1/Logits/SpatialSqueeze inception\n```\n\nFor reference, the inputs to the example program are:\n\n1. input image path\n2. plan file path\n3. labels file (one label per line; the line number corresponds to the index in the output)\n4. input node name\n5. output node name\n6. preprocessing function (either vgg or inception)\n\nWe provide two image label files in the [data folder](data/).  Some of the TensorFlow models were trained with an additional \"background\" class, causing the model to have 1001 outputs instead of 1000.  To determine the number of outputs for each model, reference the [NETS](scripts/model_meta.py#L67) variable in [scripts/model_meta.py](scripts/model_meta.py).\n\n\u003ca name=\"benchmark\"\u003e\u003c/a\u003e\n## Benchmark all models\n\nTo benchmark all of the models, first convert all of the models that you [downloaded above](#download) into TensorRT engines.  
Run the following script to convert all models:\n\n```\npython scripts/frozen_graphs_to_plans.py\n```\n\nIf you want to change parameters related to TensorRT optimization, edit the [scripts/frozen_graphs_to_plans.py](scripts/frozen_graphs_to_plans.py) file.\nNext, to benchmark all of the models, run the [scripts/test_trt.py](scripts/test_trt.py) script:\n\n```\npython scripts/test_trt.py\n```\n\nOnce finished, the timing results will be stored at **data/test_output_trt.txt**.\nIf you also want to benchmark the TensorFlow models, run:\n\n```\npython scripts/test_tf.py\n```\n\nThe results will be stored at **data/test_output_tf.txt**.  This benchmarking script loads an example image as input, so make sure you have downloaded the sample images as described [above](#download).\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fnvidia-ai-iot%2Ftf_to_trt_image_classification","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fnvidia-ai-iot%2Ftf_to_trt_image_classification","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fnvidia-ai-iot%2Ftf_to_trt_image_classification/lists"}