{"id":17980911,"url":"https://github.com/nvidia/retinanet-examples","last_synced_at":"2025-04-12T14:56:14.823Z","repository":{"id":47912689,"uuid":"175702160","full_name":"NVIDIA/retinanet-examples","owner":"NVIDIA","description":"Fast and accurate object detection with end-to-end GPU optimization","archived":false,"fork":false,"pushed_at":"2021-09-29T02:23:14.000Z","size":432,"stargazers_count":893,"open_issues_count":32,"forks_count":268,"subscribers_count":43,"default_branch":"main","last_synced_at":"2025-04-03T14:11:02.869Z","etag":null,"topics":["deep-learning","neural-network","object-detection","python","pytorch","retinanet","tensorrt"],"latest_commit_sha":null,"homepage":"","language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"bsd-3-clause","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/NVIDIA.png","metadata":{"files":{"readme":"README.md","changelog":"CHANGELOG.md","contributing":"CONTRIBUTING.md","funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null}},"created_at":"2019-03-14T21:26:20.000Z","updated_at":"2025-04-01T03:36:59.000Z","dependencies_parsed_at":"2022-08-25T04:20:48.389Z","dependency_job_id":null,"html_url":"https://github.com/NVIDIA/retinanet-examples","commit_stats":null,"previous_names":[],"tags_count":3,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/NVIDIA%2Fretinanet-examples","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/NVIDIA%2Fretinanet-examples/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/NVIDIA%2Fretinanet-examples/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/NVIDIA%2Fretinanet-examples/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/NVIDIA","download_url":"https://codeload.github.com/NVIDIA/retinanet-examples/tar.gz/refs/heads/main","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":248586244,"owners_count":21128996,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["deep-learning","neural-network","object-detection","python","pytorch","retinanet","tensorrt"],"created_at":"2024-10-29T18:06:52.731Z","updated_at":"2025-04-12T14:56:14.798Z","avatar_url":"https://github.com/NVIDIA.png","language":"Python","readme":"# NVIDIA Object Detection Toolkit (ODTK)\n\n**Fast** and **accurate** single stage object detection with end-to-end GPU optimization.\n\n## Description\n\nODTK is a single shot object detector with various backbones and detection heads. 
## Performance

The detection pipeline allows the user to select a specific backbone depending on the latency-accuracy trade-off preferred.

ODTK **RetinaNet** model accuracy and inference latency & FPS (frames per second) for [COCO 2017](http://cocodataset.org/#detection-2017) (train/val) after a full training schedule. Inference results include bounding box post-processing for a batch size of 1. Inference measured at `--resize 800` using `--with-dali` on an FP16 TensorRT engine.

Backbone | mAP @[IoU=0.50:0.95] | Training Time on [DGX1v](https://www.nvidia.com/en-us/data-center/dgx-1/) | Inference latency FP16 on [V100](https://www.nvidia.com/en-us/data-center/tesla-v100/) | Inference latency INT8 on [T4](https://www.nvidia.com/en-us/data-center/tesla-t4/) | Inference latency FP16 on [A100](https://www.nvidia.com/en-us/data-center/a100/) | Inference latency INT8 on [A100](https://www.nvidia.com/en-us/data-center/a100/)
--- | :---: | :---: | :---: | :---: | :---: | :---:
[ResNet18FPN](https://github.com/NVIDIA/retinanet-examples/releases/download/19.04/retinanet_rn18fpn.zip) | 0.318 | 5 hrs | 14 ms;<br/>71 FPS | 18 ms;<br/>56 FPS | 9 ms;<br/>110 FPS | 7 ms;<br/>141 FPS
[MobileNetV2FPN](https://github.com/NVIDIA/retinanet-examples/releases/download/v0.2.3/retinanet_mobilenetv2fpn.pth) | 0.333 | | 14 ms;<br/>74 FPS | 18 ms;<br/>56 FPS | 9 ms;<br/>114 FPS | 7 ms;<br/>138 FPS
[ResNet34FPN](https://github.com/NVIDIA/retinanet-examples/releases/download/19.04/retinanet_rn34fpn.zip) | 0.343 | 6 hrs | 16 ms;<br/>64 FPS | 20 ms;<br/>50 FPS | 10 ms;<br/>103 FPS | 7 ms;<br/>142 FPS
[ResNet50FPN](https://github.com/NVIDIA/retinanet-examples/releases/download/19.04/retinanet_rn50fpn.zip) | 0.358 | 7 hrs | 18 ms;<br/>56 FPS | 22 ms;<br/>45 FPS | 11 ms;<br/>93 FPS | 8 ms;<br/>129 FPS
[ResNet101FPN](https://github.com/NVIDIA/retinanet-examples/releases/download/19.04/retinanet_rn101fpn.zip) | 0.376 | 10 hrs | 22 ms;<br/>46 FPS | 27 ms;<br/>37 FPS | 13 ms;<br/>78 FPS | 9 ms;<br/>117 FPS
[ResNet152FPN](https://github.com/NVIDIA/retinanet-examples/releases/download/19.04/retinanet_rn152fpn.zip) | 0.393 | 12 hrs | 26 ms;<br/>38 FPS | 33 ms;<br/>31 FPS | 15 ms;<br/>66 FPS | 10 ms;<br/>103 FPS

## Installation

For best performance, use the latest [PyTorch NGC docker container](https://ngc.nvidia.com/catalog/containers/nvidia:pytorch). Clone this repository, then build and run your own image:

```bash
git clone https://github.com/nvidia/retinanet-examples
docker build -t odtk:latest retinanet-examples/
docker run --gpus all --rm --ipc=host -it odtk:latest
```

## Usage

Training, inference, evaluation and model export can be done through the `odtk` utility.
For more details, including a list of parameters, please refer to the [TRAINING](TRAINING.md) and [INFERENCE](INFERENCE.md) documentation.

### Training

Train a detection model on [COCO 2017](http://cocodataset.org/#download) from a pre-trained backbone:
```bash
odtk train retinanet_rn50fpn.pth --backbone ResNet50FPN \
    --images /coco/images/train2017/ --annotations /coco/annotations/instances_train2017.json \
    --val-images /coco/images/val2017/ --val-annotations /coco/annotations/instances_val2017.json
```

### Fine Tuning

Fine-tune a pre-trained model on your dataset. In the example below we use [Pascal VOC](http://host.robots.ox.ac.uk/pascal/VOC/voc2012/index.html) with [JSON annotations](https://storage.googleapis.com/coco-dataset/external/PASCAL_VOC.zip):
```bash
odtk train model_mydataset.pth --backbone ResNet50FPN \
    --fine-tune retinanet_rn50fpn.pth \
    --classes 20 --iters 10000 --val-iters 1000 --lr 0.0005 \
    --resize 512 --jitter 480 640 --images /voc/JPEGImages/ \
    --annotations /voc/pascal_train2012.json --val-annotations /voc/pascal_val2012.json
```

Note: the shorter side of the input images will be resized to `resize`, as long as the longer side doesn't exceed `max-size`. During training, the images will be randomly resized to a new size within the `jitter` range.
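To make that rule concrete, here is a small illustrative sketch of the size computation (not odtk's actual implementation; the `max_size` default used here is an assumption, see the `--max-size` parameter in [TRAINING](TRAINING.md)):

```python
import random

def training_size(width, height, jitter=(480, 640), max_size=1333):
    """Sketch of the resizing rule above: scale the shorter side to a
    random target within the jitter range, unless that would push the
    longer side past max_size (the default value here is an assumption).
    """
    target = random.randint(*jitter)            # random shorter-side target
    scale = target / min(width, height)
    if max(width, height) * scale > max_size:   # cap by the longer side
        scale = max_size / max(width, height)
    return round(width * scale), round(height * scale)

print(training_size(1920, 1080))  # e.g. (1138, 640) for a 16:9 image
```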
### Inference

Evaluate your detection model on [COCO 2017](http://cocodataset.org/#download):
```bash
odtk infer retinanet_rn50fpn.pth --images /coco/images/val2017/ --annotations /coco/annotations/instances_val2017.json
```

Run inference on [your dataset](#datasets):
```bash
odtk infer retinanet_rn50fpn.pth --images /dataset/val --output detections.json
```

### Optimized Inference with TensorRT

For faster inference, export the detection model to an optimized FP16 TensorRT engine:
```bash
odtk export model.pth engine.plan
```

Evaluate the model with the TensorRT backend on [COCO 2017](http://cocodataset.org/#download):
```bash
odtk infer engine.plan --images /coco/images/val2017/ --annotations /coco/annotations/instances_val2017.json
```

### INT8 Inference with TensorRT

For even faster inference, do INT8 calibration to create an optimized INT8 TensorRT engine:
```bash
odtk export model.pth engine.plan --int8 --calibration-images /coco/images/val2017/
```
This creates an INT8CalibrationTable file that can be reused to build INT8 TensorRT engines for the same model later on, without needing to run calibration again.

Or create an optimized INT8 TensorRT engine using a cached calibration table:
```bash
odtk export model.pth engine.plan --int8 --calibration-table /path/to/INT8CalibrationTable
```

## Datasets

RetinaNet supports annotations in the [COCO JSON format](http://cocodataset.org/#format-data).
When converting the annotations from your own dataset into JSON, the following entries are required:
```
{
    "images": [{
        "id" : int,
        "file_name" : str
    }],
    "annotations": [{
        "id" : int,
        "image_id" : int,
        "category_id" : int,
        "bbox" : [x, y, w, h],  # all floats
        "area": float,          # w * h. Required for validation scores
        "iscrowd": 0            # Required for validation scores
    }],
    "categories": [{
        "id" : int
    }]
}
```

If using the `--rotated-bbox` flag for rotated detections, add an additional float `theta` to the annotations. To get validation scores you also need to fill the `segmentation` section.
```
        "bbox" : [x, y, w, h, theta]    # all floats, where theta is measured in radians anti-clockwise from the x-axis
        "segmentation" : [[x1, y1, x2, y2, x3, y3, x4, y4]]
                                        # Required for validation scores
```
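To make these entries concrete, here is a minimal sketch that writes a valid annotation file of this shape for a single image. The file name, IDs and box values are placeholders; for rotated datasets you would append `theta` to `bbox` and fill `segmentation`, for example with the corner points from the earlier rotated-box sketch.

```python
import json

# Placeholder values -- substitute your own image, box and category data.
x, y, w, h = 10.0, 20.0, 100.0, 50.0

dataset = {
    "images": [{"id": 1, "file_name": "example.jpg"}],
    "annotations": [{
        "id": 1,
        "image_id": 1,
        "category_id": 1,
        "bbox": [x, y, w, h],
        "area": w * h,   # required for validation scores
        "iscrowd": 0,    # required for validation scores
    }],
    "categories": [{"id": 1}],
}

with open("annotations.json", "w") as f:
    json.dump(dataset, f)
```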
## Disclaimer

This is a research project, not an official NVIDIA product.

## Jetpack compatibility

This branch uses TensorRT 7. If you are training and inferring models using PyTorch, or are creating TensorRT engines on Tesla GPUs (e.g. V100, T4), then you should use this branch.

If you wish to deploy your model to a Jetson device (e.g. Jetson AGX Xavier) running Jetpack version 4.3, then you should use the `19.10` branch of this repo.

## References

- [Focal Loss for Dense Object Detection](https://arxiv.org/abs/1708.02002).
  Tsung-Yi Lin, Priya Goyal, Ross Girshick, Kaiming He, Piotr Dollár.
  ICCV, 2017.
- [Accurate, Large Minibatch SGD: Training ImageNet in 1 Hour](https://arxiv.org/abs/1706.02677).
  Priya Goyal, Piotr Dollár, Ross Girshick, Pieter Noordhuis, Lukasz Wesolowski, Aapo Kyrola, Andrew Tulloch, Yangqing Jia, Kaiming He.
  June 2017.
- [Feature Pyramid Networks for Object Detection](https://arxiv.org/abs/1612.03144).
  Tsung-Yi Lin, Piotr Dollár, Ross Girshick, Kaiming He, Bharath Hariharan, Serge Belongie.
  CVPR, 2017.
- [Deep Residual Learning for Image Recognition](http://arxiv.org/abs/1512.03385).
  Kaiming He, Xiangyu Zhang, Shaoqing Ren, Jian Sun.
  CVPR, 2016.