{"id":19372566,"url":"https://github.com/wisconsinaivision/yolact_edge","last_synced_at":"2025-05-16T02:08:37.863Z","repository":{"id":41481602,"uuid":"323777821","full_name":"WisconsinAIVision/yolact_edge","owner":"WisconsinAIVision","description":"The first competitive instance segmentation approach that runs on small edge devices at real-time speeds.","archived":false,"fork":false,"pushed_at":"2022-12-07T02:53:17.000Z","size":24612,"stargazers_count":1294,"open_issues_count":74,"forks_count":275,"subscribers_count":28,"default_branch":"master","last_synced_at":"2025-04-01T11:04:34.373Z","etag":null,"topics":["edge-devices","instance-segmentation","pytorch","real-time","realtime","yolactedge"],"latest_commit_sha":null,"homepage":"","language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/WisconsinAIVision.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null}},"created_at":"2020-12-23T02:02:43.000Z","updated_at":"2025-03-26T12:26:33.000Z","dependencies_parsed_at":"2022-07-13T08:21:25.084Z","dependency_job_id":null,"html_url":"https://github.com/WisconsinAIVision/yolact_edge","commit_stats":null,"previous_names":["wisconsinaivision/yolact_edge","haotian-liu/yolact_edge"],"tags_count":0,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/WisconsinAIVision%2Fyolact_edge","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/WisconsinAIVision%2Fyolact_edge/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/WisconsinAIVision%2Fyolact_edge/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/
WisconsinAIVision%2Fyolact_edge/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/WisconsinAIVision","download_url":"https://codeload.github.com/WisconsinAIVision/yolact_edge/tar.gz/refs/heads/master","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":247838446,"owners_count":21004580,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["edge-devices","instance-segmentation","pytorch","real-time","realtime","yolactedge"],"created_at":"2024-11-10T08:24:03.420Z","updated_at":"2025-04-08T12:12:18.582Z","avatar_url":"https://github.com/WisconsinAIVision.png","language":"Python","funding_links":[],"categories":[],"sub_categories":[],"readme":"# YolactEdge: Real-time Instance Segmentation on the Edge\n```\n██╗   ██╗ ██████╗ ██╗      █████╗  ██████╗████████╗    ███████╗██████╗  ██████╗ ███████╗\n╚██╗ ██╔╝██╔═══██╗██║     ██╔══██╗██╔════╝╚══██╔══╝    ██╔════╝██╔══██╗██╔════╝ ██╔════╝\n ╚████╔╝ ██║   ██║██║     ███████║██║        ██║       █████╗  ██║  ██║██║  ███╗█████╗  \n  ╚██╔╝  ██║   ██║██║     ██╔══██║██║        ██║       ██╔══╝  ██║  ██║██║   ██║██╔══╝  \n   ██║   ╚██████╔╝███████╗██║  ██║╚██████╗   ██║       ███████╗██████╔╝╚██████╔╝███████╗\n   ╚═╝    ╚═════╝ ╚══════╝╚═╝  ╚═╝ ╚═════╝   ╚═╝       ╚══════╝╚═════╝  ╚═════╝ ╚══════╝\n```\n\n**YolactEdge**, the first competitive instance segmentation approach that runs on small edge devices at real-time speeds. Specifically, YolactEdge runs at up to 30.8 FPS on a Jetson AGX Xavier (and 172.7 FPS on an RTX 2080 Ti) with a ResNet-101 backbone on 550x550 resolution images. 
This is the code for [our paper](https://arxiv.org/abs/2012.12259).\n\n**For a real-time demo and more samples, check out our [demo video](https://www.youtube.com/watch?v=GBCK9SrcCLM).**\n\n[![example-gif-1](data/yolact_edge_example_1.gif)](https://www.youtube.com/watch?v=GBCK9SrcCLM)\n\n[![example-gif-2](data/yolact_edge_example_2.gif)](https://www.youtube.com/watch?v=GBCK9SrcCLM)\n\n[![example-gif-3](data/yolact_edge_example_3.gif)](https://www.youtube.com/watch?v=GBCK9SrcCLM)\n\n## Model Zoo\n\nWe provide baseline YOLACT and YolactEdge models trained on COCO and YouTube VIS (our sub-training split, with COCO joint training).\n\nTo evaluate a model, put the corresponding weights file in the `./weights` directory and run one of the commands in the Evaluation section.\n\nYouTube VIS models:\n\n| Method | Backbone\u0026nbsp; | mAP | AGX-Xavier FPS | RTX 2080 Ti FPS | weights |\n|:-------------:|:-------------:|:----:|:----:|:----:|----------------------------------------------------------------------------------------------------------------------|\n| YOLACT | R-50-FPN | 44.7 | 8.5 | 59.8 | [download](https://drive.google.com/file/d/1EfoQ0OteuQdY2yU9Od8XHTHrizQVFR2w/view?usp=sharing) \\| [mirror](https://1drv.ms/u/s!AkSxI62eEcpbiHLweuem6riY6lVK?e=cUGBRf) |\n| YolactEdge \u003cbr\u003e(w/o TRT) | R-50-FPN | 44.2| 10.5 | 67.0 | [download](https://drive.google.com/file/d/1qvd4W28yzzXFb2wwGfYySv5HHzGU26XP/view?usp=sharing) \\| [mirror](https://1drv.ms/u/s!AkSxI62eEcpbiHGgB-KrQLubo7eZ?e=h26XJM) |\n| YolactEdge | R-50-FPN | 44.0| 32.4 | 177.6 | [download](https://drive.google.com/file/d/1qvd4W28yzzXFb2wwGfYySv5HHzGU26XP/view?usp=sharing) \\| [mirror](https://1drv.ms/u/s!AkSxI62eEcpbiHGgB-KrQLubo7eZ?e=h26XJM) |\n| YOLACT | R-101-FPN | 47.3 | 5.9 | 42.6 | [download](https://drive.google.com/file/d/1doS5MRhpSs4puVCuzR5i3GrDMSxcw7Lx/view?usp=sharing) \\| [mirror](https://1drv.ms/u/s!AkSxI62eEcpbiHOei4kogT1JCfO7?e=dLcrVg) |\n| YolactEdge \u003cbr\u003e(w/o TRT) | R-101-FPN | 46.9| 9.5 | 61.2 
| [download](https://drive.google.com/file/d/1mSxesVaMmYc13cPHiEnRvubPxy8WBjJW/view?usp=sharing) \\| [mirror](https://1drv.ms/u/s!AkSxI62eEcpbiHAqrmvsL1RMH9WK?e=Tnlu7p) |\n| YolactEdge | R-101-FPN | 46.2 | 30.8 | 172.7 | [download](https://drive.google.com/file/d/1mSxesVaMmYc13cPHiEnRvubPxy8WBjJW/view?usp=sharing) \\| [mirror](https://1drv.ms/u/s!AkSxI62eEcpbiHAqrmvsL1RMH9WK?e=Tnlu7p) |\n\nCOCO models:\n\n| Method | \u0026nbsp;\u0026nbsp;\u0026nbsp;Backbone\u0026nbsp;\u0026nbsp;\u0026nbsp;\u0026nbsp; | mAP | Titan Xp FPS | AGX-Xavier FPS | RTX 2080 Ti FPS | weights |\n|:-------------:|:-------------:|:----:|:----:|:----:|:----:|----------------------------------------------------------------------------------------------------------------------|\n| YOLACT | MobileNet-V2 | 22.1 | - | 15.0 | 35.7 | [download](https://drive.google.com/file/d/1L4N4VcykqE-D5JUgWW9zBd6WKmZPBAZQ/view?usp=sharing) \\| [mirror](https://1drv.ms/u/s!AkSxI62eEcpbiG8nFXtvgAkI-c1H?e=RraXLv) |\n| YolactEdge | MobileNet-V2 | 20.8 | - | 35.7 | 161.4 | [download](https://drive.google.com/file/d/1L4N4VcykqE-D5JUgWW9zBd6WKmZPBAZQ/view?usp=sharing) \\| [mirror](https://1drv.ms/u/s!AkSxI62eEcpbiG8nFXtvgAkI-c1H?e=RraXLv) |\n| YOLACT | R-50-FPN | 28.2 | 42.5 | 9.1 | 45.0 | [download](https://drive.google.com/file/d/15TRS8MNNe3pmjilonRy9OSdJdCPl5DhN/view?usp=sharing) \\| [mirror](https://1drv.ms/u/s!AkSxI62eEcpbiG5ZnhPTSkqBCURo?e=lNOaXr) |\n| YolactEdge | R-50-FPN | 27.0| - | 30.7 | 140.3 | [download](https://drive.google.com/file/d/15TRS8MNNe3pmjilonRy9OSdJdCPl5DhN/view?usp=sharing) \\| [mirror](https://1drv.ms/u/s!AkSxI62eEcpbiG5ZnhPTSkqBCURo?e=lNOaXr) |\n| YOLACT | R-101-FPN | 29.8 | 33.5 | 6.6 | 36.5 | [download](https://drive.google.com/file/d/1EAzO-vRDZ2hupUJ4JFSUi40lAZ5Jo-Bp/view?usp=sharing) \\| [mirror](https://1drv.ms/u/s!AkSxI62eEcpbiG8nFXtvgAkI-c1H?e=HyfH8Z) |\n| YolactEdge | R-101-FPN | 29.5 | - | 27.3 | 124.8 | 
[download](https://drive.google.com/file/d/1EAzO-vRDZ2hupUJ4JFSUi40lAZ5Jo-Bp/view?usp=sharing) \\| [mirror](https://1drv.ms/u/s!AkSxI62eEcpbiG8nFXtvgAkI-c1H?e=HyfH8Z) |\n\n## Installation\n\nSee [INSTALL.md](INSTALL.md).\n\nOptionally, you can use the official [Dockerfile](docker) to set up the full environment with one command.\n\n## Getting Started\n\nFollow the [installation instructions](INSTALL.md) to set up the required environment for running YolactEdge.\n\nSee instructions to [evaluate](https://github.com/haotian-liu/yolact_edge#evaluation) and [train](https://github.com/haotian-liu/yolact_edge#training) with YolactEdge.\n\n### Colab Notebook\n\nTry out our [Colab Notebook](https://colab.research.google.com/drive/1Mzst4q4Y-SQszIHhlEv1CkT4hwja4GNw?usp=sharing) with a live demo to learn about basic usage.\n\nIf you are interested in evaluating YolactEdge with TensorRT, we provide another [Colab Notebook](https://colab.research.google.com/drive/1nEZAYnGbF7VetqltAlUTyAGTI71MvPPF?usp=sharing) that configures a TensorRT environment on Colab.\n\n## Evaluation\n\n### Quantitative Results\n```Shell\n# Convert each component of the trained model to TensorRT using the optimal settings and evaluate on the YouTube VIS validation set (our split).\npython3 eval.py --trained_model=./weights/yolact_edge_vid_847_50000.pth\n\n# Evaluate on the entire COCO validation set.\npython3 eval.py --trained_model=./weights/yolact_edge_54_800000.pth\n\n# Output a COCO JSON file for COCO test-dev. The command will create './results/bbox_detections.json' and './results/mask_detections.json' for detection and instance segmentation, respectively. These files can then be submitted to the website for evaluation.\npython3 eval.py --trained_model=./weights/yolact_edge_54_800000.pth --dataset=coco2017_testdev_dataset --output_coco_json\n```\n\n### Qualitative Results\n```Shell\n# Display qualitative results on COCO. 
From here on I'll use a confidence threshold of 0.3.\npython eval.py --trained_model=weights/yolact_edge_54_800000.pth --score_threshold=0.3 --top_k=100 --display\n```\n\n### Benchmarking\n\n```Shell\n# Benchmark the trained model on the COCO validation set.\n# Run just the raw model on the first 1k images of the validation set\npython eval.py --trained_model=weights/yolact_edge_54_800000.pth --benchmark --max_images=1000\n```\n\n### Notes\n\n#### Handling inference error when using TensorRT\nIf you use the TensorRT conversion of YolactEdge and encounter issues in the post-processing or NMS stage, they may be caused by the TensorRT engine. We implemented an experimental safe mode that handles these cases more carefully. Enable it by adding the `--use_tensorrt_safe_mode` option to your command.\n\n\n#### Inference using models trained with YOLACT\nIf you have a model pre-trained with [YOLACT](https://github.com/dbolya/yolact) and want to take advantage of the TensorRT features of YolactEdge, simply specify `--config=yolact_edge_config` in the command-line options, and the code will automatically detect and convert the model weights to be compatible.\n\n```Shell\npython3 eval.py --config=yolact_edge_config --trained_model=./weights/yolact_base_54_800000.pth\n```\n\n\n#### Inference without Calibration\n\nIf you want to run an inference command without calibration, you can run either with FP16-only TensorRT optimization or without TensorRT optimization, using the corresponding configs. 
Refer to `data/config.py` for examples of such configs.\n\n```Shell\n# Evaluate YolactEdge with FP16-only TensorRT optimization with '--use_fp16_tensorrt' option (replace all INT8 optimization with FP16).\npython3 eval.py --use_fp16_tensorrt --trained_model=./weights/yolact_edge_54_800000.pth\n\n# Evaluate YolactEdge without TensorRT optimization with '--disable_tensorrt' option.\npython3 eval.py --disable_tensorrt --trained_model=./weights/yolact_edge_54_800000.pth\n```\n\n### Images\n```Shell\n# Display qualitative results on the specified image.\npython eval.py --trained_model=weights/yolact_edge_54_800000.pth --score_threshold=0.3 --top_k=100 --image=my_image.png\n\n# Process an image and save it to another file.\npython eval.py --trained_model=weights/yolact_edge_54_800000.pth --score_threshold=0.3 --top_k=100 --image=input_image.png:output_image.png\n\n# Process a whole folder of images.\npython eval.py --trained_model=weights/yolact_edge_54_800000.pth --score_threshold=0.3 --top_k=100 --images=path/to/input/folder:path/to/output/folder\n```\n### Video\n```Shell\n# Display a video in real-time. \"--video_multiframe\" will process that many frames at once for improved performance.\n# If video_multiframe \u003e 1, then the trt_batch_size should be increased to match it or surpass it. \npython eval.py --trained_model=weights/yolact_edge_54_800000.pth --score_threshold=0.3 --top_k=100 --video_multiframe=2 --trt_batch_size 2 --video=my_video.mp4\n\n# Display a webcam feed in real-time. If you have multiple webcams pass the index of the webcam you want instead of 0.\npython eval.py --trained_model=weights/yolact_edge_54_800000.pth --score_threshold=0.3 --top_k=100 --video_multiframe=2 --trt_batch_size 2 --video=0\n\n# Process a video and save it to another file. 
This is unoptimized.\npython eval.py --trained_model=weights/yolact_edge_54_800000.pth --score_threshold=0.3 --top_k=100 --video=input_video.mp4:output_video.mp4\n```\nUse the help option to see a description of all available command line arguments:\n```Shell\npython eval.py --help\n```\n### Programmatic inference\n\nYou can use yolact_edge as a package in your own code. There are two steps to make this work:\n 1) Install YolactEdge as a Python package: `pip install .`\n 2) Use it as in the example provided in `pkg_usage.py`\n\n## Training\nMake sure to download the entire dataset using the commands above.\n - To train, grab an ImageNet-pretrained model and put it in `./weights`.\n   - For ResNet-101, download `resnet101_reducedfc.pth` from [here](https://drive.google.com/file/d/1tvqFPd4bJtakOlmn-uIA492g2qurRChj/view?usp=sharing).\n   - For ResNet-50, download `resnet50-19c8e357.pth` from [here](https://drive.google.com/file/d/1Jy3yCdbatgXa5YYIdTCRrSV0S9V5g1rn/view?usp=sharing).\n   - For MobileNetV2, download `mobilenet_v2-b0353104.pth` from [here](https://drive.google.com/file/d/1F8YAAWITIkZ_w-fVeetmQKMkfGYfHvUM/view?usp=sharing).\n - Run one of the training commands below.\n   - Note that you can press Ctrl+C while training and it will save an `*_interrupt.pth` file at the current iteration.\n   - All weights are saved in the `./weights` directory by default with the file name `\u003cconfig\u003e_\u003cepoch\u003e_\u003citer\u003e.pth`.\n```Shell\n# Trains using the base edge config with a batch size of 8 (the default).\npython train.py --config=yolact_edge_config\n\n# Resume training yolact_edge with a specific weight file and start from the iteration specified in the weight file's name.\npython train.py --config=yolact_edge_config --resume=weights/yolact_edge_10_32100.pth --start_iter=-1\n\n# Use the help option to see a description of all available command line arguments\npython train.py --help\n```\n\n### Training on video dataset\n```Shell\n# Pre-train 
the image-based model\npython train.py --config=yolact_edge_youtubevis_config\n\n# Train the flow (warping) module\npython train.py --config=yolact_edge_vid_trainflow_config --resume=./weights/yolact_edge_youtubevis_847_50000.pth\n\n# Fine-tune the network jointly\npython train.py --config=yolact_edge_vid_config --resume=./weights/yolact_edge_vid_trainflow_144_100000.pth\n```\n\n\n### Custom Datasets\nYou can also train on your own dataset by following these steps:\n - Depending on the type of your dataset, create a COCO-style (image) or YTVIS-style (video) Object Detection JSON annotation file for your dataset. The specifications can be found here for [COCO](http://cocodataset.org/#format-data) and [YTVIS](https://github.com/youtubevos/cocoapi), respectively. Note that we don't use some fields, so the following may be omitted:\n   - `info`\n   - `licenses`\n   - Under `image`: `license, flickr_url, coco_url, date_captured`\n   - `categories` (we use our own format for categories, see below)\n - Create a definition for your dataset under `dataset_base` in `data/config.py` (see the comments in `dataset_base` for an explanation of each field):\n```Python\nmy_custom_dataset = dataset_base.copy({\n    'name': 'My Dataset',\n\n    'train_images': 'path_to_training_images',\n    'train_info':   'path_to_training_annotation',\n\n    'valid_images': 'path_to_validation_images',\n    'valid_info':   'path_to_validation_annotation',\n\n    'has_gt': True,\n    'class_names': ('my_class_id_1', 'my_class_id_2', 'my_class_id_3', ...),\n\n    # the fields below are only needed for a YTVIS-style video dataset.\n\n    # whether to sample all frames or key frames only.\n    'use_all_frames': False,\n\n    # the following four lines define the frame sampling strategy for the given dataset.\n    'frame_offset_lb': 1,\n    'frame_offset_ub': 4,\n    'frame_offset_multiplier': 1,\n    'all_frame_direction': 'allway',\n\n    # 1 of K frames is annotated\n    'images_per_video': 5,\n\n    # 
declares a video dataset\n    'is_video': True\n})\n```\n - Note that class IDs in the annotation file should start at 1 and increase sequentially in the order of `class_names`. If this isn't the case for your annotation file (like in COCO), see the field `label_map` in `dataset_base`.\n - Finally, in `yolact_edge_config` in the same file, change the value for `'dataset'` to `'my_custom_dataset'` (or whatever you named the config object above) and `'num_classes'` to the number of classes in your dataset plus one. Then you can use any of the training commands in the previous section.\n\n\n## Citation\n\nIf you use this code base in your work, please consider citing:\n\n```\n@inproceedings{yolactedge-icra2021,\n  author    = {Haotian Liu and Rafael A. Rivera Soto and Fanyi Xiao and Yong Jae Lee},\n  title     = {YolactEdge: Real-time Instance Segmentation on the Edge},\n  booktitle = {ICRA},\n  year      = {2021},\n}\n```\n```\n@inproceedings{yolact-iccv2019,\n  author    = {Daniel Bolya and Chong Zhou and Fanyi Xiao and Yong Jae Lee},\n  title     = {YOLACT: {Real-time} Instance Segmentation},\n  booktitle = {ICCV},\n  year      = {2019},\n}\n```\n\n## Contact\nFor questions about our paper or code, please contact [Haotian Liu](mailto:lhtliu@ucdavis.edu) or [Rafael A. Rivera-Soto](mailto:riverasoto@ucdavis.edu).\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fwisconsinaivision%2Fyolact_edge","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fwisconsinaivision%2Fyolact_edge","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fwisconsinaivision%2Fyolact_edge/lists"}