{"id":13935862,"url":"https://github.com/feiyuhuahuo/Yolact_minimal","last_synced_at":"2025-07-19T21:30:50.839Z","repository":{"id":40618978,"uuid":"206063080","full_name":"feiyuhuahuo/Yolact_minimal","owner":"feiyuhuahuo","description":"Minimal PyTorch implementation of YOLACT.","archived":false,"fork":false,"pushed_at":"2021-11-17T01:07:00.000Z","size":2653,"stargazers_count":237,"open_issues_count":13,"forks_count":70,"subscribers_count":11,"default_branch":"master","last_synced_at":"2024-11-27T03:34:55.240Z","etag":null,"topics":["instance-segmentation","pytorch","real-time"],"latest_commit_sha":null,"homepage":"","language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":null,"status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/feiyuhuahuo.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":null,"code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null}},"created_at":"2019-09-03T11:39:11.000Z","updated_at":"2024-09-23T09:02:33.000Z","dependencies_parsed_at":"2022-07-31T23:48:09.028Z","dependency_job_id":null,"html_url":"https://github.com/feiyuhuahuo/Yolact_minimal","commit_stats":null,"previous_names":[],"tags_count":0,"template":false,"template_full_name":null,"purl":"pkg:github/feiyuhuahuo/Yolact_minimal","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/feiyuhuahuo%2FYolact_minimal","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/feiyuhuahuo%2FYolact_minimal/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/feiyuhuahuo%2FYolact_minimal/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/feiyuhuahuo%2FYolact_minimal/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/feiyuhuahuo","download_url":"https://codeload.
github.com/feiyuhuahuo/Yolact_minimal/tar.gz/refs/heads/master","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/feiyuhuahuo%2FYolact_minimal/sbom","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":266019657,"owners_count":23864916,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["instance-segmentation","pytorch","real-time"],"created_at":"2024-08-07T23:02:09.460Z","updated_at":"2025-07-19T21:30:50.826Z","avatar_url":"https://github.com/feiyuhuahuo.png","language":"Python","funding_links":[],"categories":["Python"],"sub_categories":[],"readme":"## Yolact_minimal\nMinimal PyTorch implementation of [Yolact:《YOLACT: Real-time Instance Segmentation》](https://arxiv.org/abs/1904.02689).  \nThe original project is [here](https://github.com/dbolya/yolact).  \n\nThis implementation simplified the original code, preserved the main function and made the network easy to understand.  \nThis implementation has not been updated to Yolact++.  \n\n\n### The network structure.  \n![Example 0](readme_imgs/network.png)\n\n## Environments  \nPyTorch \u003e= 1.1  \nPython \u003e= 3.6  \nonnxruntime-gpu == 1.6.0 for CUDA 10.2  \nTensorRT == 7.2.3.4  \ntensorboardX  \nOther common packages.  \n\n## Prepare\n```Shell\n# Build cython-nms \npython setup.py build_ext --inplace\n```\n- Download COCO 2017 datasets, modify `self.data_root` in 'res101_coco' in `config.py`. 
\n- Download weights.\n\nYolact trained weights.\n\n|Backbone   | box mAP  | mask mAP | number of parameters | Google Drive                                                                                                             |Baidu Cloud                                                       |\n|:---------:|:--------:|:--------:|:--------------------:|:------------------------------------------------------------------------------------------------------------------------:|:----------------------------------------------------------------:|\n|Resnet50   | 31.3     | 28.8     |       31.16 M        |[best_28.8_res50_coco_340000.pth](https://drive.google.com/file/d/1ujjwNLwPpWiPgguSwAhrpevIsmJsPADj/view?usp=sharing)     |[password: uu75](https://pan.baidu.com/s/1tZH4x7wsiirnqjsZCqxENg) |\n|Resnet101  | 33.4     | 30.4     |       50.15 M        |[best_30.4_res101_coco_340000.pth](https://drive.google.com/file/d/1-ytVx7JlKUvFQE2H3um6VYYHke9qpTy_/view?usp=sharing)    |[password: njsk](https://pan.baidu.com/s/1iXNS_ORwR0BzqDI7qX0JvA) |\n|swin_tiny  | 34.3     | 32.1     |       34.58 M        |[best_31.9_swin_tiny_coco_308000.pth](https://drive.google.com/file/d/12-RklMCIJ3nUsfP6veWa4s45_Q3s1tXD/view?usp=sharing) |[password: i8e9](https://pan.baidu.com/s/1laOjozNSwf2-mfFz87N_6A) |\n\nImageNet pre-trained weights.\n\n| Backbone  | Google Drive                                                                                              |Baidu Cloud                                                        |\n|:---------:|:---------------------------------------------------------------------------------------------------------:|:-----------------------------------------------------------------:|\n| Resnet50  | [backbone_res50.pth](https://drive.google.com/file/d/1Bx_DZbxVOgCgNVsEU3gH5p_rXKKxbVYo/view?usp=sharing)  | [password: juso](https://pan.baidu.com/s/12E5DEvb2zJYqa4T8fm08Og) |\n| Resnet101 | 
[backbone_res101.pth](https://drive.google.com/file/d/1Q7Cj7j70a3nT6AEmsWXueq4VaaUQ7hwA/view?usp=sharing) | [password: 5wsp](https://pan.baidu.com/s/1ute7NHb2n3iDIiHxDOGwEg) |\n| swin_tiny | [swin-tiny.pth](https://drive.google.com/file/d/1dvoPNGj2SHd5XhmSyE23GYzmimsTQbsU/view?usp=sharing)       | [password: g0o2](https://pan.baidu.com/s/1PTVtiryHXdDvuLcR4zJQFA) |\n\n## Improvement log\n2021.4.19. Use swin_tiny transformer as backbone, +1.0 box mAP, +1.4 mask mAP.  \n2021.1.7. Focal loss did not help, tried conf_alpha 4, 6, 7, 8.  \n2021.1.7. Fewer training iterations, 800k --\u003e 680k with batch size 8.  \n2020.11.2. Improved data augmentation, use rectangle anchors, training is stable, infinite loss no longer appears.  \n2020.11.2. DDP training, train batch size increased to 16, +0.4 box mAP, +0.7 mask mAP (resnet101).  \n\n## Train\n```Shell\n# Train with resnet101 backbone on one GPU with a batch size of 8 (default).\npython -m torch.distributed.launch --nproc_per_node=1 --master_port=$((RANDOM)) train.py --train_bs=8\n# Train on multiple GPUs (e.g. 
two GPUs, 8 images per GPU).\nexport CUDA_VISIBLE_DEVICES=0,1  # Select the GPUs to use.\npython -m torch.distributed.launch --nproc_per_node=2 --master_port=$((RANDOM)) train.py --train_bs=16\n# Train with other configurations (res101_coco, res50_coco, res50_pascal, res101_custom, res50_custom, in total).\npython -m torch.distributed.launch --nproc_per_node=1 --master_port=$((RANDOM)) train.py --cfg=res50_coco\n# Train with a different batch size (batch size should not be smaller than 4).\npython -m torch.distributed.launch --nproc_per_node=1 --master_port=$((RANDOM)) train.py --train_bs=4\n# Train with a different image size (anchor settings related to image size will be adjusted automatically).\npython -m torch.distributed.launch --nproc_per_node=1 --master_port=$((RANDOM)) train.py --img_size=400\n# Resume training with a specified model.\npython -m torch.distributed.launch --nproc_per_node=1 --master_port=$((RANDOM)) train.py --resume=weights/latest_res101_coco_35000.pth\n# Set the evaluation interval during training, set -1 to disable it.  \npython -m torch.distributed.launch --nproc_per_node=1 --master_port=$((RANDOM)) train.py --val_interval 8000\n# Train on CPU.\npython train.py --train_bs=4\n```\n## Use tensorboard\n```Shell\ntensorboard --logdir=tensorboard_log/res101_coco\n```\n\n## Evaluation\n```Shell\n# Select the GPU to use.\nexport CUDA_VISIBLE_DEVICES=0\n```\n\n```Shell\n# Evaluate on COCO val2017 (configuration will be parsed according to the model name).\n# The metric API in this project cannot get the exact COCO mAP, but the evaluation speed is fast. 
\npython eval.py --weight=weights/best_30.4_res101_coco_340000.pth\n# To get the exact COCO mAP:\npython eval.py --weight=weights/best_30.4_res101_coco_340000.pth --coco_api\n# Evaluate with a specified number of images.\npython eval.py --weight=weights/best_30.4_res101_coco_340000.pth --val_num=1000\n# Evaluate with traditional NMS.\npython eval.py --weight=weights/best_30.4_res101_coco_340000.pth --traditional_nms\n```\n## Detect\n- detect result  \n![Example 2](readme_imgs/result.jpg)  \n  \n```Shell\n# Select the GPU to use.\nexport CUDA_VISIBLE_DEVICES=0\n```\n\n```Shell\n# To detect images, pass the path of the image folder; detected images will be saved in `results/images`.\npython detect.py --weight=weights/best_30.4_res101_coco_340000.pth --image=images\n```\n- cutout object  \n![Example 3](readme_imgs/cutout.jpg)\n```Shell\n# Use --cutout to cut out detected objects.\npython detect.py --weight=weights/best_30.4_res101_coco_340000.pth --image=images --cutout\n```\n```Shell\n# To detect videos, pass the path of the video; the detected video will be saved in `results/videos`:\npython detect.py --weight=weights/best_30.4_res101_coco_340000.pth --video=videos/1.mp4\n# Use --real_time to run detection in real time.\npython detect.py --weight=weights/best_30.4_res101_coco_340000.pth --video=videos/1.mp4 --real_time\n```\n- linear combination result  \n![Example 4](readme_imgs/lincomb.jpg)\n\n```Shell\n# Use --hide_mask, --hide_score, --save_lincomb, --no_crop and so on to get different results.\npython detect.py --weight=weights/best_30.4_res101_coco_340000.pth --image=images --save_lincomb\n```\n\n## Export to ONNX    \n```Shell\npython export2onnx.py --weight='weights/best_30.4_res101_coco_340000.pth' --opset=12\n# Detect with the ONNX file; all the options are the same as those in `detect.py`.\npython detect_with_onnx.py --weight='onnx_files/res101_coco.onnx' --image=images\n```\n\n## Accelerate with TensorRT   \n```Shell\npython export2trt.py 
--weight='onnx_files/res101_coco.onnx'\n# Detect with TensorRT; all the options are the same as those in `detect.py`.\npython detect_with_trt.py --weight='trt_files/res101_coco.trt' --image=images\n```\n\n## Train on PASCAL_SBD datasets\n- Download PASCAL_SBD datasets from [here](http://home.bharathh.info/pubs/codes/SBD/download.html), modify the path of the `img` folder in `data/config.py`.  \n```Shell\n# Generate a coco-style json.\npython utils/pascal2coco.py --folder_path=/home/feiyu/Data/pascal_sbd\n# Training.\npython -m torch.distributed.launch --nproc_per_node=1 --master_port=$((RANDOM)) train.py --cfg=res50_pascal\n```\n\n## Train custom datasets\n- Install labelme  \n```Shell\npip install labelme\n```\n- Use labelme to label your images; only polygons are needed. The created json files are in the same folder as the images.  \n![Example 5](readme_imgs/labelme.png)\n- Prepare a 'labels.txt' like this; the first line, 'background', is always needed.  \n![Example 6](readme_imgs/labels.png)\n- Prepare coco-style json, pass the paths of your image folder and the labels.txt. The image type is also needed. The 'custom_dataset' folder is a prepared example.  \n```Shell\npython utils/labelme2coco.py --img_dir=custom_dataset --label_name=custom_dataset/labels.txt --img_type=jpg\n```\n- Edit `CUSTOM_CLASSES` in `config.py`.  \n![Example 7](readme_imgs/label_name.png)  \nNote that if there's only one class, `CUSTOM_CLASSES` should look like `('dog', )`. The final comma is necessary to make it a tuple; otherwise the number of classes would be `len('dog')`.  \n- Choose a configuration ('res101_custom' or 'res50_custom') in `config.py`, modify the corresponding `self.train_imgs` and `self.train_ann`. If you need to validate, prepare the validation dataset in the same way.  \n- Then train.  
\n```Shell\npython -m torch.distributed.launch --nproc_per_node=1 --master_port=$((RANDOM)) train.py --cfg=res101_custom\n```\n- Some parameters need to be tuned by yourself:\n1) Training batch size: try not to use a batch size smaller than 4.\n2) Anchor size: the anchor size should match the object scale of your dataset.\n3) Total training steps, learning rate decay steps and the warm-up step: these should be decided according to the dataset size; overwrite `self.lr_steps`, `self.warmup_until` in your configuration.\n\n## Citation\n```\n@inproceedings{yolact-iccv2019,\n  author    = {Daniel Bolya and Chong Zhou and Fanyi Xiao and Yong Jae Lee},\n  title     = {YOLACT: {Real-time} Instance Segmentation},\n  booktitle = {ICCV},\n  year      = {2019},\n}\n```\n```\n@article{liu2021Swin,\n  title={Swin Transformer: Hierarchical Vision Transformer using Shifted Windows},\n  author={Liu, Ze and Lin, Yutong and Cao, Yue and Hu, Han and Wei, Yixuan and Zhang, Zheng and Lin, Stephen and Guo, Baining},\n  journal={arXiv preprint arXiv:2103.14030},\n  year={2021}\n}\n```\n\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Ffeiyuhuahuo%2FYolact_minimal","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Ffeiyuhuahuo%2FYolact_minimal","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Ffeiyuhuahuo%2FYolact_minimal/lists"}