{"id":13445553,"url":"https://github.com/facebookresearch/maskrcnn-benchmark","last_synced_at":"2025-09-27T06:32:38.738Z","repository":{"id":39351652,"uuid":"154542095","full_name":"facebookresearch/maskrcnn-benchmark","owner":"facebookresearch","description":"Fast, modular reference implementation of Instance Segmentation and Object Detection algorithms in PyTorch.","archived":true,"fork":false,"pushed_at":"2023-02-16T04:01:32.000Z","size":6842,"stargazers_count":9316,"open_issues_count":530,"forks_count":2495,"subscribers_count":177,"default_branch":"main","last_synced_at":"2024-12-17T01:37:41.030Z","etag":null,"topics":[],"latest_commit_sha":null,"homepage":"","language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/facebookresearch.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":"CONTRIBUTING.md","funding":null,"license":"LICENSE","code_of_conduct":"CODE_OF_CONDUCT.md","threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null}},"created_at":"2018-10-24T17:34:50.000Z","updated_at":"2024-12-15T07:19:54.000Z","dependencies_parsed_at":"2022-07-04T07:17:51.775Z","dependency_job_id":"0a46b85b-4312-4127-9931-c29cfcb4c932","html_url":"https://github.com/facebookresearch/maskrcnn-benchmark","commit_stats":null,"previous_names":[],"tags_count":1,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/facebookresearch%2Fmaskrcnn-benchmark","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/facebookresearch%2Fmaskrcnn-benchmark/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/facebookresearch%2Fmaskrcnn-benchmark/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/facebookresearch%2Fmaskrc
nn-benchmark/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/facebookresearch","download_url":"https://codeload.github.com/facebookresearch/maskrcnn-benchmark/tar.gz/refs/heads/main","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":234402284,"owners_count":18826726,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":[],"created_at":"2024-07-31T05:00:35.895Z","updated_at":"2025-09-27T06:32:33.039Z","avatar_url":"https://github.com/facebookresearch.png","language":"Python","funding_links":[],"categories":["Other","Introduction","Opensource Projects","Pytorch \u0026 related libraries｜Pytorch \u0026 相关库","Pytorch \u0026 related libraries","Python","References","Model Deployment library","Deep Learning Projects"],"sub_categories":["CV｜计算机视觉:","CV:","Demo","PyTorch \u003ca name=\"pytorch\"/\u003e"],"readme":"# Faster R-CNN and Mask R-CNN in PyTorch 1.0\n\n**maskrcnn-benchmark has been deprecated. 
Please see [detectron2](https://github.com/facebookresearch/detectron2), which includes implementations for all models in maskrcnn-benchmark**\n\nThis project aims at providing the necessary building blocks for easily\ncreating detection and segmentation models using PyTorch 1.0.\n\n![alt text](demo/demo_e2e_mask_rcnn_X_101_32x8d_FPN_1x.png \"from http://cocodataset.org/#explore?id=345434\")\n\n## Highlights\n- **PyTorch 1.0:** RPN, Faster R-CNN and Mask R-CNN implementations that match or exceed Detectron accuracies\n- **Very fast**: up to **2x** faster than [Detectron](https://github.com/facebookresearch/Detectron) and **30%** faster than [mmdetection](https://github.com/open-mmlab/mmdetection) during training. See [MODEL_ZOO.md](MODEL_ZOO.md) for more details.\n- **Memory efficient:** uses roughly 500MB less GPU memory than mmdetection during training\n- **Multi-GPU training and inference**\n- **Mixed precision training:** trains faster with less GPU memory on [NVIDIA tensor cores](https://developer.nvidia.com/tensor-cores).\n- **Batched inference:** can perform inference using multiple images per batch per GPU\n- **CPU support for inference:** runs on CPU at inference time. 
See our [webcam demo](demo) for an example\n- Provides pre-trained models for almost all reference Mask R-CNN and Faster R-CNN configurations with 1x schedule.\n\n## Webcam and Jupyter notebook demo\n\nWe provide a simple webcam demo that illustrates how you can use `maskrcnn_benchmark` for inference:\n```bash\ncd demo\n# by default, it runs on the GPU\n# for best results, use min-image-size 800\npython webcam.py --min-image-size 800\n# can also run it on the CPU\npython webcam.py --min-image-size 300 MODEL.DEVICE cpu\n# or change the model that you want to use\npython webcam.py --config-file ../configs/caffe2/e2e_mask_rcnn_R_101_FPN_1x_caffe2.yaml --min-image-size 300 MODEL.DEVICE cpu\n# in order to see the probability heatmaps, pass --show-mask-heatmaps\npython webcam.py --min-image-size 300 --show-mask-heatmaps MODEL.DEVICE cpu\n# for the keypoint demo\npython webcam.py --config-file ../configs/caffe2/e2e_keypoint_rcnn_R_50_FPN_1x_caffe2.yaml --min-image-size 300 MODEL.DEVICE cpu\n```\n\nA notebook with the demo can be found in [demo/Mask_R-CNN_demo.ipynb](demo/Mask_R-CNN_demo.ipynb).\n\n## Installation\n\nCheck [INSTALL.md](INSTALL.md) for installation instructions.\n\n\n## Model Zoo and Baselines\n\nPre-trained models, baselines and comparison with Detectron and mmdetection\ncan be found in [MODEL_ZOO.md](MODEL_ZOO.md)\n\n## Inference in a few lines\nWe provide a helper class to simplify writing inference pipelines using pre-trained models.\nHere is how we would do it. 
Run this from the `demo` folder:\n```python\nfrom maskrcnn_benchmark.config import cfg\nfrom predictor import COCODemo\n\nconfig_file = \"../configs/caffe2/e2e_mask_rcnn_R_50_FPN_1x_caffe2.yaml\"\n\n# update the config options with the config file\ncfg.merge_from_file(config_file)\n# manually override some options\ncfg.merge_from_list([\"MODEL.DEVICE\", \"cpu\"])\n\ncoco_demo = COCODemo(\n    cfg,\n    min_image_size=800,\n    confidence_threshold=0.7,\n)\n# load image and then run prediction\nimage = ...\npredictions = coco_demo.run_on_opencv_image(image)\n```\n\n## Perform training on COCO dataset\n\nFor the following examples to work, you need to first install `maskrcnn_benchmark`.\n\nYou will also need to download the COCO dataset.\nWe use the `minival` and `valminusminival` sets from [Detectron](https://github.com/facebookresearch/Detectron/blob/master/detectron/datasets/data/README.md#coco-minival-annotations).\n\nWe recommend symlinking the path to the COCO dataset to `datasets/` as follows:\n\n```bash\n# symlink the coco dataset\ncd ~/github/maskrcnn-benchmark\nmkdir -p datasets/coco\nln -s /path_to_coco_dataset/annotations datasets/coco/annotations\nln -s /path_to_coco_dataset/train2014 datasets/coco/train2014\nln -s /path_to_coco_dataset/test2014 datasets/coco/test2014\nln -s /path_to_coco_dataset/val2014 datasets/coco/val2014\n# or use COCO 2017 version\nln -s /path_to_coco_dataset/annotations datasets/coco/annotations\nln -s /path_to_coco_dataset/train2017 datasets/coco/train2017\nln -s /path_to_coco_dataset/test2017 datasets/coco/test2017\nln -s /path_to_coco_dataset/val2017 datasets/coco/val2017\n\n# for pascal voc dataset:\nln -s /path_to_VOCdevkit_dir datasets/voc\n```\n\nP.S. 
`COCO_2017_train` = `COCO_2014_train` + `valminusminival` , `COCO_2017_val` = `minival`\n      \n\nYou can also configure your own paths to the datasets.\nFor that, all you need to do is to modify `maskrcnn_benchmark/config/paths_catalog.py` to\npoint to the location where your dataset is stored.\nYou can also create a new `paths_catalog.py` file which implements the same two classes,\nand pass it as a config argument `PATHS_CATALOG` during training.\n\n### Single GPU training\n\nMost of the configuration files that we provide assume that we are running on 8 GPUs.\nIn order to be able to run it on fewer GPUs, there are a few possibilities:\n\n**1. Run the following without modifications**\n\n```bash\npython /path_to_maskrcnn_benchmark/tools/train_net.py --config-file \"/path/to/config/file.yaml\"\n```\nThis should work out of the box and is very similar to what we should do for multi-GPU training.\nBut the drawback is that it will use much more GPU memory. The reason is that we set in the\nconfiguration files a global batch size that is divided over the number of GPUs. So if we only\nhave a single GPU, this means that the batch size for that GPU will be 8x larger, which might lead\nto out-of-memory errors.\n\nIf you have a lot of memory available, this is the easiest solution.\n\n**2. Modify the cfg parameters**\n\nIf you experience out-of-memory errors, you can reduce the global batch size. 
But this means that\nyou'll also need to change the learning rate, the number of iterations and the learning rate schedule.\n\nHere is an example for Mask R-CNN R-50 FPN with the 1x schedule:\n```bash\npython tools/train_net.py --config-file \"configs/e2e_mask_rcnn_R_50_FPN_1x.yaml\" SOLVER.IMS_PER_BATCH 2 SOLVER.BASE_LR 0.0025 SOLVER.MAX_ITER 720000 SOLVER.STEPS \"(480000, 640000)\" TEST.IMS_PER_BATCH 1 MODEL.RPN.FPN_POST_NMS_TOP_N_TRAIN 2000\n```\nThis follows the [scheduling rules from Detectron.](https://github.com/facebookresearch/Detectron/blob/master/configs/getting_started/tutorial_1gpu_e2e_faster_rcnn_R-50-FPN.yaml#L14-L30)\nNote that we have multiplied the number of iterations (as well as the learning rate schedule steps) by 8,\nand we have divided the learning rate by 8.\n\nWe also changed the batch size during testing, but that is generally not necessary because testing\nrequires much less memory than training.\n\nFurthermore, we set `MODEL.RPN.FPN_POST_NMS_TOP_N_TRAIN 2000` because, by default, proposals are selected per batch rather than per image during training. The value is calculated as **1000 x images-per-gpu**. Here we have 2 images per GPU, so we set the value to 1000 x 2 = 2000. If we have 8 images per GPU, the value should be set to 8000. Note that this does not apply if `MODEL.RPN.FPN_POST_NMS_PER_BATCH` is set to `False` during training. See [#672](https://github.com/facebookresearch/maskrcnn-benchmark/issues/672) for more details.\n\n### Multi-GPU training\nInternally, we use `torch.distributed.launch` to launch\nmulti-GPU training. 
This utility function from PyTorch spawns as many\nPython processes as the number of GPUs we want to use, and each Python\nprocess will only use a single GPU.\n\n```bash\nexport NGPUS=8\npython -m torch.distributed.launch --nproc_per_node=$NGPUS /path_to_maskrcnn_benchmark/tools/train_net.py --config-file \"path/to/config/file.yaml\" MODEL.RPN.FPN_POST_NMS_TOP_N_TRAIN images_per_gpu x 1000\n```\nNote that `MODEL.RPN.FPN_POST_NMS_TOP_N_TRAIN` should be set following the rule described under Single GPU training.\n\n### Mixed precision training\nWe currently use [APEX](https://github.com/NVIDIA/apex) to add [Automatic Mixed Precision](https://developer.nvidia.com/automatic-mixed-precision) support. To enable it, run Single-GPU or Multi-GPU training and set `DTYPE \"float16\"`.\n\n```bash\nexport NGPUS=8\npython -m torch.distributed.launch --nproc_per_node=$NGPUS /path_to_maskrcnn_benchmark/tools/train_net.py --config-file \"path/to/config/file.yaml\" MODEL.RPN.FPN_POST_NMS_TOP_N_TRAIN images_per_gpu x 1000 DTYPE \"float16\"\n```\nIf you want more verbose logging, set `AMP_VERBOSE True`. See [Mixed Precision Training guide](https://docs.nvidia.com/deeplearning/sdk/mixed-precision-training/index.html) for more details.\n\n## Evaluation\nYou can test your model directly on a single GPU or on multiple GPUs. Here is an example for Mask R-CNN R-50 FPN with the 1x schedule on 8 GPUs:\n```bash\nexport NGPUS=8\npython -m torch.distributed.launch --nproc_per_node=$NGPUS /path_to_maskrcnn_benchmark/tools/test_net.py --config-file \"configs/e2e_mask_rcnn_R_50_FPN_1x.yaml\" TEST.IMS_PER_BATCH 16\n```\nTo calculate mAP for each class, you can simply modify a few lines in [coco_eval.py](https://github.com/facebookresearch/maskrcnn-benchmark/blob/master/maskrcnn_benchmark/data/datasets/evaluation/coco/coco_eval.py). 
See [#524](https://github.com/facebookresearch/maskrcnn-benchmark/issues/524#issuecomment-475118810) for more details.\n\n## Abstractions\nFor more information on some of the main abstractions in our implementation, see [ABSTRACTIONS.md](ABSTRACTIONS.md).\n\n## Adding your own dataset\n\nThis implementation adds support for COCO-style datasets.\nBut adding support for training on a new dataset can be done as follows:\n```python\nimport torch\n\nfrom maskrcnn_benchmark.structures.bounding_box import BoxList\n\nclass MyDataset(object):\n    def __init__(self, ...):\n        # as you would do normally\n\n    def __getitem__(self, idx):\n        # load the image as a PIL Image\n        image = ...\n\n        # load the bounding boxes as a list of lists of coordinates\n        # in this case, for illustrative purposes, we use\n        # x1, y1, x2, y2 order.\n        boxes = [[0, 0, 10, 10], [10, 20, 50, 50]]\n        # and labels\n        labels = torch.tensor([10, 20])\n\n        # create a BoxList from the boxes\n        boxlist = BoxList(boxes, image.size, mode=\"xyxy\")\n        # add the labels to the boxlist\n        boxlist.add_field(\"labels\", labels)\n\n        if self.transforms:\n            image, boxlist = self.transforms(image, boxlist)\n\n        # return the image, the boxlist and the idx in your dataset\n        return image, boxlist, idx\n\n    def get_img_info(self, idx):\n        # get img_height and img_width. This is used if\n        # we want to split the batches according to the aspect ratio\n        # of the image, as it can be more efficient than loading the\n        # image from disk\n        return {\"height\": img_height, \"width\": img_width}\n```\nThat's it. 
You can also add extra fields to the boxlist, such as segmentation masks\n(using `structures.segmentation_mask.SegmentationMask`), or even your own instance type.\n\nFor a full example of how the `COCODataset` is implemented, check [`maskrcnn_benchmark/data/datasets/coco.py`](maskrcnn_benchmark/data/datasets/coco.py).\n\nOnce you have created your dataset, it needs to be added in a couple of places:\n- [`maskrcnn_benchmark/data/datasets/__init__.py`](maskrcnn_benchmark/data/datasets/__init__.py): add it to `__all__`\n- [`maskrcnn_benchmark/config/paths_catalog.py`](maskrcnn_benchmark/config/paths_catalog.py): `DatasetCatalog.DATASETS` and corresponding `if` clause in `DatasetCatalog.get()`\n\n### Testing\nWhile the aforementioned example should work for training, we leverage the\ncocoApi for computing the accuracies during testing. Thus, test datasets\nshould currently follow the cocoApi.\n\nTo enable your dataset for testing, add a corresponding if statement in [`maskrcnn_benchmark/data/datasets/evaluation/__init__.py`](maskrcnn_benchmark/data/datasets/evaluation/__init__.py):\n```python\nif isinstance(dataset, datasets.MyDataset):\n        return coco_evaluation(**args)\n```\n\n## Finetuning from Detectron weights on custom datasets\nCreate a script `tools/trim_detectron_model.py` like [here](https://gist.github.com/wangg12/aea194aa6ab6a4de088f14ee193fd968).\nYou can decide which keys to remove and which to keep by modifying the script.\n\nThen you can simply point to the converted model in the config file by changing `MODEL.WEIGHT`.\n\nFor further information, please refer to [#15](https://github.com/facebookresearch/maskrcnn-benchmark/issues/15).\n\n## Troubleshooting\nIf you have issues running or compiling this code, we have compiled a list of common issues in\n[TROUBLESHOOTING.md](TROUBLESHOOTING.md). 
If your issue is not present there, please feel\nfree to open a new issue.\n\n## Citations\nPlease consider citing this project in your publications if it helps your research. The following is a BibTeX reference. The BibTeX entry requires the `url` LaTeX package.\n```\n@misc{massa2018mrcnn,\nauthor = {Massa, Francisco and Girshick, Ross},\ntitle = {{maskrcnn-benchmark: Fast, modular reference implementation of Instance Segmentation and Object Detection algorithms in PyTorch}},\nyear = {2018},\nhowpublished = {\\url{https://github.com/facebookresearch/maskrcnn-benchmark}},\nnote = {Accessed: [Insert date here]}\n}\n```\n\n## Projects using maskrcnn-benchmark\n\n- [RetinaMask: Learning to predict masks improves state-of-the-art single-shot detection for free](https://arxiv.org/abs/1901.03353). \n  Cheng-Yang Fu, Mykhailo Shvets, and Alexander C. Berg.\n  Tech report, arXiv,1901.03353.\n- [FCOS: Fully Convolutional One-Stage Object Detection](https://arxiv.org/abs/1904.01355).\n  Zhi Tian, Chunhua Shen, Hao Chen and Tong He.\n  Tech report, arXiv,1904.01355. [[code](https://github.com/tianzhi0549/FCOS)]\n- [MULAN: Multitask Universal Lesion Analysis Network for Joint Lesion Detection, Tagging, and Segmentation](https://arxiv.org/abs/1908.04373).\n  Ke Yan, Youbao Tang, Yifan Peng, Veit Sandfort, Mohammadhadi Bagheri, Zhiyong Lu, and Ronald M. Summers.\n  MICCAI 2019. [[code](https://github.com/rsummers11/CADLab/tree/master/MULAN_universal_lesion_analysis)]\n- [Is Sampling Heuristics Necessary in Training Deep Object Detectors?](https://arxiv.org/abs/1909.04868)\n  Joya Chen, Dong Liu, Tong Xu, Shilong Zhang, Shiwei Wu, Bin Luo, Xuezheng Peng, Enhong Chen.\n  Tech report, arXiv,1909.04868. [[code](https://github.com/ChenJoya/sampling-free)]\n  \n## License\n\nmaskrcnn-benchmark is released under the MIT license. 
See [LICENSE](LICENSE) for additional details.\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Ffacebookresearch%2Fmaskrcnn-benchmark","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Ffacebookresearch%2Fmaskrcnn-benchmark","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Ffacebookresearch%2Fmaskrcnn-benchmark/lists"}