{"id":19932180,"url":"https://github.com/amazon-science/bigdetection","last_synced_at":"2025-05-16T04:06:09.231Z","repository":{"id":37795337,"uuid":"465210616","full_name":"amazon-science/bigdetection","owner":"amazon-science","description":"BigDetection: A Large-scale Benchmark for Improved Object Detector Pre-training","archived":false,"fork":false,"pushed_at":"2024-10-23T17:41:21.000Z","size":8284,"stargazers_count":395,"open_issues_count":5,"forks_count":25,"subscribers_count":8,"default_branch":"main","last_synced_at":"2025-04-08T14:11:19.676Z","etag":null,"topics":["computer-vision","few-shot","object-detection","pretraining"],"latest_commit_sha":null,"homepage":"","language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"apache-2.0","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/amazon-science.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":"CONTRIBUTING.md","funding":null,"license":"LICENSE","code_of_conduct":"CODE_OF_CONDUCT.md","threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2022-03-02T08:01:08.000Z","updated_at":"2025-04-05T08:03:38.000Z","dependencies_parsed_at":"2025-02-02T20:39:22.276Z","dependency_job_id":null,"html_url":"https://github.com/amazon-science/bigdetection","commit_stats":null,"previous_names":[],"tags_count":0,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/amazon-science%2Fbigdetection","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/amazon-science%2Fbigdetection/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/amazon-science%2Fbigdetection/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/am
azon-science%2Fbigdetection/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/amazon-science","download_url":"https://codeload.github.com/amazon-science/bigdetection/tar.gz/refs/heads/main","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":254464895,"owners_count":22075570,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["computer-vision","few-shot","object-detection","pretraining"],"created_at":"2024-11-12T23:09:18.727Z","updated_at":"2025-05-16T04:06:06.835Z","avatar_url":"https://github.com/amazon-science.png","language":"Python","funding_links":[],"categories":[],"sub_categories":[],"readme":"# BigDetection: A Large-scale Benchmark for Improved Object Detector Pre-training\n\nBy [Likun Cai](https://github.com/cailk), Zhi Zhang, Yi Zhu, Li Zhang, Mu Li, Xiangyang Xue.\n\n\u003c!-- \u003cdiv align=\"center\"\u003e\n    \u003cimg src=\"./resources/bigdetection.png\" height=\"250px\" /\u003e\n\u003c/div\u003e --\u003e\n![](./resources/bigdetection.png)\n\nThis repo is the official implementation of [BigDetection](https://arxiv.org/abs/2203.13249). It is based on [mmdetection](https://github.com/open-mmlab/mmdetection) and [CBNetV2](https://github.com/VDIGPKU/CBNetV2).\n\n## Introduction\nWe construct a new large-scale benchmark termed *BigDetection*. 
Our goal is to simply leverage the training data from existing datasets ([LVIS](https://www.lvisdataset.org/), [OpenImages](https://storage.googleapis.com/openimages/web/index.html) and [Objects365](https://www.objects365.org/overview.html)) with carefully designed principles, and curate a larger dataset for improved detector pre-training. The BigDetection dataset has 600 object categories and contains 3.4M training images with 36M object bounding boxes. We show some important statistics of BigDetection in the following figure.\n\n![](./resources/bigdet_statistics.png)\n*Left*: Number of images per category of BigDetection. *Right*: Number of instances for different object sizes. \n\n## Results and Models\n\n### BigDetection Validation\nWe show the evaluation results on BigDetection Validation. We hope BigDetection can serve as a new challenging benchmark for evaluating next-level object detection methods.\n\n| Method | mAP (bigdet val) | Links |\n| --- | :---: | :---: |\n| YOLOv3 | 9.7 | [model](https://big-detection.s3.us-west-2.amazonaws.com/bigdet_cpts/mmdetection_cpts/yolov3_d53_bigdet_8x.pth)/[config](configs/BigDetection/yolov3/yolov3_d53_mstrain-608_8x_bigdet.py) |\n| Deformable DETR | 13.1 | [model](https://big-detection.s3.us-west-2.amazonaws.com/bigdet_cpts/mmdetection_cpts/deformable_detr_bigdet_8x.pth)/[config](configs/BigDetection/deformable_detr/deformable_detr_r50_16x2_8x_bigdet.py) |\n| Faster R-CNN (C4)\\* | 18.9 | [model](https://big-detection.s3.us-west-2.amazonaws.com/bigdet_cpts/detectron2_cpts/faster_rcnn_r50_c4_bigdet_8x.pth) |\n| Faster R-CNN (FPN)\\* | 19.4 | [model](https://big-detection.s3.us-west-2.amazonaws.com/bigdet_cpts/detectron2_cpts/faster_rcnn_r50_fpn_bigdet_8x.pth) |\n| CenterNet2\\* | 23.1 | [model](https://big-detection.s3.us-west-2.amazonaws.com/bigdet_cpts/detectron2_cpts/centernet2_r50_bigdet_8x.pth) |\n| Cascade R-CNN\\* | 24.1 | 
[model](https://big-detection.s3.us-west-2.amazonaws.com/bigdet_cpts/detectron2_cpts/crcnn_r50_bigdet_8x.pth) |\n| CBNetV2-Swin-Base | 35.1 | [model](https://big-detection.s3.us-west-2.amazonaws.com/bigdet_cpts/mmdetection_cpts/htc_cbv2_swin_base_giou_4conv1f_bigdet.pth)/[config](configs/BigDetection/cbnetv2/htc_cbv2_swin_base_giou_4conv1f_adamw_bigdet.py) |\n\n### COCO Validation\nWe show the finetuning performance on COCO minival/test-dev. Results show that BigDetection pre-training provides significant benefits for different detector architectures. We achieve 59.8 mAP on COCO test-dev with a single model.\n\n| Method | mAP (coco minival/test-dev) | Links |\n| --- | :---: | :---: |\n| YOLOv3 | 30.5/- | [config](configs/BigDetection/yolov3/yolov3_d53_mstrain-608_8x_bigdet.py) |\n| Deformable DETR | 39.9/- | [model](https://big-detection.s3.us-west-2.amazonaws.com/bigdet_cpts/mmdetection_cpts/deformable_detr_bigdet_coco-ft_1x.pth)/[config](configs/BigDetection/deformable_detr/deformable_detr_r50_16x2_8x_bigdet.py) |\n| Faster R-CNN (C4)\\* | 38.8/- | [model](https://big-detection.s3.us-west-2.amazonaws.com/bigdet_cpts/detectron2_cpts/faster_rcnn_r50_c4_bigdet_coco-ft_1x.pth) |\n| Faster R-CNN (FPN)\\* | 40.5/- | [model](https://big-detection.s3.us-west-2.amazonaws.com/bigdet_cpts/detectron2_cpts/faster_rcnn_r50_fpn_bigdet_coco-ft_1x.pth) |\n| CenterNet2\\* | 45.3/- | [model](https://big-detection.s3.us-west-2.amazonaws.com/bigdet_cpts/detectron2_cpts/centernet2_r50_bigdet_coco-ft_1x.pth) |\n| Cascade R-CNN\\* | 45.1/- | [model](https://big-detection.s3.us-west-2.amazonaws.com/bigdet_cpts/detectron2_cpts/crcnn_r50_bigdet_coco-ft_1x.pth) |\n| CBNetV2-Swin-Base | **59.1**/**59.5** | [model](https://big-detection.s3.us-west-2.amazonaws.com/bigdet_cpts/mmdetection_cpts/htc_cbv2_swin_base_giou_4conv1f_bigdet_coco-ft_20e.pth)/[config](configs/BigDetection/cbnetv2/htc_cbv2_swin_base_giou_4conv1f_adamw_bigdet.py) |\n| CBNetV2-Swin-Base (TTA) | **59.5**/**59.8** | 
[config](configs/BigDetection/cbnetv2/htc_cbv2_swin_base_giou_4conv1f_adamw_bigdet.py) |\n\n### Data Efficiency\nWe followed [STAC](https://arxiv.org/abs/2005.04757) and [SoftTeacher](https://arxiv.org/abs/2106.09018) to evaluate on COCO for different partial annotation settings.\n\n| Method | mAP (1%) | mAP (2%) | mAP (5%) | mAP (10%) |\n| --- | :---: | :---: | :---: | :---: |\n| Baseline | 9.8 | 14.3 | 21.2 | 26.2 |\n| STAC     | 14.0 | 18.3 | 24.4 | 28.6 |\n| SoftTeacher (ICCV 21) | 20.5 | 26.5 | 30.7 | 34.0 |\n| Ours | **25.3** | **28.1** | **31.9** | **34.1** |\n|  | [model](https://big-detection.s3.us-west-2.amazonaws.com/bigdet_cpts/data_efficiency/faster_rcnn_r50_fpn_bigdet_coco-1.pth) | [model](https://big-detection.s3.us-west-2.amazonaws.com/bigdet_cpts/data_efficiency/faster_rcnn_r50_fpn_bigdet_coco-2.pth) | [model](https://big-detection.s3.us-west-2.amazonaws.com/bigdet_cpts/data_efficiency/faster_rcnn_r50_fpn_bigdet_coco-5.pth) | [model](https://big-detection.s3.us-west-2.amazonaws.com/bigdet_cpts/data_efficiency/faster_rcnn_r50_fpn_bigdet_coco-10.pth) |\n\n### Notes\n- The models following `*` are implemented on another detection codebase [Detectron2](https://github.com/facebookresearch/detectron2). Here we provide the pretrained checkpoints. 
The results can be reproduced by following the installation instructions of the [CenterNet2](https://github.com/xingyizhou/CenterNet2) codebase.\n- Most models are trained with the `8X` schedule on BigDetection.\n- Most pretrained models are fine-tuned with the `1X` schedule on COCO.\n- `TTA` denotes test-time augmentation.\n- Pre-trained models of Swin Transformer can be downloaded from [Swin Transformer for ImageNet Classification](https://github.com/microsoft/Swin-Transformer).\n\n## Getting Started\n\n### Requirements\n- `Ubuntu 16.04`\n- `CUDA 10.2`\n\n### Installation\n```\n# Create conda environment\nconda create -n bigdet python=3.7 -y\nconda activate bigdet\n\n# Install PyTorch\nconda install pytorch==1.8.0 torchvision==0.9.0 cudatoolkit=10.2 -c pytorch\n\n# Install mmcv\npip install mmcv-full==1.3.9 -f https://download.openmmlab.com/mmcv/dist/cu102/torch1.8.0/index.html\n\n# Clone and install\ngit clone https://github.com/amazon-science/bigdetection.git\ncd bigdetection\npip install -r requirements/build.txt\npip install -v -e .\n\n# Install Apex (optional)\ngit clone https://github.com/NVIDIA/apex\ncd apex\npip install -v --disable-pip-version-check --no-cache-dir --global-option=\"--cpp_ext\" --global-option=\"--cuda_ext\" ./\n```\n\n### Data Preparation\nBigDetection combines 3 datasets, whose train/val data can be downloaded from their official websites ([Objects365](https://www.objects365.org/download.html), [OpenImages v6](https://storage.googleapis.com/openimages/web/download.html), [LVIS v1.0](https://www.lvisdataset.org/dataset)). All datasets should be placed under $bigdetection/data/ as shown below. The synsets (600 class names in total) of the BigDetection dataset can be downloaded here: [bigdetection_synsets](https://drive.google.com/file/d/1XbzMia6NYmacIX60oU9h2xE99IkSI24F/view?usp=sharing). 
Contact us at [lkcai20@fudan.edu.cn](lkcai20@fudan.edu.cn) to get access to our pre-processed annotation files.\n```\nbigdetection/data\n└── BigDetection\n    ├── annotations\n    │   ├── bigdet_obj_train.json\n    │   ├── bigdet_oid_train.json\n    │   ├── bigdet_lvis_train.json\n    │   ├── bigdet_val.json\n    │   └── cas_weights.json\n    ├── train\n    │   ├── Objects365\n    │   ├── OpenImages\n    │   └── LVIS\n    └── val\n```\n\n## Training\n\nTo train a detector with pre-trained models, run:\n```\n# multi-gpu training\ntools/dist_train.sh \u003cCONFIG_FILE\u003e \u003cGPU_NUM\u003e --cfg-options load_from=\u003cPRETRAIN_MODEL\u003e\n```\n\n***Pre-training***\n\nTo pre-train a CBNetV2 with a Swin-Base backbone on BigDetection using 8 GPUs, run: (`PRETRAIN_MODEL` should be the pre-trained checkpoint of Base-Swin-Transformer: [model](https://github.com/SwinTransformer/storage/releases/download/v1.0.0/swin_base_patch4_window7_224_22k.pth))\n```\ntools/dist_train.sh configs/BigDetection/cbnetv2/htc_cbv2_swin_base_giou_4conv1f_adamw_bigdet.py 8 \\\n    --cfg-options load_from=\u003cPRETRAIN_MODEL\u003e\n```\nTo pre-train a Deformable-DETR with a ResNet-50 backbone on BigDetection, run:\n```\ntools/dist_train.sh configs/BigDetection/deformable_detr/deformable_detr_r50_16x2_8x_bigdet.py 8\n```\n\n***Fine-tuning***\n\nTo fine-tune a BigDetection pre-trained CBNetV2 (with Swin-Base backbone) on COCO, run: (`PRETRAIN_MODEL` should be the BigDetection pre-trained checkpoint of CBNetV2: [model](https://big-detection.s3.us-west-2.amazonaws.com/bigdet_cpts/mmdetection_cpts/htc_cbv2_swin_base_giou_4conv1f_bigdet.pth))\n```\ntools/dist_train.sh configs/BigDetection/cbnetv2/htc_cbv2_swin_base_giou_4conv1f_adamw_20e_coco.py 8 \\\n    --cfg-options load_from=\u003cPRETRAIN_MODEL\u003e\n```\n\n## Inference\nTo evaluate a detector with pre-trained checkpoints, run:\n```\ntools/dist_test.sh \u003cCONFIG_FILE\u003e \u003cCHECKPOINT\u003e \u003cGPU_NUM\u003e --eval 
bbox\n```\n\n***BigDetection evaluation***\n\nTo evaluate pre-trained CBNetV2 on BigDetection validation, run:\n```\ntools/dist_test.sh configs/BigDetection/cbnetv2/htc_cbv2_swin_base_giou_4conv1f_adamw_bigdet.py \\\n    \u003cBIGDET_PRETRAIN_CHECKPOINT\u003e 8 --eval bbox\n```\n\n***COCO evaluation***\n\nTo evaluate COCO-finetuned CBNetV2 on COCO validation, run:\n```\n# without test-time-augmentation\ntools/dist_test.sh configs/BigDetection/cbnetv2/htc_cbv2_swin_base_giou_4conv1f_adamw_20e_coco.py \\\n    \u003cCOCO_FINETUNE_CHECKPOINT\u003e 8 --eval bbox mask\n\n# with test-time-augmentation\ntools/dist_test.sh configs/BigDetection/cbnetv2/htc_cbv2_swin_base_giou_4conv1f_adamw_20e_coco_tta.py \\\n    \u003cCOCO_FINETUNE_CHECKPOINT\u003e 8 --eval bbox mask\n```\n\nOther configurations based on Detectron2 can be found at [detectron2-projects](detectron2-projects/CenterNet2/README.md).\n\n## Citation\n\nIf you use our dataset or pretrained models in your research, please consider citing the following paper.\n```\n@article{bigdetection2022,\n  title={BigDetection: A Large-scale Benchmark for Improved Object Detector Pre-training},\n  author={Likun Cai and Zhi Zhang and Yi Zhu and Li Zhang and Mu Li and Xiangyang Xue},\n  journal={arXiv preprint arXiv:2203.13249},\n  year={2022}\n}\n```\n\n## Security\n\nSee [CONTRIBUTING](CONTRIBUTING.md#security-issue-notifications) for more information.\n\n\n## License\n\nThis project is licensed under the Apache-2.0 License.\n\n\n## Acknowledgement\n\nWe thank the authors for releasing [mmdetection](https://github.com/open-mmlab/mmdetection) and [CBNetV2](https://github.com/VDIGPKU/CBNetV2) to the object detection research 
community.\n\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Famazon-science%2Fbigdetection","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Famazon-science%2Fbigdetection","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Famazon-science%2Fbigdetection/lists"}