{"id":17303994,"url":"https://github.com/amirbar/detreg","last_synced_at":"2025-04-06T19:12:48.275Z","repository":{"id":39514371,"uuid":"375300783","full_name":"amirbar/DETReg","owner":"amirbar","description":"Official implementation of the CVPR 2022 paper \"DETReg: Unsupervised Pretraining with Region Priors for Object Detection\".","archived":false,"fork":false,"pushed_at":"2023-07-18T02:52:35.000Z","size":803,"stargazers_count":334,"open_issues_count":9,"forks_count":45,"subscribers_count":13,"default_branch":"main","last_synced_at":"2025-03-30T17:11:12.745Z","etag":null,"topics":["deep-learning","object-detection","pytorch","unsupervised-learning"],"latest_commit_sha":null,"homepage":"https://amirbar.net/detreg","language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"apache-2.0","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/amirbar.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2021-06-09T09:34:55.000Z","updated_at":"2025-01-09T23:56:02.000Z","dependencies_parsed_at":"2024-10-31T02:12:01.752Z","dependency_job_id":null,"html_url":"https://github.com/amirbar/DETReg","commit_stats":null,"previous_names":[],"tags_count":1,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/amirbar%2FDETReg","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/amirbar%2FDETReg/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/amirbar%2FDETReg/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/amirbar%2FDETReg/manifests","owner_url":"http
s://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/amirbar","download_url":"https://codeload.github.com/amirbar/DETReg/tar.gz/refs/heads/main","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":247535519,"owners_count":20954576,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["deep-learning","object-detection","pytorch","unsupervised-learning"],"created_at":"2024-10-15T11:51:53.536Z","updated_at":"2025-04-06T19:12:48.249Z","avatar_url":"https://github.com/amirbar.png","language":"Python","funding_links":[],"categories":[],"sub_categories":[],"readme":"# DETReg: Unsupervised Pretraining with Region Priors for Object Detection (CVPR 2022)\n### [Amir Bar](https://amirbar.net), [Xin Wang](https://xinw.ai/), [Vadim Kantorov](http://vadimkantorov.com/), [Colorado J Reed](https://people.eecs.berkeley.edu/~cjrd/), [Roei Herzig](https://roeiherz.github.io/), [Gal Chechik](https://chechiklab.biu.ac.il/), [Anna Rohrbach](https://anna-rohrbach.net/), [Trevor Darrell](https://people.eecs.berkeley.edu/~trevor/), [Amir Globerson](http://www.cs.tau.ac.il/~gamir/)\n![DETReg](./figs/illustration.png)\n  \n\nThis repository is the implementation of DETReg, see [Project Page](https://amirbar.net/detreg).\n\n## Introduction\n\nRecent self-supervised pretraining methods for object detection largely focus on pretraining the backbone of the object detector, neglecting key parts of detection architecture. 
Instead, we introduce DETReg, a new self-supervised method that pretrains the entire object detection network, including the object localization and embedding components. During pretraining, DETReg predicts object localizations to match the localizations from an unsupervised region proposal generator and simultaneously aligns the corresponding feature embeddings with embeddings from a self-supervised image encoder. We implement DETReg using the DETR family of detectors and show that it improves over competitive baselines when finetuned on COCO, PASCAL VOC, and Airbus Ship benchmarks. In low-data regimes, including semi-supervised and few-shot learning settings, DETReg establishes many state-of-the-art results, e.g., on COCO we see a +6.0 AP improvement for 10-shot detection and +3.5 AP improvement when training with only 1% of the labels.\n\n## Demo\n\nInteract with the DETReg pretrained model in a [Google Colab](https://colab.research.google.com/drive/1ByFXJClyzNVelS7YdT53_bMbwYeMoeNb?usp=sharing)! \n\n## Installation\n\n### Requirements\n\n* Linux, CUDA\u003e=9.2, GCC\u003e=5.4\n  \n* Python\u003e=3.7\n\n    We recommend using Anaconda to create a conda environment:\n    ```bash\n    conda create -n detreg python=3.7 pip\n    ```\n    Then, activate the environment:\n    ```bash\n    conda activate detreg\n    ```\n    Installation (change cudatoolkit to match your CUDA version. 
For detailed pytorch installation instructions click [here](https://pytorch.org/))\n    ```bash\n    conda install pytorch==1.8.0 torchvision==0.9.0 torchaudio==0.8.0 cudatoolkit=10.2 -c pytorch\n    ```\n  \n* Other requirements\n    ```bash\n    pip install -r requirements.txt\n    ```\n\n### Compiling CUDA operators\n```bash\ncd ./models/ops\nsh ./make.sh\n# unit test (should see all checking is True)\npython test.py\n```\n\n## Usage\n\n### Dataset preparation\n\n#### ImageNet/ImageNet100\nDownload [ImageNet](https://image-net.org/challenges/LSVRC/2012/) and organize it in the following structure:\n\n```\ncode_root/\n└── data/\n    └── ilsvrc/\n          ├── train/\n          └── val/\n```\nNote that in this work we also used the ImageNet100 dataset, which is 10x smaller than ImageNet. To create ImageNet100, run the following commands:\n```bash\nmkdir -p data/ilsvrc100/train\nmkdir -p data/ilsvrc100/val\ncode_root=/path/to/code_root\nwhile read line; do ln -s \"${code_root}/data/ilsvrc/train/$line\" \"${code_root}/data/ilsvrc100/train/$line\"; done \u003c \"${code_root}/datasets/category.txt\"\nwhile read line; do ln -s \"${code_root}/data/ilsvrc/val/$line\" \"${code_root}/data/ilsvrc100/val/$line\"; done \u003c \"${code_root}/datasets/category.txt\"\n```\n\nThis should result in the following structure:\n```\ncode_root/\n└── data/\n    ├── ilsvrc/\n          ├── train/\n          └── val/\n    └── ilsvrc100/\n          ├── train/\n          └── val/\n```\n\n#### MSCoco\nPlease download [COCO 2017 dataset](https://cocodataset.org/) and organize it in the following structure:\n\n```\ncode_root/\n└── data/\n    └── MSCoco/\n        ├── train2017/\n        ├── val2017/\n        └── annotations/\n        \t├── instances_train2017.json\n        \t└── instances_val2017.json\n```\n#### Pascal VOC\nDownload [Pascal VOC](http://host.robots.ox.ac.uk/pascal/VOC/) dataset (2012trainval, 2007trainval, and 2007test):\n```bash\nmkdir -p data/pascal\ncd data/pascal\nwget 
http://host.robots.ox.ac.uk:8080/pascal/VOC/voc2012/VOCtrainval_11-May-2012.tar\nwget http://host.robots.ox.ac.uk/pascal/VOC/voc2007/VOCtrainval_06-Nov-2007.tar\nwget http://host.robots.ox.ac.uk/pascal/VOC/voc2007/VOCtest_06-Nov-2007.tar\ntar -xvf VOCtrainval_11-May-2012.tar\ntar -xvf VOCtrainval_06-Nov-2007.tar\ntar -xvf VOCtest_06-Nov-2007.tar\n```\nThe files should be organized in the following structure:\n```\ncode_root/\n└── data/\n    └── pascal/\n        └── VOCdevkit/\n        \t├── VOC2007\n        \t└── VOC2012\n```\n\n### Pretraining on ImageNet\n\nThe command for pretraining DETReg, based on Deformable-DETR, on 8 GPUs on ImageNet is as follows:\n```bash\nGPUS_PER_NODE=8 ./tools/run_dist_launch.sh 8 ./configs/DETReg_top30_in.sh --batch_size 24 --num_workers 8\n```\nUsing the underlying DETR architecture:\n```bash\nGPUS_PER_NODE=8 ./tools/run_dist_launch.sh 8 ./configs/DETReg_top30_in_detr.sh --batch_size 24 --num_workers 8\n```\n\nThe command for pretraining DETReg on 8 GPUs on ImageNet100 is as follows:\n```bash\nGPUS_PER_NODE=8 ./tools/run_dist_launch.sh 8 ./configs/DETReg_top30_in100.sh --batch_size 24 --num_workers 8\n```\nTraining takes around 1.5 days with 8 NVIDIA V100 GPUs; you can download a pretrained model (see below) if you want to skip this step.\n\nAfter pretraining, a checkpoint is saved in ```exps/DETReg_top30_in/checkpoint.pth```. 
To fine-tune it on different COCO settings, use the commands in the finetuning sections below.\n\n### Pretraining on MSCoco\nThe command for pretraining DETReg on 8 GPUs on MSCoco is as follows:\n```bash\nGPUS_PER_NODE=8 ./tools/run_dist_launch.sh 8 ./configs/DETReg_top30_coco.sh --batch_size 24 --num_workers 8\n```\n\n\n### Finetuning on MSCoco from ImageNet pretraining\n\nFine tuning on full COCO (should take 2 days with 8 NVIDIA V100 GPUs):\n```bash\nGPUS_PER_NODE=8 ./tools/run_dist_launch.sh 8 ./configs/DETReg_fine_tune_full_coco.sh\n```\n\nThis assumes a checkpoint exists in `exps/DETReg_top30_in/checkpoint.pth`.\n\n### Finetuning on MSCoco low-data regime, from full MSCoco pretraining (Semi-Supervised Learning setting)\n\nFine tuning on 1%\n```bash\nGPUS_PER_NODE=8 ./tools/run_dist_launch.sh 8 ./configs/DETReg_fine_tune_1pct_coco.sh --batch_size 3\n```\nFine tuning on 2%\n```bash\nGPUS_PER_NODE=8 ./tools/run_dist_launch.sh 8 ./configs/DETReg_fine_tune_2pct_coco.sh --batch_size 3\n```\nFine tuning on 5%\n```bash\nGPUS_PER_NODE=8 ./tools/run_dist_launch.sh 8 ./configs/DETReg_fine_tune_5pct_coco.sh --batch_size 3\n```\nFine tuning on 10%\n```bash\nGPUS_PER_NODE=8 ./tools/run_dist_launch.sh 8 ./configs/DETReg_fine_tune_10pct_coco.sh --batch_size 3\n```\n\n### Finetuning on Pascal VOC\nFine tune on full Pascal:\n```bash\nGPUS_PER_NODE=8 ./tools/run_dist_launch.sh 8 ./configs/DETReg_fine_tune_full_pascal.sh --batch_size 4 --epochs 100 --lr_drop 70\n```\nFine tune on 10% of Pascal:\n```bash\nGPUS_PER_NODE=2 ./tools/run_dist_launch.sh 2 ./configs/DETReg_fine_tune_10pct_pascal.sh --batch_size 4 --epochs 200 --lr_drop 150\n```\n\n### Few-Shot object detection\n\nFor few-shot, please follow [this](https://github.com/ucbdrive/few-shot-object-detection) code base for the dataloaders, classes, datasplits, etc.\nWe used the few-shot dataset generated with seed = 0.  \n\n#### Using base classes (Table 3)\nFinetune DETReg on base classes (60 classes, 99k labeled images). 
Similar hyperparams as in MSCoco finetuning.\nThen fine-tune it on few-shot labeled images (80 classes, every class has 10 or 30 instances) for 150 epochs, lr drop after 140 epochs.\n\n#### No base classes (Table 3)\nFine-tune DETReg on few-shot labeled images (80 classes, every class has 10 or 30 instances) for 1000 epochs, lr drop after 990 epochs.\n\n\n### Evaluation\n\nTo evaluate a finetuned model, use the following command from the project basedir:\n\n```bash\n./configs/\u003cconfig file\u003e.sh --resume exps/\u003cconfig file\u003e/checkpoint.pth --eval\n```\n\n### Pretrained Models Zoo\n\n| Model  | Type        | Architecture    | Dataset  | Epochs | Checkpoint                                                                                     |\n|--------|-------------|-----------------|----------|--------|------------------------------------------------------------------------------------------------|\n| DETReg | Pretraining | Deformable DETR | ImageNet | 5      | [link](https://github.com/amirbar/DETReg/releases/download/1.0.0/checkpoint_imagenet.pth)      |\n| DETReg | Pretraining | DETR            | ImageNet | 60     | [link](https://github.com/amirbar/DETReg/releases/download/1.0.0/checkpoint_imagenet_detr.pth) |\n| DETReg | Pretraining | Deformable DETR | MSCoco   | 50     | [link](https://github.com/amirbar/DETReg/releases/download/1.0.0/checkpoint_coco.pth)          |\n| DETReg | Finetuned   | Deformable DETR | MSCoco   | 50     | [link](https://github.com/amirbar/DETReg/releases/download/1.0.0/full_coco_finetune.pth)       |\n| DETReg | 10 Shot (w/ baseclass)   | Deformable DETR | MSCoco   | 150     | [link](https://github.com/amirbar/DETReg/releases/download/1.0.0/detreg_10fs_baseclass.pth)       |\n| DETReg | 30 Shot (w/ baseclass)   | Deformable DETR | MSCoco   | 150     | [link](https://github.com/amirbar/DETReg/releases/download/1.0.0/detreg_30fs_baseclass.pth)       |\n| DETReg | 10 Shot (no baseclass)   | Deformable DETR | MSCoco   | 1000  
   | [link](https://github.com/amirbar/DETReg/releases/download/1.0.0/detreg_10fs_scratch.pth)       |\n| DETReg | 30 Shot (no baseclass)   | Deformable DETR | MSCoco   | 1000     | [link](https://github.com/amirbar/DETReg/releases/download/1.0.0/detreg_30fs_scratch.pth)       |\n\n\n## Citation\nIf you found this code helpful, feel free to cite our work: \n\n```bibtex\n@misc{bar2021detreg,\n      title={DETReg: Unsupervised Pretraining with Region Priors for Object Detection},\n      author={Amir Bar and Xin Wang and Vadim Kantorov and Colorado J Reed and Roei Herzig and Gal Chechik and Anna Rohrbach and Trevor Darrell and Amir Globerson},\n      year={2021},\n      eprint={2106.04550},\n      archivePrefix={arXiv},\n      primaryClass={cs.CV}\n}\n```\n\n## Related Works\nIf you found DETReg useful, consider checking out these related works as well: [ReSim](https://github.com/Tete-Xiao/ReSim), [SwAV](https://github.com/facebookresearch/swav), [DETR](https://github.com/facebookresearch/detr), [UP-DETR](https://github.com/dddzg/up-detr), and [Deformable DETR](https://github.com/fundamentalvision/Deformable-DETR).\n\n## Change Log\n* 07/17/23 - Release DETReg few-shot learning checkpoints pretrained finetuned on baseclass and from scratch (Table 3,4).\n* 07/17/23 - Update few-shot learning results. New paper version.\n* 04/28/22 - Bug fix in multiprocessing, affects Table 5 results. Up-to-date results [here](docs/full-semi-sup.png), new paper version will be uploaded tonight. \n* 12/13/21 - Add DETR architecture\n* 12/12/21 - Update experiments hyperparams in accordance with new paper version\n* 12/12/21 - Avoid box caching on TopK policy (bug fix)\n* 9/19/21 - Fixed Pascal VOC training with %X of training data\n\n\n## Acknowledgments\nDETReg builds on the code bases of previous works such as [Deformable DETR](https://github.com/fundamentalvision/Deformable-DETR) and [UP-DETR](https://github.com/dddzg/up-detr). 
If you found DETReg useful please consider citing these works as well.\n\n## License\nDETReg is released under the Apache 2.0 license. Please see the [LICENSE](https://github.com/amirbar/DETReg/blob/main/LICENSE) file for more information.\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Famirbar%2Fdetreg","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Famirbar%2Fdetreg","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Famirbar%2Fdetreg/lists"}