{"id":20339275,"url":"https://github.com/encounter1997/de-conddetr","last_synced_at":"2025-04-11T23:13:25.387Z","repository":{"id":41372893,"uuid":"472419810","full_name":"encounter1997/DE-CondDETR","owner":"encounter1997","description":"Official Implementation of DE-CondDETR and DELA-CondDETR in \"Towards Data-Efficient Detection Transformers\"","archived":false,"fork":false,"pushed_at":"2022-08-25T02:52:15.000Z","size":290,"stargazers_count":46,"open_issues_count":0,"forks_count":4,"subscribers_count":2,"default_branch":"main","last_synced_at":"2025-04-11T23:13:19.242Z","etag":null,"topics":["data-efficiency","detection-transformer","object-detection"],"latest_commit_sha":null,"homepage":"","language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"apache-2.0","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/encounter1997.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null}},"created_at":"2022-03-21T16:23:11.000Z","updated_at":"2025-03-18T01:18:45.000Z","dependencies_parsed_at":"2022-08-10T02:06:55.199Z","dependency_job_id":null,"html_url":"https://github.com/encounter1997/DE-CondDETR","commit_stats":null,"previous_names":[],"tags_count":0,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/encounter1997%2FDE-CondDETR","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/encounter1997%2FDE-CondDETR/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/encounter1997%2FDE-CondDETR/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/encounter1997%2FDE-CondDETR/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/encount
er1997","download_url":"https://codeload.github.com/encounter1997/DE-CondDETR/tar.gz/refs/heads/main","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":248492877,"owners_count":21113163,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["data-efficiency","detection-transformer","object-detection"],"created_at":"2024-11-14T21:16:08.853Z","updated_at":"2025-04-11T23:13:25.363Z","avatar_url":"https://github.com/encounter1997.png","language":"Python","funding_links":[],"categories":[],"sub_categories":[],"readme":"# DE-DETRs\n\nBy Wen Wang, Jing Zhang, Yang Cao, Yongliang Shen, and Dacheng Tao\n\nThis repository is an official implementation of DE-CondDETR and DELA-CondDETR in the paper [Towards Data-Efficient Detection Transformers](https://arxiv.org/abs/2203.09507), which is accepted to ECCV 2022.\n\nFor the implementation of DE-DETR and DELA-DETR, please refer to [DE-DETRs](https://github.com/encounter1997/DE-DETRs).\n\n## Introduction\n\n**TL; DR.**  We identify the data-hungry issue of existing detection transformers and alleviate it by simply alternating how key and value sequences are constructed in the cross-attention layer, with minimum modifications to the original models. Besides, we introduce a simple yet effective label augmentation method to provide richer supervision and improve data efficiency.\n\n![DE-DETR](./figs/de-detr.png)\n\n**Abstract.**  Detection Transformers have achieved competitive performance on the sample-rich COCO dataset. 
However, we show most of them suffer from significant performance drops on small-size datasets, like Cityscapes. In other words, the detection transformers are generally data-hungry. To tackle this problem, we empirically analyze the factors that affect data efficiency, through a step-by-step transition from a data-efficient RCNN variant to the representative DETR. The empirical results suggest that sparse feature sampling from local image areas holds the key. Based on this observation, we alleviate the data-hungry issue of existing detection transformers by simply alternating how key and value sequences are constructed in the cross-attention layer, with minimum modifications to the original models. Besides, we introduce a simple yet effective label augmentation method to provide richer supervision and improve data efficiency. Experiments show that our method can be readily applied to different detection transformers and improve their performance on both small-size and sample-rich datasets.\n\n![Label Augmentation](./figs/label_aug.png)\n\n## Main Results\n\nThe experimental results and model weights trained on Cityscapes are shown below.\n\n|       Model       |  mAP  | mAP@50 | mAP@75 | mAP@S | mAP@M | mAP@L | Log \u0026 Model |\n| :----------------: | :------: | :------: | :------: | :------: | :------: | :------: | :------: |\n| CondDETR | 12.5 | 29.6 | 9.1 | 2.2 | 10.5 | 27.5 | [Google Drive](https://drive.google.com/drive/folders/1J7gFZJlAuf6jaKYrrzZjh2qnU0FIFR1z?usp=sharing) |\n| DE-CondDETR | 27.2 | 48.4 | 25.8 | 6.9 | 26.1 | 46.9 | [Google Drive](https://drive.google.com/drive/folders/18yEPHxHeNApcDsK4xOz015ff5fhSjILz?usp=sharing) |\n| DELA-CondDETR | 29.8 | 52.8 | 28.7 | 7.7 | 27.9 | 50.2 | [Google Drive](https://drive.google.com/drive/folders/1LvhEo-mnxPKQylGWqtGhFmWQjzwZNKyF?usp=sharing) |\n\nThe experimental results and model weights trained on COCO 2017 are shown below.\n\n|       Model       |  mAP  | mAP@50 | mAP@75 | mAP@S | mAP@M | mAP@L | Log 
\u0026 Model |\n| :----------------: | :------: | :------: | :------: | :------: | :------: | :------: | :------: |\n| CondDETR | 40.2 | 61.1 | 42.6 | 19.9 | 43.6 | 58.7 | [Google Drive](https://drive.google.com/drive/folders/158hBEQ2sa2ow_vpDpPNu9ZEalja1Fhns?usp=sharing) |\n| DE-CondDETR | 41.7 | 62.4 | 44.9 | 24.4 | 44.5 | 56.3 | [Google Drive](https://drive.google.com/drive/folders/1qy1H6ZOg7uKNZvk9NDaZqKsorR28kzOo?usp=sharing) |\n| DELA-CondDETR | 43.0 | 64.0 | 46.4 | 26.0 | 45.5 | 57.7 | [Google Drive](https://drive.google.com/drive/folders/13_P1blcs-HG8YbxA4BHtU8GObXIb9BuZ?usp=sharing) |\n\n*Note:*\n\n1. All models are trained for 50 epochs.\n2. The performance of the model weights on Cityscapes is slightly different from that reported in the paper, because the results in the paper are the average of five repeated runs with different random seeds.\n\n## Installation\n\n### Requirements\n\n* Linux, CUDA\u003e=9.2, GCC\u003e=5.4\n  \n* Python\u003e=3.7\n  \n* PyTorch\u003e=1.7.0, torchvision\u003e=0.6.0 (following instructions [here](https://pytorch.org/))\n\n* Detectron2\u003e=0.5 for RoIAlign (following instructions [here](https://detectron2.readthedocs.io/en/latest/tutorials/install.html))\n\n* Other requirements\n    ```bash\n    pip install -r requirements.txt\n    ```\n\n## Usage\n\n### Dataset preparation\n\nThe COCO 2017 dataset can be downloaded from [here](https://cocodataset.org) and the Cityscapes datasets can be downloaded from [here](https://www.cityscapes-dataset.com/login/). The annotations in COCO format can be obtained from [here](https://drive.google.com/drive/folders/1mRrJT-CjVwNbQ6iRt4VdZguXrH9iJx9i?usp=sharing). 
Afterward, please organize the datasets and annotations as follows:\n\n```\ndata\n└─ cityscapes\n   └─ leftImg8bit\n      |─ train\n      └─ val\n└─ coco\n   |─ annotations\n   |─ train2017\n   └─ val2017\n└─ CocoFormatAnnos\n   |─ cityscapes_train_cocostyle.json\n   |─ cityscapes_val_cocostyle.json\n   |─ instances_train2017_sample11828.json\n   |─ instances_train2017_sample5914.json\n   |─ instances_train2017_sample2365.json\n   └─ instances_train2017_sample1182.json\n```\n\nThe annotations for the down-sampled COCO 2017 dataset are generated using ```utils/downsample_coco.py```.\n\n### Training\n\n#### Training DELA-CondDETR on Cityscapes\n\n```bash\npython -m torch.distributed.launch --nproc_per_node=2 --master_port=29501 --use_env main.py --dataset_file cityscapes --coco_path data/cityscapes --batch_size 4 --model dela-cond-detr --repeat_label 2 --nms --wandb\n```\n\n#### Training DELA-CondDETR on down-sampled COCO 2017, with e.g. sample_rate=0.01\n\n```bash\npython -m torch.distributed.launch --nproc_per_node=2 --master_port=29501 --use_env main.py --dataset_file cocodown --coco_path data/coco --sample_rate 0.01 --batch_size 4 --model dela-cond-detr --repeat_label 2 --nms --wandb\n```\n\n#### Training DELA-CondDETR on COCO 2017\n\n```bash\npython -m torch.distributed.launch --nproc_per_node=8 --master_port=29501 --use_env main.py --dataset_file coco --coco_path data/coco --batch_size 4 --model dela-cond-detr --repeat_label 2 --nms --wandb\n```\n\n#### Training DE-CondDETR on Cityscapes\n\n```bash\npython -m torch.distributed.launch --nproc_per_node=2 --master_port=29501 --use_env main.py --dataset_file cityscapes --coco_path data/cityscapes --batch_size 4 --model de-cond-detr --wandb\n```\n\n#### Training CondDETR baseline\nPlease refer to the [cond_detr](https://github.com/encounter1997/DE-CondDETR/tree/cond_detr) branch.\n\n### Evaluation\n\nYou can download a pretrained model (links are in the \"Main Results\" section), then run the following command to evaluate it on 
the validation set:\n\n```bash\n\u003ctraining command\u003e --resume \u003cpath to pre-trained model\u003e --eval\n```\n\n## Acknowledgement\n\nThis project is based on [DETR](https://github.com/facebookresearch/detr), [Conditional DETR](https://github.com/Atten4Vis/ConditionalDETR), and [Deformable DETR](https://github.com/fundamentalvision/Deformable-DETR). Thanks to their authors for the wonderful work. See [LICENSE](./LICENSE) for more details.\n\n\n## Citing DE-DETRs\nIf you find DE-DETRs useful in your research, please consider citing:\n```bibtex\n@inproceedings{wang2022towards,\n  title     =  {Towards Data-Efficient Detection Transformers},\n  author    =  {Wen Wang and Jing Zhang and Yang Cao and Yongliang Shen and Dacheng Tao},\n  booktitle =  {Proc. Eur. Conf. Computer Vision (ECCV)},\n  year      =  {2022}\n}\n```\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fencounter1997%2Fde-conddetr","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fencounter1997%2Fde-conddetr","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fencounter1997%2Fde-conddetr/lists"}