{"id":13466911,"url":"https://github.com/amazon-science/omni-detr","last_synced_at":"2025-10-09T23:06:49.465Z","repository":{"id":37406901,"uuid":"473821808","full_name":"amazon-science/omni-detr","owner":"amazon-science","description":"PyTorch implementation of Omni-DETR for omni-supervised object detection: https://arxiv.org/abs/2203.16089","archived":false,"fork":false,"pushed_at":"2022-09-26T05:14:48.000Z","size":215,"stargazers_count":69,"open_issues_count":11,"forks_count":7,"subscribers_count":7,"default_branch":"main","last_synced_at":"2025-10-05T11:24:07.259Z","etag":null,"topics":["object-detection","omni-supervised-learning","semi-supervised-learning","weakly-supervised-learning"],"latest_commit_sha":null,"homepage":"","language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"other","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/amazon-science.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":"CONTRIBUTING.md","funding":null,"license":"LICENSE","code_of_conduct":"CODE_OF_CONDUCT.md","threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null}},"created_at":"2022-03-25T00:55:20.000Z","updated_at":"2025-09-14T06:25:03.000Z","dependencies_parsed_at":"2022-07-12T16:17:50.560Z","dependency_job_id":null,"html_url":"https://github.com/amazon-science/omni-detr","commit_stats":null,"previous_names":["amazon-research/omni-detr"],"tags_count":0,"template":false,"template_full_name":"amazon-archives/__template_Apache-2.0","purl":"pkg:github/amazon-science/omni-detr","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/amazon-science%2Fomni-detr","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/amazon-science%2Fomni-detr/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/amazon-science%2Fomni-detr/releases","manifests_url":"https
://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/amazon-science%2Fomni-detr/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/amazon-science","download_url":"https://codeload.github.com/amazon-science/omni-detr/tar.gz/refs/heads/main","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/amazon-science%2Fomni-detr/sbom","scorecard":null,"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":279002341,"owners_count":26083340,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","status":"online","status_checked_at":"2025-10-09T02:00:07.460Z","response_time":59,"last_error":null,"robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":true,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["object-detection","omni-supervised-learning","semi-supervised-learning","weakly-supervised-learning"],"created_at":"2024-07-31T15:00:51.294Z","updated_at":"2025-10-09T23:06:49.439Z","avatar_url":"https://github.com/amazon-science.png","language":"Python","funding_links":[],"categories":["Literature List"],"sub_categories":[],"readme":"# Omni-DETR: Omni-Supervised Object Detection with Transformers\n\nThis is the PyTorch implementation of the [Omni-DETR](https://assets.amazon.science/91/3c/ac87e7dd44789a62e03b2230e0ed/omni-detr-omni-supervised-object-detection-with-transformers.pdf) paper. 
It is a unified framework that uses different types of weak annotations for object detection.\n\nIf you use the code/model/results of this repository, please cite:\n```\n@inproceedings{wang2022omni,\n  author  = {Pei Wang and Zhaowei Cai and Hao Yang and Gurumurthy Swaminathan and Nuno Vasconcelos and Bernt Schiele and Stefano Soatto},\n  title   = {Omni-DETR: Omni-Supervised Object Detection with Transformers},\n  booktitle = {CVPR},\n  Year  = {2022}\n}\n```\n\n## Installation\n\nFirst, install PyTorch and torchvision. We have tested with PyTorch 1.8.1; other versions no earlier than 1.5.1 should also work.\n\nOur implementation is partially based on [Deformable DETR](https://github.com/fundamentalvision/Deformable-DETR/). Please follow its [instructions](https://github.com/fundamentalvision/Deformable-DETR/blob/main/README.md) for other requirements.\n\n## Usage\n\n### Dataset organization\n\nPlease organize each dataset as follows:\n\n```\ncode_root/\n└── coco/\n  ├── train2017/\n  ├── val2017/\n  ├── train2014/\n  ├── val2014/\n  └── annotations/\n    ├── instances_train2017.json\n    ├── instances_val2017.json\n    ├── instances_valminusminival2014.json\n    └── instances_train2014.json\n└── voc/\n  └── VOCdevkit/\n    └── VOC2007trainval/\n      ├── Annotations/\n      ├── JPEGImages/\n    └── VOC2012trainval/\n      ├── Annotations/\n      ├── JPEGImages/\n    └── VOC2007test/\n      ├── Annotations/\n      ├── JPEGImages/\n    └── VOC20072012trainval/\n      ├── Annotations/\n      ├── JPEGImages/\n └── objects365/\n     ├── train_objects365/\n        ├── objects365_v1_00000000.jpg\n        ├── ...\n     ├── val_objects365/\n        ├── objects365_v1_00000016.jpg\n        ├── ...\n     └── annotations/\n        ├── objects365_train.json\n        └── objects365_val.json\n └── bees/\n     └── ML-Data/\n └── crowdhuman/\n    ├── Images/\n      |── 273271,1a0d6000b9e1f5b7.jpg\n      |── ...\n    ├── annotation_train.odgt\n    └── 
annotation_val.odgt\n      \n```\n\n### Dataset preparation\nFirst go to the ``scripts`` folder\n\n```\ncd scripts\n```\n\n#### COCO\nTo get the split labeled and omni-labeled datasets\n```\npython split_dataset_coco_omni.py\n```\nAdd the indicator to the COCO val set\n```\npython add_indicator_to_coco2017_val.py\n```\nFor the experiments compared with UFO, we prepare the COCO 2014 set\n```\npython add_indicator_to_coco2014.py\n```\n#### VOC\nFirst convert the annotations to COCO format by\n```\npython VOC2COCO.py --xml_dir ../voc/VOCdevkit/VOC2007trainval/Annotations --json_file ../voc/VOCdevkit/VOC2007trainval/instances_VOC_trainval2007.json\npython VOC2COCO.py --xml_dir ../voc/VOCdevkit/VOC2007test/Annotations --json_file ../voc/VOCdevkit/VOC2007test/instances_VOC_test2007.json\npython VOC2COCO.py --xml_dir ../voc/VOCdevkit/VOC2012trainval/Annotations --json_file ../voc/VOCdevkit/VOC2012trainval/instances_VOC_trainval2012.json\n```\nCombine the annotations of VOC07 and VOC12 by\n```\npython combine_voc_trainval20072012.py\n```\nAdd the indicator to VOC07 and VOC12\n```\npython prepare_voc_dataset.py\n```\nTo get the split labeled and omni-labeled datasets\n```\npython split_dataset_voc_omni.py\n```\n\n\n#### Objects365\nFirst sample a subset from the full original training set\n```\npython prepare_objects365_for_omni.py\n```\nAdd the indicator to the val set\n```\npython add_indicator_to_objects365val.py\n```\nTo get the split labeled and omni-labeled datasets\n```\npython split_dataset_objects365_omni.py\n```\n\n#### Bees\nBecause the official training set has some broken images (with names from ``Erlen_Erlen_Hive_04_1264.jpg`` to ``Erlen_Erlen_Hive_04_1842.jpg``), we first need to \nmanually delete them or run\n```\nxargs rm -r < file_list_to_remove.txt\n```\nAfter this step, 3596 samples are kept. 
Next, convert the annotations to COCO format by\n```\npython Bees2COCO.py\n```\nTo split the data into training and validation sets at a ratio of 8:2\n```\npython split_bees_train_val.py\n```\nTo get the split labeled and omni-labeled datasets\n```\npython split_dataset_bees_omni.py\n```\n\n#### CrowdHuman\nPlease follow this [repo](https://github.com/xingyizhou/CenterTrack/blob/master/readme/DATA.md) to first convert the annotations from odgt format to COCO format, or run\n```\npython convert_crowdhuman_to_coco.py\n```\nBecause we only focus on full-body detection on CrowdHuman, we first extract those annotations by\n```\npython build_crowdhuman_dataset.py\n```\nTo get the split labeled and omni-labeled datasets\n```\npython split_dataset_crowdhuman_omni.py\n```\n\n### Training Omni-DETR\nAfter preparing the datasets, please change the arguments in the config files, such as ``annotation_json_label`` and ``annotation_json_unlabel``, to match the names of the json files generated above. The ``BURN_IN_STEP`` argument sometimes also needs to be changed (please refer to our supplementary materials). 
In our experiments, this hyperparameter does not have a large impact on the results.\n\nBecause semi-supervised learning is just a special case of omni-supervised learning, to generate semi-supervised results, please set the ratios of ``fully_labeled`` and ``Unsup`` as needed and set all the others to 0 when splitting the dataset.\n\nTraining Omni-DETR on each dataset (from the repo main folder)\n\n#### Training from scratch\n\n```\nGPUS_PER_NODE=8 ./tools/run_dist_launch.sh 8 ./configs/r50_ut_detr_omni_coco.sh\nGPUS_PER_NODE=8 ./tools/run_dist_launch.sh 8 ./configs/r50_ut_detr_omni_voc.sh\nGPUS_PER_NODE=8 ./tools/run_dist_launch.sh 8 ./configs/r50_ut_detr_omni_objects.sh\nGPUS_PER_NODE=8 ./tools/run_dist_launch.sh 8 ./configs/r50_ut_detr_omni_bees.sh\nGPUS_PER_NODE=8 ./tools/run_dist_launch.sh 8 ./configs/r50_ut_detr_omni_crowdhuman.sh\n```\n\n#### Training from Deformable DETR\nBecause our burn-in stage is exactly the same as Deformable DETR training, you can start from a Deformable DETR checkpoint to skip the burn-in stage. Just modify the ``resume`` argument in the config files above.\n\n\nBefore running the above scripts, you may have to run the commands below to change access permissions,\n```\nchmod u+x ./tools/run_dist_launch.sh\nchmod u+x ./configs/r50_ut_detr_omni_coco.sh\n```\n\n### Training under the setting of COCO35to80\n```\nGPUS_PER_NODE=8 ./tools/run_dist_launch.sh 8 ./configs/r50_ut_detr_tagsU_ufo.sh\nGPUS_PER_NODE=8 ./tools/run_dist_launch.sh 8 ./configs/r50_ut_detr_point_ufo.sh\n```\n\n### Training under the setting of VOC07to12\n```\nGPUS_PER_NODE=8 ./tools/run_dist_launch.sh 8 ./configs/r50_ut_detr_voc07to12_semi.sh\n```\n\n### Note\n1. Some of our experiments use 800-pixel images on 8 GPUs with 32 GB of memory. If that much memory is not available, please change the ``pixels`` argument to 600; training then fits on 8 GPUs with 16 GB of memory.\n2. 
This code could have some minor accuracy differences from our paper due to some implementation changes after the paper submission.\n\n## License\n\nThis project is under the Apache-2.0 license. See [LICENSE](LICENSE) for details.\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Famazon-science%2Fomni-detr","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Famazon-science%2Fomni-detr","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Famazon-science%2Fomni-detr/lists"}