{"id":13711787,"url":"https://github.com/VisionXLab/sam-mmrotate","last_synced_at":"2025-05-06T21:32:12.883Z","repository":{"id":152196315,"uuid":"625075880","full_name":"VisionXLab/sam-mmrotate","owner":"VisionXLab","description":"SAM (Segment Anything Model) for generating rotated bounding boxes with MMRotate, which is a comparison method of H2RBox-v2.","archived":false,"fork":false,"pushed_at":"2023-07-31T13:46:17.000Z","size":55,"stargazers_count":185,"open_issues_count":0,"forks_count":14,"subscribers_count":7,"default_branch":"master","last_synced_at":"2025-04-02T03:11:49.290Z","etag":null,"topics":[],"latest_commit_sha":null,"homepage":"","language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":null,"status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/VisionXLab.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":null,"code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2023-04-08T02:12:12.000Z","updated_at":"2025-03-12T03:03:50.000Z","dependencies_parsed_at":null,"dependency_job_id":"0871a470-24ab-464e-ac1e-99277f303d03","html_url":"https://github.com/VisionXLab/sam-mmrotate","commit_stats":null,"previous_names":["visionxlab/sam-mmrotate"],"tags_count":0,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/VisionXLab%2Fsam-mmrotate","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/VisionXLab%2Fsam-mmrotate/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/VisionXLab%2Fsam-mmrotate/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/VisionXLab%2Fsam-mmrotate/manifests","owner_url":"http
# SAM-RBox

This is an implementation of [SAM (Segment Anything Model)](https://github.com/facebookresearch/segment-anything) for generating rotated bounding boxes with [MMRotate](https://github.com/open-mmlab/mmrotate), used as a comparison method for [H2RBox-v2: Boosting HBox-supervised Oriented Object Detection via Symmetric Learning](https://arxiv.org/abs/2304.04403).

**NOTE:** This project has been merged into OpenMMLab's new repo [**_PlayGround_**](https://github.com/open-mmlab/playground). For more details, please refer to [its README](https://github.com/open-mmlab/playground/blob/main/mmrotate_sam/README.md).

<div align=center>
<img src="https://user-images.githubusercontent.com/79644233/231636420-8b7f81f3-51d2-439c-87cc-6f7eebd32193.png"/>
</div>

Recently, [SAM](https://arxiv.org/abs/2304.02643) has demonstrated strong zero-shot capabilities by training on the largest segmentation dataset to date.
We therefore use a trained horizontal FCOS detector to provide HBoxes to SAM as prompts, so that the corresponding masks can be generated zero-shot, and the rotated boxes (RBoxes) are then obtained by computing the minimum circumscribed rectangle of each predicted mask. Thanks to this powerful zero-shot capability, SAM-RBox based on ViT-B achieves 63.94%. However, it is also limited by the time-consuming post-processing, running at only 1.7 FPS during inference.

![image](https://user-images.githubusercontent.com/79644233/230732578-649086b4-7720-4450-9e87-25873bec07cb.png)
![image](https://user-images.githubusercontent.com/29257168/230749605-f6584336-a69b-47e8-95ab-87669ca9baf0.png)

## Prepare Env

The code is based on MMRotate 1.x and the official API of SAM.

Here are the installation commands for the recommended environment:
```bash
pip install torch==1.10.1+cu111 torchvision==0.11.2+cu111 -f https://download.pytorch.org/whl/cu111/torch_stable.html

pip install openmim
mim install mmengine 'mmcv>=2.0.0rc0' 'mmrotate>=1.0.0rc0'

pip install git+https://github.com/facebookresearch/segment-anything.git
pip install opencv-python pycocotools matplotlib onnxruntime onnx
```

## Note
1. Prepare the DOTA dataset according to the MMRotate docs.
2. Download the detector weights from the MMRotate model zoo.
3. `python main_sam_dota.py` prompts SAM with HBoxes obtained from an annotation file (such as DOTA trainval).
4. `python main_rdet-sam_dota.py` prompts SAM with HBoxes predicted by a well-trained detector for non-annotated data (such as DOTA test).
5. Many configs, including the pipeline (i.e. transforms), dataset, dataloader, evaluator, and visualizer, are set in `data.py`.
6. You can change the detector config and the corresponding weight path in `main_rdet-sam_dota.py` to any detector that can be built with MMRotate.
## Citation
```
@article{yu2023h2rboxv2,
  title={H2RBox-v2: Boosting HBox-supervised Oriented Object Detection via Symmetric Learning},
  author={Yu, Yi and Yang, Xue and Li, Qingyun and Zhou, Yue and Zhang, Gefan and Yan, Junchi and Da, Feipeng},
  journal={arXiv preprint arXiv:2304.04403},
  year={2023}
}

@inproceedings{yang2023h2rbox,
  title={H2RBox: Horizontal Box Annotation is All You Need for Oriented Object Detection},
  author={Yang, Xue and Zhang, Gefan and Li, Wentong and Wang, Xuehui and Zhou, Yue and Yan, Junchi},
  booktitle={International Conference on Learning Representations},
  year={2023}
}

@article{kirillov2023segany,
  title={Segment Anything},
  author={Kirillov, Alexander and Mintun, Eric and Ravi, Nikhila and Mao, Hanzi and Rolland, Chloe and Gustafson, Laura and Xiao, Tete and Whitehead, Spencer and Berg, Alexander C.
and Lo, Wan-Yen and Doll{\'a}r, Piotr and Girshick, Ross},
  journal={arXiv:2304.02643},
  year={2023}
}
```

### Other awesome SAM projects:
- [Grounded-Segment-Anything](https://github.com/IDEA-Research/Grounded-Segment-Anything)
- [Zero-Shot Anomaly Detection](https://github.com/caoyunkang/GroundedSAM-zero-shot-anomaly-detection)
- [EditAnything: ControlNet + StableDiffusion based on the SAM segmentation mask](https://github.com/sail-sg/EditAnything)
- [IEA: Image Editing Anything](https://github.com/feizc/IEA)
- [sam-with-mmdet](https://github.com/liuyanyi/sam-with-mmdet) (mmdet 3.0.0, provides RTMDet)
- [Prompt-Segment-Anything](https://github.com/RockeyCoss/Prompt-Segment-Anything) (mmdet 3.0.0, H-DETR, DINO, Focal backbone)
- ...