{"id":23913055,"url":"https://github.com/henghuiding/rela","last_synced_at":"2025-04-04T16:13:06.220Z","repository":{"id":171769856,"uuid":"612610297","full_name":"henghuiding/ReLA","owner":"henghuiding","description":"[CVPR2023 Highlight] GRES: Generalized Referring Expression Segmentation","archived":false,"fork":false,"pushed_at":"2023-09-05T03:41:53.000Z","size":2155,"stargazers_count":692,"open_issues_count":7,"forks_count":19,"subscribers_count":5,"default_branch":"main","last_synced_at":"2025-03-28T15:04:47.226Z","etag":null,"topics":["cvpr2023","multimodal-learning","referring-expression-comprehension","referring-expression-segmentation","referring-image-segmentation","vision-language-transformer"],"latest_commit_sha":null,"homepage":"https://henghuiding.github.io/GRES/","language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/henghuiding.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2023-03-11T13:13:50.000Z","updated_at":"2025-03-26T12:41:44.000Z","dependencies_parsed_at":"2025-01-05T09:21:27.613Z","dependency_job_id":"6ad0b257-e3bb-403a-913b-6fb4e3223665","html_url":"https://github.com/henghuiding/ReLA","commit_stats":null,"previous_names":["henghuiding/rela"],"tags_count":0,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/henghuiding%2FReLA","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/henghuiding%2FReLA/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/henghuiding%2FReLA/releases",
"manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/henghuiding%2FReLA/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/henghuiding","download_url":"https://codeload.github.com/henghuiding/ReLA/tar.gz/refs/heads/main","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":247208139,"owners_count":20901570,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["cvpr2023","multimodal-learning","referring-expression-comprehension","referring-expression-segmentation","referring-image-segmentation","vision-language-transformer"],"created_at":"2025-01-05T09:20:48.986Z","updated_at":"2025-04-04T16:13:06.201Z","avatar_url":"https://github.com/henghuiding.png","language":"Python","funding_links":[],"categories":[],"sub_categories":[],"readme":"# GRES: Generalized Referring Expression Segmentation\n[![PyTorch](https://img.shields.io/badge/PyTorch-1.11.0-%23EE4C2C.svg?style=\u0026logo=PyTorch\u0026logoColor=white)](https://pytorch.org/)\n[![Python](https://img.shields.io/badge/Python-3.7%20|%203.8%20|%203.9-blue.svg?style=\u0026logo=python\u0026logoColor=ffdd54)](https://www.python.org/downloads/)\n[![PWC](https://img.shields.io/endpoint.svg?url=https://paperswithcode.com/badge/gres-generalized-referring-expression-1/generalized-referring-expression-segmentation)](https://paperswithcode.com/sota/generalized-referring-expression-segmentation?p=gres-generalized-referring-expression-1)\n\n**[🏠[Project page]](https://henghuiding.github.io/GRES/)** \u0026emsp; **[📄[arXiv]](https://arxiv.org/abs/2306.00968)**  
\u0026emsp; **[📄[PDF]](https://openaccess.thecvf.com/content/CVPR2023/papers/Liu_GRES_Generalized_Referring_Expression_Segmentation_CVPR_2023_paper.pdf)** \u0026emsp; **[🔥[New Dataset Download]](https://github.com/henghuiding/gRefCOCO)**\n\nThis repository contains the code for the **CVPR 2023** paper:\n\u003e [GRES: Generalized Referring Expression Segmentation](https://arxiv.org/abs/2306.00968)  \n\u003e Chang Liu, Henghui Ding, Xudong Jiang  \n\u003e CVPR 2023 Highlight, Acceptance Rate 2.5%\n\n\n\u003cdiv align=\"center\"\u003e\n  \u003cimg src=\"https://github.com/henghuiding/ReLA/blob/main/imgs/fig1.png?raw=true\" width=\"100%\" height=\"100%\"/\u003e\n\u003c/div\u003e\u003cbr/\u003e\n\n## Update\n- **(2023/08/29)** We have updated and reorganized the dataset file. Please download the latest version for train/val/testA/testB! (Note: the training expressions are unchanged, so this does not affect training, but some `ref_id` and `sent_id` values have been renumbered for better organization.) \n- **(2023/08/16)** A new large-scale referring video segmentation dataset, [MeViS](https://henghuiding.github.io/MeViS/), has been released.\n\n## Installation:\n\nThe code is tested under CUDA 11.8, PyTorch 1.11.0 and Detectron2 0.6.\n\n1. Install [Detectron2](https://github.com/facebookresearch/detectron2) following the [manual](https://detectron2.readthedocs.io/en/latest/)\n2. Run `sh make.sh` under `gres_model/modeling/pixel_decoder/ops`\n3. Install other required packages: `pip install -r requirements.txt`\n4. 
Prepare the dataset following `datasets/DATASET.md`\n\n## Inference\n\n```\npython train_net.py \\\n    --config-file configs/referring_swin_base.yaml \\\n    --num-gpus 8 --dist-url auto --eval-only \\\n    MODEL.WEIGHTS [path_to_weights] \\\n    OUTPUT_DIR [output_dir]\n```\n\n## Training\n\nFirst, download the backbone weights (`swin_base_patch4_window12_384_22k.pth`) and convert them into Detectron2 format (`swin_base_patch4_window12_384_22k.pkl`) using the script:\n\n```\nwget https://github.com/SwinTransformer/storage/releases/download/v1.0.0/swin_base_patch4_window12_384_22k.pth\npython tools/convert-pretrained-swin-model-to-d2.py swin_base_patch4_window12_384_22k.pth swin_base_patch4_window12_384_22k.pkl\n```\n\nThen start training:\n```\npython train_net.py \\\n    --config-file configs/referring_swin_base.yaml \\\n    --num-gpus 8 --dist-url auto \\\n    MODEL.WEIGHTS [path_to_weights] \\\n    OUTPUT_DIR [output_dir]\n```\n\nNote: You can append your own configuration options to the training command to customize it. For example:\n\n```\nSOLVER.IMS_PER_BATCH 48 \nSOLVER.BASE_LR 0.00001 \n```\n\nFor the full list of base configs, see `configs/referring_R50.yaml` and `configs/Base-COCO-InstanceSegmentation.yaml`\n\n\n## Models\n\nUpdate: We have added support for ResNet-50 and Swin-Tiny backbones! Feel free to use and report these resource-friendly models in your work.\n\n| Backbone | cIoU | gIoU |\n|---|---|---|\n| ResNet-50 | 39.53 | 38.62 |\n| Swin-Tiny | 57.73 | 56.86 |\n| Swin-Base | 62.42 | 63.60 |\n\nAll models can be downloaded from:\n\n[OneDrive](https://entuedu-my.sharepoint.com/:f:/g/personal/liuc0058_e_ntu_edu_sg/EqyL6nftLjdIihQG2rYirPoB1Sm3HBJwuZgtPII8WcevQw?e=pI1rrg)\n\n## Acknowledgement\n\nThis project is based on [refer](https://github.com/lichengunc/refer), [Mask2Former](https://github.com/facebookresearch/Mask2Former), [Detectron2](https://github.com/facebookresearch/detectron2), [VLT](https://github.com/henghuiding/Vision-Language-Transformer). 
Many thanks to the authors for their great work!\n\n## BibTeX\nPlease consider citing GRES if it helps your research.\n\n```bibtex\n@inproceedings{GRES,\n  title={{GRES}: Generalized Referring Expression Segmentation},\n  author={Liu, Chang and Ding, Henghui and Jiang, Xudong},\n  booktitle={CVPR},\n  year={2023}\n}\n@article{VLT,\n  title={{VLT}: Vision-language transformer and query generation for referring segmentation},\n  author={Ding, Henghui and Liu, Chang and Wang, Suchen and Jiang, Xudong},\n  journal={IEEE Transactions on Pattern Analysis and Machine Intelligence},\n  year={2023},\n  publisher={IEEE}\n}\n@inproceedings{MeViS,\n  title={{MeViS}: A Large-scale Benchmark for Video Segmentation with Motion Expressions},\n  author={Ding, Henghui and Liu, Chang and He, Shuting and Jiang, Xudong and Loy, Chen Change},\n  booktitle={ICCV},\n  year={2023}\n}\n```\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fhenghuiding%2Frela","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fhenghuiding%2Frela","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fhenghuiding%2Frela/lists"}