{"id":19932246,"url":"https://github.com/amazon-science/unified-ept","last_synced_at":"2025-05-03T11:31:42.824Z","repository":{"id":47500960,"uuid":"399249263","full_name":"amazon-science/unified-ept","owner":"amazon-science","description":"A Unified Efficient Pyramid Transformer for Semantic Segmentation, ICCVW 2021","archived":false,"fork":false,"pushed_at":"2021-10-11T15:01:51.000Z","size":6878,"stargazers_count":31,"open_issues_count":3,"forks_count":8,"subscribers_count":3,"default_branch":"main","last_synced_at":"2025-04-07T15:04:24.586Z","etag":null,"topics":["efficient","iccv-2021","pyramid","semantic-segmentation","transformers"],"latest_commit_sha":null,"homepage":"","language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"apache-2.0","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/amazon-science.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":"CONTRIBUTING.md","funding":null,"license":"LICENSE","code_of_conduct":"CODE_OF_CONDUCT.md","threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null}},"created_at":"2021-08-23T21:13:02.000Z","updated_at":"2024-05-22T13:04:59.000Z","dependencies_parsed_at":"2022-09-16T23:40:15.899Z","dependency_job_id":null,"html_url":"https://github.com/amazon-science/unified-ept","commit_stats":null,"previous_names":[],"tags_count":0,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/amazon-science%2Funified-ept","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/amazon-science%2Funified-ept/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/amazon-science%2Funified-ept/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/amazon-science%2Funified-ept/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/Gi
tHub/owners/amazon-science","download_url":"https://codeload.github.com/amazon-science/unified-ept/tar.gz/refs/heads/main","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":252184334,"owners_count":21707938,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["efficient","iccv-2021","pyramid","semantic-segmentation","transformers"],"created_at":"2024-11-12T23:09:29.595Z","updated_at":"2025-05-03T11:31:42.527Z","avatar_url":"https://github.com/amazon-science.png","language":"Python","funding_links":[],"categories":[],"sub_categories":[],"readme":"# Unified-EPT\n\nCode for the ICCV 2021 Workshop paper: [A Unified Efficient Pyramid Transformer for Semantic Segmentation](https://openaccess.thecvf.com/content/ICCV2021W/VSPW/papers/Zhu_A_Unified_Efficient_Pyramid_Transformer_for_Semantic_Segmentation_ICCVW_2021_paper.pdf).\n\n## Installation\n\n* Linux, CUDA\u003e=10.0, GCC\u003e=5.4\n* Python\u003e=3.7\n* Create a conda environment:\n\n```bash\n    conda create -n unept python=3.7 pip\n```\n\nThen, activate the environment:\n```bash\n    conda activate unept\n```\n* PyTorch\u003e=1.5.1, torchvision\u003e=0.6.1 (following instructions [here](https://pytorch.org/))\n\nFor example:\n```\nconda install pytorch==1.5.1 torchvision==0.6.1 cudatoolkit=10.2 -c pytorch\n```\n\n* Install [MMCV](https://mmcv.readthedocs.io/en/latest/), [MMSegmentation](https://github.com/open-mmlab/mmsegmentation/blob/master/docs/install.md), [timm](https://pypi.org/project/timm/)\n\n```\npip install -r requirements.txt\n```\n\n* Install [Deformable 
DETR](https://github.com/fundamentalvision/Deformable-DETR) and compile the CUDA operators\n(the instructions can be found [here](https://github.com/fundamentalvision/Deformable-DETR#installation)).\n\n\n\n## Data Preparation\nPlease follow the code from [openseg](https://github.com/openseg-group/openseg.pytorch) to generate the ground truth for boundary refinement. \n\nThe data should be organized as follows.\n\n### ADE20k\nYou can download the processed ```dt_offset``` file [here](https://drive.google.com/drive/folders/1UKIXzc6hHQUfNqynZtcgSjSnGpQJ0GLs?usp=sharing). \n\n```\npath/to/ADEChallengeData2016/\n  images/\n    training/\n    validation/\n  annotations/ \n    training/\n    validation/\n  dt_offset/\n    training/\n    validation/\n```\n### PASCAL-Context\nYou can download the processed dataset [here](https://drive.google.com/file/d/18-3ySBQEZcBfr0Rs3_mWJWo2jNzyS6VO/view?usp=sharing).\n\n```\npath/to/PASCAL-Context/\n  train/\n    image/\n    label/\n    dt_offset/\n  val/\n    image/\n    label/\n    dt_offset/\n```\n\n## Usage \n### Training \n**The default is for multi-gpu, DistributedDataParallel training.**\n\n```\n# set --nproc_per_node to the number of GPUs\npython -m torch.distributed.launch --nproc_per_node=8 \\\n--master_port=29500  \\\ntrain.py  --launcher pytorch \\\n--config /path/to/config_file \n```\n\n- specify the ```data_root``` in the config file;\n- the log directory will be created in ```./work_dirs```;\n- download the [DeiT pretrained model](https://dl.fbaipublicfiles.com/deit/deit_base_distilled_patch16_384-d0272ac0.pth) and specify the ```pretrained``` path in the config file.\n\n\n### Evaluation\n\n```\n# single-gpu testing\npython test.py --checkpoint /path/to/checkpoint \\\n--config /path/to/config_file \\\n--eval mIoU \\\n[--out ${RESULT_FILE}] [--show] \\\n--aug-test  # for multi-scale flip aug\n\n# multi-gpu testing (4 gpus, 1 sample per gpu)\npython -m torch.distributed.launch --nproc_per_node=4 --master_port=29500 \\\ntest.py  --launcher pytorch --eval mIoU 
\\\n--config_file /path/to/config_file \\\n--checkpoint /path/to/checkpoint \\\n--aug-test  # for multi-scale flip aug\n```\n\n## Results\nWe report results on validation sets.\n\n| Backbone | Crop Size | Batch Size | Dataset | Lr schd | Mem(GB) | mIoU(ms+flip) | config |\n| :------: | :-----: | :-----: | :-----: | :-----: | :-----: | :-----: | :-----: | \n| Res-50 | 480x480 | 16 | ADE20K | 160K | 7.0G | 46.1 | [config](configs/res50_unept_ade20k.py) |\n| DeiT | 480x480 | 16 | ADE20K | 160K | 8.5G | 50.5 | [config](configs/deit_unept_ade20k.py) |\n| DeiT | 480x480 | 16 | PASCAL-Context | 160K | 8.5G | 55.2 | [config](configs/deit_unept_pcontext.py) |\n\n## Security\nSee [CONTRIBUTING](CONTRIBUTING.md#security-issue-notifications) for more information.\n\n\n## License\n\nThis project is licensed under the Apache-2.0 License.\n\n## Citation\n\nIf you use this code and models for your research, please consider citing:\n\n```\n@article{zhu2021unified,\n  title={A Unified Efficient Pyramid Transformer for Semantic Segmentation},\n  author={Zhu, Fangrui and Zhu, Yi and Zhang, Li and Wu, Chongruo and Fu, Yanwei and Li, Mu},\n  journal={arXiv preprint arXiv:2107.14209},\n  year={2021}\n}\n```\n\n## Acknowledgment\n\nWe thank the authors and contributors of [MMCV](https://mmcv.readthedocs.io/en/latest/), [MMSegmentation](https://github.com/open-mmlab/mmsegmentation/blob/master/docs/install.md), [timm](https://pypi.org/project/timm/) and [Deformable DETR](https://github.com/fundamentalvision/Deformable-DETR).\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Famazon-science%2Funified-ept","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Famazon-science%2Funified-ept","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Famazon-science%2Funified-ept/lists"}