Ecosyste.ms: Awesome
An open API service indexing awesome lists of open source software.
https://github.com/amazon-science/unified-ept
A Unified Efficient Pyramid Transformer for Semantic Segmentation, ICCVW 2021
- Host: GitHub
- URL: https://github.com/amazon-science/unified-ept
- Owner: amazon-science
- License: apache-2.0
- Created: 2021-08-23T21:13:02.000Z (about 3 years ago)
- Default Branch: main
- Last Pushed: 2021-10-11T15:01:51.000Z (about 3 years ago)
- Last Synced: 2023-03-11T11:52:35.612Z (over 1 year ago)
- Topics: efficient, iccv-2021, pyramid, semantic-segmentation, transformers
- Language: Python
- Homepage:
- Size: 6.56 MB
- Stars: 29
- Watchers: 2
- Forks: 8
- Open Issues: 1
Metadata Files:
- Readme: README.md
- Contributing: CONTRIBUTING.md
- License: LICENSE
- Code of conduct: CODE_OF_CONDUCT.md
# Unified-EPT
Code for the ICCV 2021 Workshop paper: [A Unified Efficient Pyramid Transformer for Semantic Segmentation](https://openaccess.thecvf.com/content/ICCV2021W/VSPW/papers/Zhu_A_Unified_Efficient_Pyramid_Transformer_for_Semantic_Segmentation_ICCVW_2021_paper.pdf).
## Installation
* Linux, CUDA>=10.0, GCC>=5.4
* Python>=3.7
* Create a conda environment:
  ```bash
  conda create -n unept python=3.7 pip
  ```
  Then, activate the environment:
  ```bash
  conda activate unept
  ```
* PyTorch>=1.5.1, torchvision>=0.6.1 (follow the instructions [here](https://pytorch.org/)). For example:
  ```bash
  conda install pytorch==1.5.1 torchvision==0.6.1 cudatoolkit=10.2 -c pytorch
  ```
* Install [MMCV](https://mmcv.readthedocs.io/en/latest/), [MMSegmentation](https://github.com/open-mmlab/mmsegmentation/blob/master/docs/install.md), and [timm](https://pypi.org/project/timm/):
  ```bash
  pip install -r requirements.txt
  ```
* Install [Deformable DETR](https://github.com/fundamentalvision/Deformable-DETR) and compile its CUDA operators (the instructions can be found [here](https://github.com/fundamentalvision/Deformable-DETR#installation)).

## Data Preparation

Please follow the code from [openseg](https://github.com/openseg-group/openseg.pytorch) to generate the ground truth for boundary refinement. The data should be organized as follows.
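The `dt_offset` ground truth encodes, for each pixel, the offset to the nearest semantic boundary. The openseg code linked above is the authoritative generator; the sketch below is only an illustration of the idea (a brute-force version for small label maps, with a function name of our choosing):

```python
import numpy as np

def dt_offsets(label: np.ndarray) -> np.ndarray:
    """Per-pixel (dy, dx) offset to the nearest boundary pixel.

    Brute force, O(pixels * boundary pixels) -- illustration only,
    not the openseg implementation.
    """
    h, w = label.shape
    boundary = np.zeros((h, w), dtype=bool)
    # A pixel is a boundary pixel if any 4-neighbour has a different class.
    boundary[:-1] |= label[:-1] != label[1:]
    boundary[1:] |= label[1:] != label[:-1]
    boundary[:, :-1] |= label[:, :-1] != label[:, 1:]
    boundary[:, 1:] |= label[:, 1:] != label[:, :-1]
    offsets = np.zeros((h, w, 2), dtype=np.int64)
    ys, xs = np.nonzero(boundary)
    if ys.size == 0:  # uniform label map: no boundary, all offsets zero
        return offsets
    for y in range(h):
        for x in range(w):
            k = int(np.argmin((ys - y) ** 2 + (xs - x) ** 2))
            offsets[y, x] = (ys[k] - y, xs[k] - x)
    return offsets
```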
### ADE20k
You can download the processed `dt_offset` files [here](https://drive.google.com/drive/folders/1UKIXzc6hHQUfNqynZtcgSjSnGpQJ0GLs?usp=sharing).
```
path/to/ADEChallengeData2016/
  images/
    training/
    validation/
  annotations/
    training/
    validation/
  dt_offset/
    training/
    validation/
```
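A quick way to confirm the layout above is in place before training; a minimal sketch (the directory names mirror the tree above, the helper name is ours):

```python
from pathlib import Path

def check_layout(root: str,
                 subdirs=("images", "annotations", "dt_offset"),
                 splits=("training", "validation")) -> list:
    """Return the expected-but-missing directories under `root`."""
    missing = []
    for sub in subdirs:
        for split in splits:
            path = Path(root) / sub / split
            if not path.is_dir():
                missing.append(str(path))
    return missing

# Example: missing = check_layout("path/to/ADEChallengeData2016")
```

For PASCAL-Context, the same check works with `subdirs=("image", "label", "dt_offset")` and `splits=("train", "val")` swapped (mind the inverted nesting in that tree).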
### PASCAL-Context
You can download the processed dataset [here](https://drive.google.com/file/d/18-3ySBQEZcBfr0Rs3_mWJWo2jNzyS6VO/view?usp=sharing).
```
path/to/PASCAL-Context/
  train/
    image/
    label/
    dt_offset/
  val/
    image/
    label/
    dt_offset/
```

## Usage
### Training
**Training defaults to multi-GPU, DistributedDataParallel mode.**
```bash
# --nproc_per_node specifies the number of GPUs
python -m torch.distributed.launch --nproc_per_node=8 \
    --master_port=29500 \
    train.py --launcher pytorch \
    --config /path/to/config_file
```
- Specify the `data_root` in the config file.
- The log directory will be created under `./work_dirs`.
- Download the [DeiT pretrained model](https://dl.fbaipublicfiles.com/deit/deit_base_distilled_patch16_384-d0272ac0.pth) and specify the `pretrained` path in the config file.

### Evaluation
```bash
# Single-GPU testing (--aug-test enables multi-scale flip augmentation)
python test.py --checkpoint /path/to/checkpoint \
    --config /path/to/config_file \
    --eval mIoU \
    [--out ${RESULT_FILE}] [--show] \
    --aug-test

# Multi-GPU testing (4 GPUs, 1 sample per GPU)
python -m torch.distributed.launch --nproc_per_node=4 --master_port=29500 \
    test.py --launcher pytorch --eval mIoU \
    --config /path/to/config_file \
    --checkpoint /path/to/checkpoint \
    --aug-test
```

## Results
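For reference, the mIoU reported below is the mean intersection-over-union over classes; a minimal sketch of the metric computed from a confusion matrix (illustrative only, not this repo's evaluation code):

```python
import numpy as np

def mean_iou(confusion: np.ndarray) -> float:
    """Mean IoU from an (n_classes x n_classes) confusion matrix
    (rows: ground truth, columns: prediction).

    Simplified: a class that never appears contributes IoU 0
    rather than being skipped as in most evaluation toolkits.
    """
    tp = np.diag(confusion).astype(float)
    # union = predicted + ground truth - intersection, per class
    union = confusion.sum(axis=0) + confusion.sum(axis=1) - tp
    iou = tp / np.maximum(union, 1)  # guard against empty classes
    return float(iou.mean())
```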
We report results on validation sets.

| Backbone | Crop Size | Batch Size | Dataset | Lr schd | Mem (GB) | mIoU (ms+flip) | Config |
| :------: | :-------: | :--------: | :-----: | :-----: | :------: | :------------: | :----: |
| Res-50 | 480x480 | 16 | ADE20K | 160K | 7.0 | 46.1 | [config](configs/res50_unept_ade20k.py) |
| DeiT | 480x480 | 16 | ADE20K | 160K | 8.5 | 50.5 | [config](configs/deit_unept_ade20k.py) |
| DeiT | 480x480 | 16 | PASCAL-Context | 160K | 8.5 | 55.2 | [config](configs/deit_unept_pcontext.py) |

## Security

See [CONTRIBUTING](CONTRIBUTING.md#security-issue-notifications) for more information.

## License
This project is licensed under the Apache-2.0 License.
## Citation
If you use this code and models for your research, please consider citing:
```
@article{zhu2021unified,
title={A Unified Efficient Pyramid Transformer for Semantic Segmentation},
author={Zhu, Fangrui and Zhu, Yi and Zhang, Li and Wu, Chongruo and Fu, Yanwei and Li, Mu},
journal={arXiv preprint arXiv:2107.14209},
year={2021}
}
```

## Acknowledgment
We thank the authors and contributors of [MMCV](https://mmcv.readthedocs.io/en/latest/), [MMSegmentation](https://github.com/open-mmlab/mmsegmentation/blob/master/docs/install.md), [timm](https://pypi.org/project/timm/) and [Deformable DETR](https://github.com/fundamentalvision/Deformable-DETR).