https://github.com/hszhao/semseg

Semantic Segmentation in Pytorch
Host: GitHub
URL: https://github.com/hszhao/semseg
Owner: hszhao
License: mit
Created: 2018-09-06T17:26:36.000Z (about 6 years ago)
Default Branch: master
Last Pushed: 2022-08-28T10:50:55.000Z (about 2 years ago)
Last Synced: 2024-08-06T03:02:25.932Z (3 months ago)
Language: Python
Size: 1.39 MB
Stars: 1,323
Watchers: 21
Forks: 244
Open Issues: 44
Metadata Files:
- Readme: README.md
- License: LICENSE

# PyTorch Semantic Segmentation

### Introduction

This repository is a PyTorch implementation for semantic segmentation / scene parsing. The code is easy to use for training and testing on various datasets. The codebase mainly uses ResNet50/101/152 as the backbone and can be easily adapted to other basic classification structures. Implemented networks include [PSPNet](https://hszhao.github.io/projects/pspnet) and [PSANet](https://hszhao.github.io/projects/psanet), which ranked 1st place in [ImageNet Scene Parsing Challenge 2016 @ECCV16](http://image-net.org/challenges/LSVRC/2016/results), [LSUN Semantic Segmentation Challenge 2017 @CVPR17](https://blog.mapillary.com/product/2017/06/13/lsun-challenge.html) and [WAD Drivable Area Segmentation Challenge 2018 @CVPR18](https://bdd-data.berkeley.edu/wad-2018.html). The datasets used in the experiments are [ADE20K](http://sceneparsing.csail.mit.edu), [PASCAL VOC 2012](http://host.robots.ox.ac.uk:8080/leaderboard/displaylb.php?challengeid=11&compid=6) and [Cityscapes](https://www.cityscapes-dataset.com).



### Update

- 2020.05.15: Branch `master` uses the official [nn.SyncBatchNorm](https://pytorch.org/docs/master/nn.html#torch.nn.SyncBatchNorm). Only multiprocessing training is supported. Tested with PyTorch 1.4.0.

- 2019.05.29: Branch `1.0.0` supports both multithreading training ([nn.DataParallel](https://pytorch.org/docs/stable/nn.html#dataparallel)) and multiprocessing training ([nn.parallel.DistributedDataParallel](https://pytorch.org/docs/stable/_modules/torch/nn/parallel/distributed.html)) (**recommended**). The latter is much faster. It uses `syncbn` from [EncNet](https://github.com/zhanghang1989/PyTorch-Encoding) and [apex](https://github.com/NVIDIA/apex). Tested with PyTorch 1.0.0.

### Usage

1. Highlights:

   - Fast multiprocessing training ([nn.parallel.DistributedDataParallel](https://pytorch.org/docs/stable/_modules/torch/nn/parallel/distributed.html)) with official [nn.SyncBatchNorm](https://pytorch.org/docs/master/nn.html#torch.nn.SyncBatchNorm).

   - Better reimplementation results with a well-designed code structure.

   - All initialization models, trained models and predictions are [available](https://drive.google.com/open?id=15wx9vOM0euyizq-M1uINgN0_wjVRf9J3).

2. Requirements:

   - Hardware: 4-8 GPUs (preferably with >=11G of memory each)

   - Software: PyTorch>=1.1.0, Python3, [tensorboardX](https://github.com/lanpa/tensorboardX)

3. Clone the repository:

   ```shell
   git clone https://github.com/hszhao/semseg.git
   ```

4. Train:

   - Download related datasets and symlink the paths to them as follows (you can alternatively modify the relevant paths specified in folder `config`):

     ```shell
     cd semseg
     mkdir -p dataset
     ln -s /path_to_ade20k_dataset dataset/ade20k
     ```

   - Download the ImageNet pre-trained [models](https://drive.google.com/open?id=15wx9vOM0euyizq-M1uINgN0_wjVRf9J3) and put them under the folder `initmodel` for weight initialization. Remember to use the right dataset format, detailed in [FAQ.md](./FAQ.md).

   - Specify the GPUs to use in the config, then start training:

     ```shell
     sh tool/train.sh ade20k pspnet50
     ```

   - If you are using [SLURM](https://slurm.schedmd.com/documentation.html) as the node manager, uncomment the relevant lines in `train.sh`, then start training:

     ```shell
     sbatch tool/train.sh ade20k pspnet50
     ```

5. Test:

   - Download the trained segmentation models and put them under the folder specified in the config, or modify the specified paths.

   - For full testing (to reproduce the listed performance):

     ```shell
     sh tool/test.sh ade20k pspnet50
     ```

   - **Quick demo** on one image:

     ```shell
     PYTHONPATH=./ python tool/demo.py --config=config/ade20k/ade20k_pspnet50.yaml --image=figure/demo/ADE_val_00001515.jpg TEST.scales '[1.0]'
     ```

6. Visualization: [tensorboardX](https://github.com/lanpa/tensorboardX) is incorporated for visualization.

   ```shell
   tensorboard --logdir=exp/ade20k
   ```

7. Other:

   - Resources: GoogleDrive [LINK](https://drive.google.com/open?id=15wx9vOM0euyizq-M1uINgN0_wjVRf9J3) contains shared models, visual predictions and data lists.

   - Models: ImageNet pre-trained models and trained segmentation models can be accessed. Note that our ImageNet pre-trained models differ slightly from the original [ResNet](https://github.com/pytorch/vision/blob/master/torchvision/models/resnet.py) implementation in the beginning part of the network.

   - Predictions: Visual predictions of several models can be accessed.

   - Datasets: attributes (`names` and `colors`) are in folder `dataset` and some sample lists can be accessed.

   - Some FAQs: [FAQ.md](./FAQ.md).

   - Former video predictions: high accuracy -- [PSPNet](https://youtu.be/rB1BmBOkKTw), [PSANet](https://youtu.be/l5xu1DI6pDk); high efficiency -- [ICNet](https://youtu.be/qWl9idsCuLQ).
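
The dataset preparation in step 4 (`mkdir -p dataset` followed by `ln -s`) can also be scripted, for example when linking several datasets at once. The following is a minimal sketch of that step in Python. The helper name `link_dataset` and its `repo_root` parameter are hypothetical and not part of this repository.

```python
import os


def link_dataset(source_dir, name, repo_root="."):
    """Create <repo_root>/dataset/<name> as a symlink to source_dir.

    Hypothetical helper mirroring the shell commands in step 4.
    """
    dataset_dir = os.path.join(repo_root, "dataset")
    os.makedirs(dataset_dir, exist_ok=True)  # same effect as `mkdir -p dataset`
    link_path = os.path.join(dataset_dir, name)
    if os.path.islink(link_path):
        os.remove(link_path)  # replace a stale link instead of failing
    os.symlink(os.path.abspath(source_dir), link_path)  # same effect as `ln -s`
    return link_path
```

Calling `link_dataset("/path_to_ade20k_dataset", "ade20k", repo_root="semseg")` would correspond to the shell commands shown in step 4. Symlink creation may require extra privileges on Windows.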

### Performance

Description: **mIoU/mAcc/aAcc** stand for mean IoU, mean per-class accuracy and overall pixel accuracy, respectively. **ss** denotes single-scale testing and **ms** denotes multi-scale testing. Training time is measured on a server with 8 GeForce RTX 2080 Ti GPUs. General parameters shared across datasets are listed below:

- Train Parameters: sync_bn(True), scale_min(0.5), scale_max(2.0), rotate_min(-10), rotate_max(10), zoom_factor(8), ignore_label(255), aux_weight(0.4), batch_size(16), base_lr(1e-2), power(0.9), momentum(0.9), weight_decay(1e-4).

- Test Parameters: ignore_label(255), scales(single: [1.0], multiple: [0.5 0.75 1.0 1.25 1.5 1.75]).
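
To make the three reported metrics concrete, here is a minimal pure-Python sketch of how mIoU, mAcc and aAcc can be computed from flat lists of predicted and ground-truth labels, skipping pixels marked with `ignore_label`. The function name `segmentation_metrics` is hypothetical, and averaging only over classes that actually occur is an assumption of this sketch. The repository's own evaluation code may handle absent classes differently.

```python
def segmentation_metrics(pred, target, num_classes, ignore_label=255):
    """Return mIoU, mAcc and aAcc for flat label sequences (illustrative sketch)."""
    intersection = [0] * num_classes  # pixels where prediction and target agree
    pred_area = [0] * num_classes     # pixels predicted as each class
    target_area = [0] * num_classes   # pixels labeled as each class
    for p, t in zip(pred, target):
        if t == ignore_label:
            continue  # ignored pixels do not count toward any metric
        pred_area[p] += 1
        target_area[t] += 1
        if p == t:
            intersection[t] += 1

    ious, accs = [], []
    for c in range(num_classes):
        union = pred_area[c] + target_area[c] - intersection[c]
        if union > 0:  # assumption: skip classes absent from both pred and target
            ious.append(intersection[c] / union)
        if target_area[c] > 0:
            accs.append(intersection[c] / target_area[c])

    total = sum(target_area)
    return {
        "mIoU": sum(ious) / len(ious) if ious else 0.0,
        "mAcc": sum(accs) / len(accs) if accs else 0.0,
        "aAcc": sum(intersection) / total if total else 0.0,
    }


if __name__ == "__main__":
    # Six pixels, three classes; the last pixel is ignored.
    print(segmentation_metrics([0, 0, 1, 1, 2, 2], [0, 1, 1, 1, 2, 255], 3))
```

In this toy example the per-class IoUs are 1/2, 2/3 and 1, the per-class accuracies are 1, 2/3 and 1, and 4 of the 5 valid pixels are correct, so aAcc is 0.8.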

1. **ADE20K**:

   Train Parameters: classes(150), train_h(473/465-PSP/A), train_w(473/465-PSP/A), epochs(100).

   Test Parameters: classes(150), test_h(473/465-PSP/A), test_w(473/465-PSP/A), base_size(512).

   - Setting: train on **train** (20210 images) set and test on **val** (2000 images) set.

   |  Network  |  mIoU/mAcc/aAcc(ss)  |  mIoU/mAcc/aAcc(ms)  | Training Time |
   | :-------: | :------------------: | :------------------: | :-----------: |
   | PSPNet50  | 0.4189/0.5227/0.8039 | 0.4284/0.5266/0.8106 |      14h      |
   | PSANet50  | 0.4229/0.5307/0.8032 | 0.4305/0.5312/0.8101 |      14h      |
   | PSPNet101 | 0.4310/0.5375/0.8107 | 0.4415/0.5426/0.8172 |      20h      |
   | PSANet101 | 0.4337/0.5385/0.8102 | 0.4414/0.5392/0.8170 |      20h      |

2. **PASCAL VOC 2012**:

   Train Parameters: classes(21), train_h(473/465-PSP/A), train_w(473/465-PSP/A), epochs(50).

   Test Parameters: classes(21), test_h(473/465-PSP/A), test_w(473/465-PSP/A), base_size(512).

   - Setting: train on **train_aug** (10582 images) set and test on **val** (1449 images) set.

   |  Network  |  mIoU/mAcc/aAcc(ss)  |  mIoU/mAcc/aAcc(ms)  | Training Time |
   | :-------: | :------------------: | :------------------: | :-----------: |
   | PSPNet50  | 0.7705/0.8513/0.9489 | 0.7802/0.8580/0.9513 |     3.3h      |
   | PSANet50  | 0.7725/0.8569/0.9491 | 0.7787/0.8606/0.9508 |     3.3h      |
   | PSPNet101 | 0.7907/0.8636/0.9534 | 0.7963/0.8677/0.9550 |      5h       |
   | PSANet101 | 0.7870/0.8642/0.9528 | 0.7966/0.8696/0.9549 |      5h       |

3. **Cityscapes**:

   Train Parameters: classes(19), train_h(713/709-PSP/A), train_w(713/709-PSP/A), epochs(200).

   Test Parameters: classes(19), test_h(713/709-PSP/A), test_w(713/709-PSP/A), base_size(2048).

   - Setting: train on **fine_train** (2975 images) set and test on **fine_val** (500 images) set.

   |  Network  |  mIoU/mAcc/aAcc(ss)  |  mIoU/mAcc/aAcc(ms)  | Training Time |
   | :-------: | :------------------: | :------------------: | :-----------: |
   | PSPNet50  | 0.7730/0.8431/0.9597 | 0.7838/0.8486/0.9617 |      7h       |
   | PSANet50  | 0.7745/0.8461/0.9600 | 0.7818/0.8487/0.9622 |     7.5h      |
   | PSPNet101 | 0.7863/0.8577/0.9614 | 0.7929/0.8591/0.9638 |      10h      |
   | PSANet101 | 0.7842/0.8599/0.9621 | 0.7940/0.8631/0.9644 |     10.5h     |
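
The general train parameters listed at the top of this section include `base_lr(1e-2)` and `power(0.9)`. Assuming `power` is the exponent of the polynomial ("poly") learning-rate decay described in the PSPNet paper, the learning rate at a given iteration can be sketched as follows. The function name and the iteration counts in the example are illustrative assumptions, not values taken from this README.

```python
def poly_learning_rate(base_lr, curr_iter, max_iter, power=0.9):
    """Polynomial decay: base_lr * (1 - curr_iter / max_iter) ** power."""
    return base_lr * (1 - curr_iter / max_iter) ** power


if __name__ == "__main__":
    max_iter = 1000  # illustrative value only
    for it in (0, 250, 500, 750, 1000):
        print(it, poly_learning_rate(1e-2, it, max_iter))
```

Under this schedule the rate starts at `base_lr`, decreases monotonically, and reaches zero at the final iteration.
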

### Citation

If you find the code or trained models useful, please consider citing:

```
@misc{semseg2019,
  author={Zhao, Hengshuang},
  title={semseg},
  howpublished={\url{https://github.com/hszhao/semseg}},
  year={2019}
}

@inproceedings{zhao2017pspnet,
  title={Pyramid Scene Parsing Network},
  author={Zhao, Hengshuang and Shi, Jianping and Qi, Xiaojuan and Wang, Xiaogang and Jia, Jiaya},
  booktitle={CVPR},
  year={2017}
}

@inproceedings{zhao2018psanet,
  title={{PSANet}: Point-wise Spatial Attention Network for Scene Parsing},
  author={Zhao, Hengshuang and Zhang, Yi and Liu, Shu and Shi, Jianping and Loy, Chen Change and Lin, Dahua and Jia, Jiaya},
  booktitle={ECCV},
  year={2018}
}
```

### Question

Some common questions are collected in [FAQ.md](./FAQ.md). You are welcome to send pull requests or offer advice. Contact information: `hengshuangzhao at gmail.com`.