{"id":13856880,"url":"https://github.com/hszhao/semseg","last_synced_at":"2025-05-16T07:03:55.424Z","repository":{"id":40650301,"uuid":"147709096","full_name":"hszhao/semseg","owner":"hszhao","description":"Semantic Segmentation in Pytorch","archived":false,"fork":false,"pushed_at":"2022-08-28T10:50:55.000Z","size":1458,"stargazers_count":1355,"open_issues_count":45,"forks_count":245,"subscribers_count":21,"default_branch":"master","last_synced_at":"2025-04-08T16:08:08.734Z","etag":null,"topics":[],"latest_commit_sha":null,"homepage":null,"language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/hszhao.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null}},"created_at":"2018-09-06T17:26:36.000Z","updated_at":"2025-04-06T22:36:25.000Z","dependencies_parsed_at":"2022-07-14T04:50:29.937Z","dependency_job_id":null,"html_url":"https://github.com/hszhao/semseg","commit_stats":null,"previous_names":[],"tags_count":0,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/hszhao%2Fsemseg","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/hszhao%2Fsemseg/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/hszhao%2Fsemseg/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/hszhao%2Fsemseg/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/hszhao","download_url":"https://codeload.github.com/hszhao/semseg/tar.gz/refs/heads/master","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":254485053,"owners_count":22078767,"icon_url":"https://gith
ub.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":[],"created_at":"2024-08-05T03:01:17.233Z","updated_at":"2025-05-16T07:03:55.403Z","avatar_url":"https://github.com/hszhao.png","language":"Python","funding_links":[],"categories":["Toolbox","Python"],"sub_categories":["Libraries"],"readme":"# PyTorch Semantic Segmentation\n\n### Introduction\n\nThis repository is a PyTorch implementation of semantic segmentation / scene parsing. The code is easy to use for training and testing on various datasets. The codebase mainly uses ResNet50/101/152 as the backbone and can be easily adapted to other basic classification structures. Implemented networks include [PSPNet](https://hszhao.github.io/projects/pspnet) and [PSANet](https://hszhao.github.io/projects/psanet), which ranked 1st in [ImageNet Scene Parsing Challenge 2016 @ECCV16](http://image-net.org/challenges/LSVRC/2016/results), [LSUN Semantic Segmentation Challenge 2017 @CVPR17](https://blog.mapillary.com/product/2017/06/13/lsun-challenge.html) and [WAD Drivable Area Segmentation Challenge 2018 @CVPR18](https://bdd-data.berkeley.edu/wad-2018.html). 
Datasets used in the experiments include [ADE20K](http://sceneparsing.csail.mit.edu), [PASCAL VOC 2012](http://host.robots.ox.ac.uk:8080/leaderboard/displaylb.php?challengeid=11\u0026compid=6) and [Cityscapes](https://www.cityscapes-dataset.com).\n\n\u003cimg src=\"./figure/pspnet.png\" width=\"900\"/\u003e\n\n### Update\n\n- 2020.05.15: Branch `master` uses the official [nn.SyncBatchNorm](https://pytorch.org/docs/master/nn.html#torch.nn.SyncBatchNorm); only multiprocessing training is supported; tested with PyTorch 1.4.0.\n- 2019.05.29: Branch `1.0.0` supports both multithreading training ([nn.DataParallel](https://pytorch.org/docs/stable/nn.html#dataparallel)) and multiprocessing training ([nn.parallel.DistributedDataParallel](https://pytorch.org/docs/stable/_modules/torch/nn/parallel/distributed.html)) (**recommended**). The latter is much faster. It uses `syncbn` from [EncNet](https://github.com/zhanghang1989/PyTorch-Encoding) and [apex](https://github.com/NVIDIA/apex); tested with PyTorch 1.0.0.\n\n### Usage\n\n1. Highlights:\n\n   - Fast multiprocessing training ([nn.parallel.DistributedDataParallel](https://pytorch.org/docs/stable/_modules/torch/nn/parallel/distributed.html)) with official [nn.SyncBatchNorm](https://pytorch.org/docs/master/nn.html#torch.nn.SyncBatchNorm).\n   - Better reimplementation results with a well-designed code structure.\n   - All initialization models, trained models and predictions are [available](https://drive.google.com/open?id=15wx9vOM0euyizq-M1uINgN0_wjVRf9J3).\n\n2. Requirements:\n\n   - Hardware: 4-8 GPUs (better with \u003e=11 GB GPU memory)\n   - Software: PyTorch\u003e=1.1.0, Python3, [tensorboardX](https://github.com/lanpa/tensorboardX)\n\n3. Clone the repository:\n\n   ```shell\n   git clone https://github.com/hszhao/semseg.git\n   ```\n\n4. 
Train:\n\n   - Download the related datasets and symlink their paths as follows (alternatively, modify the relevant paths specified in folder `config`):\n\n     ```shell\n     cd semseg\n     mkdir -p dataset\n     ln -s /path_to_ade20k_dataset dataset/ade20k\n     ```\n\n   - Download the ImageNet pre-trained [models](https://drive.google.com/open?id=15wx9vOM0euyizq-M1uINgN0_wjVRf9J3) and put them under folder `initmodel` for weight initialization. Remember to use the right dataset format detailed in [FAQ.md](./FAQ.md).\n\n   - Specify the GPUs to use in the config, then start training:\n\n     ```shell\n     sh tool/train.sh ade20k pspnet50\n     ```\n\n   - If you are using [SLURM](https://slurm.schedmd.com/documentation.html) as the node manager, uncomment the relevant lines in `tool/train.sh` and then start training:\n\n     ```shell\n     sbatch tool/train.sh ade20k pspnet50\n     ```\n\n5. Test:\n\n   - Download the trained segmentation models and put them under the folder specified in the config, or modify the specified paths.\n\n   - For full testing (to reproduce the listed performance):\n\n     ```shell\n     sh tool/test.sh ade20k pspnet50\n     ```\n\n   - **Quick demo** on one image:\n\n     ```shell\n     PYTHONPATH=./ python tool/demo.py --config=config/ade20k/ade20k_pspnet50.yaml --image=figure/demo/ADE_val_00001515.jpg TEST.scales '[1.0]'\n     ```\n\n6. Visualization: [tensorboardX](https://github.com/lanpa/tensorboardX) is incorporated for better visualization.\n\n   ```shell\n   tensorboard --logdir=exp/ade20k\n   ```\n\n7. Other:\n\n   - Resources: GoogleDrive [LINK](https://drive.google.com/open?id=15wx9vOM0euyizq-M1uINgN0_wjVRf9J3) contains shared models, visual predictions and data lists.\n   - Models: ImageNet pre-trained models and trained segmentation models can be accessed. 
Note that our ImageNet pre-trained models differ slightly from the original [ResNet](https://github.com/pytorch/vision/blob/master/torchvision/models/resnet.py) implementation in the beginning part of the network.\n   - Predictions: Visual predictions of several models can be accessed.\n   - Datasets: attributes (`names` and `colors`) are in folder `dataset` and some sample lists can be accessed.\n   - Some FAQs: [FAQ.md](./FAQ.md).\n   - Former video predictions: high accuracy -- [PSPNet](https://youtu.be/rB1BmBOkKTw), [PSANet](https://youtu.be/l5xu1DI6pDk); high efficiency -- [ICNet](https://youtu.be/qWl9idsCuLQ).\n\n### Performance\n\nDescription: **mIoU/mAcc/aAcc** stand for mean IoU, mean per-class accuracy and overall pixel accuracy, respectively. **ss** denotes single-scale testing and **ms** indicates multi-scale testing. Training time is measured on a server with 8 GeForce RTX 2080 Ti GPUs. Parameters shared across datasets are listed below:\n\n- Train Parameters: sync_bn(True), scale_min(0.5), scale_max(2.0), rotate_min(-10), rotate_max(10), zoom_factor(8), ignore_label(255), aux_weight(0.4), batch_size(16), base_lr(1e-2), power(0.9), momentum(0.9), weight_decay(1e-4).\n- Test Parameters: ignore_label(255), scales(single: [1.0], multiple: [0.5 0.75 1.0 1.25 1.5 1.75]).\n\n1. **ADE20K**:\n   Train Parameters: classes(150), train_h(473/465-PSP/A), train_w(473/465-PSP/A), epochs(100).\n   Test Parameters: classes(150), test_h(473/465-PSP/A), test_w(473/465-PSP/A), base_size(512).\n\n   - Setting: train on **train** (20210 images) set and test on **val** (2000 images) set.\n\n   |  Network  |  mIoU/mAcc/aAcc(ss)   |  mIoU/mAcc/aAcc(ms)   | Training Time |\n   | :-------: | :-------------------: | :-------------------: | :-----------: |\n   | PSPNet50  | 0.4189/0.5227/0.8039. | 0.4284/0.5266/0.8106. |      14h      |\n   | PSANet50  | 0.4229/0.5307/0.8032. | 0.4305/0.5312/0.8101. |      14h      |\n   | PSPNet101 | 0.4310/0.5375/0.8107. | 0.4415/0.5426/0.8172. 
|      20h      |\n   | PSANet101 | 0.4337/0.5385/0.8102. | 0.4414/0.5392/0.8170. |      20h      |\n\n2. **PASCAL VOC 2012**:\n   Train Parameters: classes(21), train_h(473/465-PSP/A), train_w(473/465-PSP/A), epochs(50).\n   Test Parameters: classes(21), test_h(473/465-PSP/A), test_w(473/465-PSP/A), base_size(512).\n\n   - Setting: train on **train_aug** (10582 images) set and test on **val** (1449 images) set.\n\n   |  Network  |  mIoU/mAcc/aAcc(ss)   |  mIoU/mAcc/aAcc(ms)   | Training Time |\n   | :-------: | :-------------------: | :-------------------: | :-----------: |\n   | PSPNet50  | 0.7705/0.8513/0.9489. | 0.7802/0.8580/0.9513. |     3.3h      |\n   | PSANet50  | 0.7725/0.8569/0.9491. | 0.7787/0.8606/0.9508. |     3.3h      |\n   | PSPNet101 | 0.7907/0.8636/0.9534. | 0.7963/0.8677/0.9550. |      5h       |\n   | PSANet101 | 0.7870/0.8642/0.9528. | 0.7966/0.8696/0.9549. |      5h       |\n\n3. **Cityscapes**:\n   Train Parameters: classes(19), train_h(713/709-PSP/A), train_w(713/709-PSP/A), epochs(200).\n   Test Parameters: classes(19), test_h(713/709-PSP/A), test_w(713/709-PSP/A), base_size(2048).\n\n   - Setting: train on **fine_train** (2975 images) set and test on **fine_val** (500 images) set.\n\n   |  Network  |  mIoU/mAcc/aAcc(ss)   |  mIoU/mAcc/aAcc(ms)   | Training Time |\n   | :-------: | :-------------------: | :-------------------: | :-----------: |\n   | PSPNet50  | 0.7730/0.8431/0.9597. | 0.7838/0.8486/0.9617. |      7h       |\n   | PSANet50  | 0.7745/0.8461/0.9600. | 0.7818/0.8487/0.9622. |     7.5h      |\n   | PSPNet101 | 0.7863/0.8577/0.9614. | 0.7929/0.8591/0.9638. |      10h      |\n   | PSANet101 | 0.7842/0.8599/0.9621. | 0.7940/0.8631/0.9644. 
|     10.5h     |\n\n### Citation\n\nIf you find the code or trained models useful, please consider citing:\n\n```\n@misc{semseg2019,\n  author={Zhao, Hengshuang},\n  title={semseg},\n  howpublished={\\url{https://github.com/hszhao/semseg}},\n  year={2019}\n}\n@inproceedings{zhao2017pspnet,\n  title={Pyramid Scene Parsing Network},\n  author={Zhao, Hengshuang and Shi, Jianping and Qi, Xiaojuan and Wang, Xiaogang and Jia, Jiaya},\n  booktitle={CVPR},\n  year={2017}\n}\n@inproceedings{zhao2018psanet,\n  title={{PSANet}: Point-wise Spatial Attention Network for Scene Parsing},\n  author={Zhao, Hengshuang and Zhang, Yi and Liu, Shu and Shi, Jianping and Loy, Chen Change and Lin, Dahua and Jia, Jiaya},\n  booktitle={ECCV},\n  year={2018}\n}\n```\n\n### Question\n\nSome FAQs are collected in [FAQ.md](./FAQ.md). Pull requests and suggestions are welcome. Contact information: `hengshuangzhao at gmail.com`.\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fhszhao%2Fsemseg","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fhszhao%2Fsemseg","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fhszhao%2Fsemseg/lists"}