{"id":15028116,"url":"https://github.com/vainf/deeplabv3plus-pytorch","last_synced_at":"2025-05-15T11:04:50.622Z","repository":{"id":37095225,"uuid":"173100082","full_name":"VainF/DeepLabV3Plus-Pytorch","owner":"VainF","description":"Pretrained DeepLabv3 and DeepLabv3+ for Pascal VOC \u0026 Cityscapes","archived":false,"fork":false,"pushed_at":"2022-11-15T13:01:22.000Z","size":8479,"stargazers_count":2233,"open_issues_count":93,"forks_count":474,"subscribers_count":18,"default_branch":"master","last_synced_at":"2025-04-07T14:02:08.997Z","etag":null,"topics":["cityscapes","deeplabv3","deeplabv3plus","pascal-voc","pytorch","segmentation"],"latest_commit_sha":null,"homepage":"","language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/VainF.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null}},"created_at":"2019-02-28T11:29:20.000Z","updated_at":"2025-04-07T07:01:06.000Z","dependencies_parsed_at":"2023-01-21T02:48:31.845Z","dependency_job_id":null,"html_url":"https://github.com/VainF/DeepLabV3Plus-Pytorch","commit_stats":null,"previous_names":[],"tags_count":0,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/VainF%2FDeepLabV3Plus-Pytorch","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/VainF%2FDeepLabV3Plus-Pytorch/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/VainF%2FDeepLabV3Plus-Pytorch/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/VainF%2FDeepLabV3Plus-Pytorch/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/VainF","download_url":"https:/
/codeload.github.com/VainF/DeepLabV3Plus-Pytorch/tar.gz/refs/heads/master","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":248933340,"owners_count":21185460,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["cityscapes","deeplabv3","deeplabv3plus","pascal-voc","pytorch","segmentation"],"created_at":"2024-09-24T20:07:39.222Z","updated_at":"2025-04-14T18:10:39.764Z","avatar_url":"https://github.com/VainF.png","language":"Python","funding_links":[],"categories":[],"sub_categories":[],"readme":"# DeepLabv3Plus-Pytorch\n\nPretrained DeepLabv3, DeepLabv3+ for Pascal VOC \u0026 Cityscapes.\n\n## Quick Start \n\n### 1. Available Architectures\n| DeepLabV3    |  DeepLabV3+        |\n| :---: | :---:     |\n|deeplabv3_resnet50|deeplabv3plus_resnet50|\n|deeplabv3_resnet101|deeplabv3plus_resnet101|\n|deeplabv3_mobilenet|deeplabv3plus_mobilenet ||\n|deeplabv3_hrnetv2_48 | deeplabv3plus_hrnetv2_48 |\n|deeplabv3_hrnetv2_32 | deeplabv3plus_hrnetv2_32 |\n|deeplabv3_xception | deeplabv3plus_xception |\n\nplease refer to [network/modeling.py](https://github.com/VainF/DeepLabV3Plus-Pytorch/blob/master/network/modeling.py) for all model entries.\n\nDownload pretrained models: [Dropbox](https://www.dropbox.com/sh/w3z9z8lqpi8b2w7/AAB0vkl4F5vy6HdIhmRCTKHSa?dl=0), [Tencent Weiyun](https://share.weiyun.com/qqx78Pv5)\n\nNote: The HRNet backbone was contributed by @timothylimyl. A pre-trained backbone is available at [google drive](https://drive.google.com/file/d/1NxCK7Zgn5PmeS7W1jYLt5J9E0RRZ2oyF/view?usp=sharing).\n\n### 2. 
Load the pretrained model:\n```python\nmodel = network.modeling.__dict__[MODEL_NAME](num_classes=NUM_CLASSES, output_stride=OUTPUT_STRIDE)\nmodel.load_state_dict( torch.load( PATH_TO_PTH )['model_state']  )\n```\n### 3. Visualize segmentation outputs:\n```python\noutputs = model(images)\npreds = outputs.max(1)[1].detach().cpu().numpy()\ncolorized_preds = val_dst.decode_target(preds).astype('uint8') # To RGB images, (N, H, W, 3), ranged 0~255, numpy array\n# Do whatever you like here with the colorized segmentation maps\ncolorized_preds = Image.fromarray(colorized_preds[0]) # to PIL Image\n```\n\n### 4. Atrous Separable Convolution\n\n**Note**: All pre-trained models in this repo were trained without atrous separable convolution.\n\nAtrous Separable Convolution is supported in this repo. We provide a simple tool ``network.convert_to_separable_conv`` to convert ``nn.Conv2d`` to ``AtrousSeparableConvolution``. **Please run main.py with '--separable_conv' if it is required**. See 'main.py' and 'network/_deeplab.py' for more details. \n\n### 5. Prediction\nSingle image:\n```bash\npython predict.py --input datasets/data/cityscapes/leftImg8bit/train/bremen/bremen_000000_000019_leftImg8bit.png  --dataset cityscapes --model deeplabv3plus_mobilenet --ckpt checkpoints/best_deeplabv3plus_mobilenet_cityscapes_os16.pth --save_val_results_to test_results\n```\n\nImage folder:\n```bash\npython predict.py --input datasets/data/cityscapes/leftImg8bit/train/bremen  --dataset cityscapes --model deeplabv3plus_mobilenet --ckpt checkpoints/best_deeplabv3plus_mobilenet_cityscapes_os16.pth --save_val_results_to test_results\n```\n\n### 6. New backbones\n\nPlease refer to [this commit (Xception)](https://github.com/VainF/DeepLabV3Plus-Pytorch/commit/c4b51e435e32b0deba5fc7c8ff106293df90590d) for more details about how to add new backbones.\n\n### 7. New datasets\n\nYou can train deeplab models on your own datasets. 
Your ``torch.utils.data.Dataset`` should provide a decoding method that transforms your predictions to colorized images, just like the [VOC Dataset](https://github.com/VainF/DeepLabV3Plus-Pytorch/blob/bfe01d5fca5b6bb648e162d522eed1a9a8b324cb/datasets/voc.py#L156):\n```python\n\nclass MyDataset(data.Dataset):\n    ...\n    @classmethod\n    def decode_target(cls, mask):\n        \"\"\"decode semantic mask to RGB image\"\"\"\n        return cls.cmap[mask]\n```\n\n\n## Results\n\n### 1. Performance on Pascal VOC2012 Aug (21 classes, 513 x 513)\n\nTraining: 513x513 random crop  \nvalidation: 513x513 center crop\n\n|  Model          | Batch Size  | FLOPs  | train/val OS   |  mIoU        | Dropbox  | Tencent Weiyun  | \n| :--------        | :-------------: | :----:   | :-----------: | :--------: | :--------: | :----:   |\n| DeepLabV3-MobileNet       | 16      |  6.0G      |   16/16  |  0.701     |    [Download](https://www.dropbox.com/s/uhksxwfcim3nkpo/best_deeplabv3_mobilenet_voc_os16.pth?dl=0)       | [Download](https://share.weiyun.com/A4ubD1DD) |\n| DeepLabV3-ResNet50         | 16      |  51.4G     |  16/16   |  0.769     |    [Download](https://www.dropbox.com/s/3eag5ojccwiexkq/best_deeplabv3_resnet50_voc_os16.pth?dl=0) | [Download](https://share.weiyun.com/33eLjnVL) |\n| DeepLabV3-ResNet101         | 16      |  72.1G     |  16/16   |  0.773     |    [Download](https://www.dropbox.com/s/vtenndnsrnh4068/best_deeplabv3_resnet101_voc_os16.pth?dl=0)       | [Download](https://share.weiyun.com/iCkzATAw)  |\n| DeepLabV3Plus-MobileNet   | 16      |  17.0G      |  16/16   |  0.711    |    [Download](https://www.dropbox.com/s/0idrhwz6opaj7q4/best_deeplabv3plus_mobilenet_voc_os16.pth?dl=0)   | [Download](https://share.weiyun.com/djX6MDwM) |\n| DeepLabV3Plus-ResNet50    | 16      |   62.7G     |  16/16   |  0.772     |    [Download](https://www.dropbox.com/s/dgxyd3jkyz24voa/best_deeplabv3plus_resnet50_voc_os16.pth?dl=0)   | [Download](https://share.weiyun.com/uTM4i2jG) |\n| 
DeepLabV3Plus-ResNet101     | 16      |  83.4G     |  16/16   |  0.783     |    [Download](https://www.dropbox.com/s/bm3hxe7wmakaqc5/best_deeplabv3plus_resnet101_voc_os16.pth?dl=0)   | [Download](https://share.weiyun.com/UNPZr3dk) |\n\n\n### 2. Performance on Cityscapes (19 classes, 1024 x 2048)\n\nTraining: 768x768 random crop  \nvalidation: 1024x2048\n\n|  Model          | Batch Size  | FLOPs  | train/val OS   |  mIoU        | Dropbox  |  Tencent Weiyun  |\n| :--------        | :-------------: | :----:   | :-----------: | :--------: | :--------: |  :----:   |\n| DeepLabV3Plus-MobileNet   | 16      |  135G      |  16/16   |  0.721  |    [Download](https://www.dropbox.com/s/753ojyvsh3vdjol/best_deeplabv3plus_mobilenet_cityscapes_os16.pth?dl=0) | [Download](https://share.weiyun.com/aSKjdpbL) \n| DeepLabV3Plus-ResNet101   | 16      |  N/A      |  16/16   |  0.762  |    [Download](https://drive.google.com/file/d/1t7TC8mxQaFECt4jutdq_NMnWxdm6B-Nb/view?usp=sharing) | N/A |\n\n\n#### Segmentation Results on Pascal VOC2012 (DeepLabv3Plus-MobileNet)\n\n\u003cdiv\u003e\n\u003cimg src=\"samples/1_image.png\"   width=\"20%\"\u003e\n\u003cimg src=\"samples/1_target.png\"  width=\"20%\"\u003e\n\u003cimg src=\"samples/1_pred.png\"    width=\"20%\"\u003e\n\u003cimg src=\"samples/1_overlay.png\" width=\"20%\"\u003e\n\u003c/div\u003e\n\n\u003cdiv\u003e\n\u003cimg src=\"samples/23_image.png\"   width=\"20%\"\u003e\n\u003cimg src=\"samples/23_target.png\"  width=\"20%\"\u003e\n\u003cimg src=\"samples/23_pred.png\"    width=\"20%\"\u003e\n\u003cimg src=\"samples/23_overlay.png\" width=\"20%\"\u003e\n\u003c/div\u003e\n\n\u003cdiv\u003e\n\u003cimg src=\"samples/114_image.png\"   width=\"20%\"\u003e\n\u003cimg src=\"samples/114_target.png\"  width=\"20%\"\u003e\n\u003cimg src=\"samples/114_pred.png\"    width=\"20%\"\u003e\n\u003cimg src=\"samples/114_overlay.png\" width=\"20%\"\u003e\n\u003c/div\u003e\n\n#### Segmentation Results on Cityscapes 
(DeepLabv3Plus-MobileNet)\n\n\u003cdiv\u003e\n\u003cimg src=\"samples/city_1_target.png\"   width=\"45%\"\u003e\n\u003cimg src=\"samples/city_1_overlay.png\"  width=\"45%\"\u003e\n\u003c/div\u003e\n\n\u003cdiv\u003e\n\u003cimg src=\"samples/city_6_target.png\"   width=\"45%\"\u003e\n\u003cimg src=\"samples/city_6_overlay.png\"  width=\"45%\"\u003e\n\u003c/div\u003e\n\n\n#### Visualization of training\n\n![trainvis](samples/visdom-screenshoot.png)\n\n\n## Pascal VOC\n\n### 1. Requirements\n\n```bash\npip install -r requirements.txt\n```\n\n### 2. Prepare Datasets\n\n#### 2.1 Standard Pascal VOC\nYou can run main.py with the \"--download\" option to download and extract the Pascal VOC dataset. The default path is './datasets/data':\n\n```\n/datasets\n    /data\n        /VOCdevkit \n            /VOC2012 \n                /SegmentationClass\n                /JPEGImages\n                ...\n            ...\n        /VOCtrainval_11-May-2012.tar\n        ...\n```\n\n#### 2.2  Pascal VOC trainaug (Recommended!!)\n\nSee chapter 4 of [2]\n\n        The original dataset contains 1464 (train), 1449 (val), and 1456 (test) pixel-level annotated images. We augment the dataset by the extra annotations provided by [76], resulting in 10582 (trainaug) training images. The performance is measured in terms of pixel intersection-over-union averaged across the 21 classes (mIOU).\n\n*./datasets/data/train_aug.txt* includes the file names of 10582 trainaug images (val images are excluded). Please download their labels from [Dropbox](https://www.dropbox.com/s/oeu149j8qtbs1x0/SegmentationClassAug.zip?dl=0) or [Tencent Weiyun](https://share.weiyun.com/5NmJ6Rk). 
Those labels come from [DrSleep's repo](https://github.com/DrSleep/tensorflow-deeplab-resnet).\n\nExtract trainaug labels (SegmentationClassAug) to the VOC2012 directory.\n\n```\n/datasets\n    /data\n        /VOCdevkit  \n            /VOC2012\n                /SegmentationClass\n                /SegmentationClassAug  # \u003c= the trainaug labels\n                /JPEGImages\n                ...\n            ...\n        /VOCtrainval_11-May-2012.tar\n        ...\n```\n\n### 3. Training on Pascal VOC2012 Aug\n\n#### 3.1 Visualize training (Optional)\n\nStart the visdom server for visualization. Please remove '--enable_vis' if visualization is not needed. \n\n```bash\n# Run visdom server on port 28333\nvisdom -port 28333\n```\n\n#### 3.2 Training with OS=16\n\nRun main.py with *\"--year 2012_aug\"* to train your model on Pascal VOC2012 Aug. You can also parallelize training across 4 GPUs with '--gpu_id 0,1,2,3'.\n\n**Note: There is no SyncBN in this repo, so training with *multiple GPUs and a small batch size* may degrade performance. See [PyTorch-Encoding](https://hangzhang.org/PyTorch-Encoding/tutorials/syncbn.html) for more details about SyncBN**\n\n```bash\npython main.py --model deeplabv3plus_mobilenet --enable_vis --vis_port 28333 --gpu_id 0 --year 2012_aug --crop_val --lr 0.01 --crop_size 513 --batch_size 16 --output_stride 16\n```\n\n#### 3.3 Continue training\n\nRun main.py with '--continue_training' to restore the state_dict of optimizer and scheduler from YOUR_CKPT.\n\n```bash\npython main.py ... --ckpt YOUR_CKPT --continue_training\n```\n\n#### 3.4 Testing\n\nResults will be saved at ./results.\n\n```bash\npython main.py --model deeplabv3plus_mobilenet --enable_vis --vis_port 28333 --gpu_id 0 --year 2012_aug --crop_val --lr 0.01 --crop_size 513 --batch_size 16 --output_stride 16 --ckpt checkpoints/best_deeplabv3plus_mobilenet_voc_os16.pth --test_only --save_val_results\n```\n\n## Cityscapes\n\n### 1. 
Download cityscapes and extract it to 'datasets/data/cityscapes'\n\n```\n/datasets\n    /data\n        /cityscapes\n            /gtFine\n            /leftImg8bit\n```\n\n### 2. Train your model on Cityscapes\n\n```bash\npython main.py --model deeplabv3plus_mobilenet --dataset cityscapes --enable_vis --vis_port 28333 --gpu_id 0  --lr 0.1  --crop_size 768 --batch_size 16 --output_stride 16 --data_root ./datasets/data/cityscapes \n```\n\n## Reference\n\n[1] [Rethinking Atrous Convolution for Semantic Image Segmentation](https://arxiv.org/abs/1706.05587)\n\n[2] [Encoder-Decoder with Atrous Separable Convolution for Semantic Image Segmentation](https://arxiv.org/abs/1802.02611)\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fvainf%2Fdeeplabv3plus-pytorch","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fvainf%2Fdeeplabv3plus-pytorch","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fvainf%2Fdeeplabv3plus-pytorch/lists"}