{"id":18614562,"url":"https://github.com/minar09/drn-pytorch","last_synced_at":"2025-04-11T00:30:50.993Z","repository":{"id":67769952,"uuid":"163288008","full_name":"minar09/DRN-PyTorch","owner":"minar09","description":"PyTorch implementation of Dilated Residual Networks for semantic image segmentation","archived":false,"fork":false,"pushed_at":"2018-12-31T03:44:24.000Z","size":412,"stargazers_count":4,"open_issues_count":0,"forks_count":3,"subscribers_count":2,"default_branch":"master","last_synced_at":"2025-03-25T06:51:12.482Z","etag":null,"topics":["dilated","dilated-convolution","dilated-resnet","drn","pytorch","semantic-segmentation"],"latest_commit_sha":null,"homepage":null,"language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"bsd-3-clause","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/minar09.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2018-12-27T11:45:06.000Z","updated_at":"2023-08-22T14:18:34.000Z","dependencies_parsed_at":null,"dependency_job_id":"c52468bb-ad51-44f4-9fa1-cf902b04e61d","html_url":"https://github.com/minar09/DRN-PyTorch","commit_stats":null,"previous_names":[],"tags_count":0,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/minar09%2FDRN-PyTorch","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/minar09%2FDRN-PyTorch/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/minar09%2FDRN-PyTorch/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/minar09%2FDRN-PyTorch/manifests","owner_url":"ht
tps://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/minar09","download_url":"https://codeload.github.com/minar09/DRN-PyTorch/tar.gz/refs/heads/master","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":248322227,"owners_count":21084333,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["dilated","dilated-convolution","dilated-resnet","drn","pytorch","semantic-segmentation"],"created_at":"2024-11-07T03:26:03.551Z","updated_at":"2025-04-11T00:30:50.980Z","avatar_url":"https://github.com/minar09.png","language":"Python","funding_links":[],"categories":[],"sub_categories":[],"readme":"## Overview\n\nThis code provides various models combining dilated convolutions with residual networks. Our models can achieve better performance with fewer parameters than ResNet on [image classification](#image-classification) and [semantic segmentation](#semantic-image-segmentataion).\n\nIf you find this code useful for your publications, please consider citing\n\n```\n@inproceedings{Yu2017,\n    title     = {Dilated Residual Networks},\n    author    = {Fisher Yu and Vladlen Koltun and Thomas Funkhouser},\n    booktitle = {Computer Vision and Pattern Recognition (CVPR)},\n    year      = {2017},\n}\n\n@inproceedings{Yu2016,\n    title     = {Multi-scale context aggregation by dilated convolutions},\n    author    = {Yu, Fisher and Koltun, Vladlen},\n    booktitle = {International Conference on Learning Representations (ICLR)},\n    year      = {2016}\n}\n```\n\n## Code Highlights\n\n- The pretrained model can be loaded using the Pytorch model zoo API. 
[Example here](https://github.com/fyu/drn/blob/master/drn.py#L264).\n- Pytorch based image classification and semantic image segmentation.\n- BatchNorm synchronization across multiple GPUs.\n- High-resolution class activation maps for state-of-the-art weakly supervised object localization.\n- [DRN-D-105](#semantic-image-segmentataion) gets 76.3% mIoU on Cityscapes with only fine training annotation and no context module.\n\n## Image Classification\n\nImage classification is meant to be a controlled study to understand the role of high resolution feature maps in image classification and the class activations arising from them. Based on the investigation, we are able to design more efficient networks for learning high-resolution image representation. These networks are of practical use in semantic image segmentation, as detailed in [image segmentation section](#semantic-image-segmentataion).\n\n### Models\n\nComparison of classification error rate on ImageNet validation set and numbers of parameters. Each model is evaluated on a single center 224x224 crop from images resized so that the shorter side is 256 pixels long.\n\n| Name | Top-1 | Top-5 | Params |\n| --- | :---: | :---: | :---: |\n| ResNet-18 | 30.4% | 10.8% | 11.7M |\n| DRN-A-18 | 28.0% | 9.5% | 11.7M |\n| DRN-D-22 | 25.8% | 8.2% | 16.4M |\n| DRN-C-26 | 24.9% | 7.6% | 21.1M |\n| ResNet-34 | 27.7% | 8.7% | 21.8M |\n| DRN-A-34 | 24.8% | 7.5% | 21.8M |\n| DRN-D-38 | 23.8% | 6.9% | 26.5M |\n| DRN-C-42 | 22.9% | 6.6% | 31.2M |\n| ResNet-50 | 24.0% | 7.0% | 25.6M |\n| DRN-A-50 | 22.9% | 6.6% | 25.6M |\n| DRN-D-54 | 21.2% | 5.9% | 35.8M |\n| DRN-C-58 | 21.7% | 6.0% | 41.6M |\n| ResNet-101 | 22.4% | 6.2% | 44.5M |\n| DRN-D-105 | 20.6% | 5.5% | 54.8M |\n| ResNet-152 | 22.2% | 6.2% | 60.2M |\n\nThe figure below groups the parameter and error rate comparison based on network structures.\n\n![comparison](doc/drn_comp.png)\n\n\n### Training and Testing\n\nThe code is written in Python using [Pytorch](https://github.com/pytorch/pytorch). 
I started with code in [torchvision](https://github.com/pytorch/vision). Please check their license as well if copyright is your concern. Software dependency:\n\n* Python 3\n* Pillow\n* pytorch\n* torchvision\n\n**Note** If you want to train your own semantic segmentation model, make sure your Pytorch version is greater than [0.2.0](https://github.com/pytorch/pytorch/releases) or includes commit [78020a](https://github.com/pytorch/pytorch/pull/2077/commits/78020a52abb76fcb1c344b3c42fbe8610cc387e4).\n\nGo to [this page](https://github.com/facebook/fb.resnet.torch/blob/master/INSTALL.md#download-the-imagenet-dataset) to prepare ImageNet 1K data.\n\nTo test a model on ImageNet validation set:\n```\npython3 classify.py test --arch drn_c_26 -j 4 \u003cimagenet dir\u003e --pretrained\n```\n\nTo train a new model:\n```\npython3 classify.py train --arch drn_c_26 -j 8 \u003cimagenet dir\u003e --epochs 120\n```\n\nBesides `drn_c_26`, we also provide `drn_c_42` and `drn_c_58`. They are in DRN-C family as described in [Dilated Residual Networks](https://umich.app.box.com/v/drn). DRN-D models are simplified versions of DRN-C. Their code names are `drn_d_22`, `drn_d_38`, `drn_d_54`, and `drn_d_105`.\n\n## Semantic Image Segmentataion\n\n### Models\n\nComparison of mIoU on Cityscapes and numbers of parameters.\n\n| Name | mIoU | Params |\n| --- | :---: | :---: |\n| DRN-A-50 | 67.3% | 25.6M |\n| DRN-C-26 | 68.0% | 21.1M |\n| DRN-C-42 | 70.9% | 31.2M |\n| DRN-D-22 | 68.0% | 16.4M |\n| DRN-D-38 | 71.4% | 26.5M |\n| DRN-D-105* | 75.6% | 54.8M |\n\n*trained with poly learning rate, random scaling and rotations.\n\nDRN-D-105 gets 76.3% mIoU on Cityscapes testing set with multi-scale testing, poly learning rate and data augmentation with random rotation and scaling in training. 
Full results are [here](datasets/cityscapes/drn-d-105.csv).\n\n### Prepare Data\n\nThe segmentation data folder is expected to contain the following image lists:\n\n* train_images.txt\n* train_labels.txt\n* val_images.txt\n* val_labels.txt\n* test_images.txt\n\nThe code will also look for `info.json` in the folder. It contains the mean and std of the training images. For example, below is the `info.json` used for training on Cityscapes.\n\n```\n{\n    \"mean\": [\n        0.290101,\n        0.328081,\n        0.286964\n    ],\n    \"std\": [\n        0.182954,\n        0.186566,\n        0.184475\n    ]\n}\n```\n\nEach line in a list is a path to an input image or its label map, relative to the segmentation folder.\n\nFor example, if the data folder is \"/foo/bar\" and train_images.txt in it contains\n```\nleftImg8bit/train/aachen/aachen_000000_000019_leftImg8bit.png\nleftImg8bit/train/aachen/aachen_000001_000019_leftImg8bit.png\n```\nand train_labels.txt contains\n```\ngtFine/train/aachen/aachen_000000_000019_gtFine_trainIds.png\ngtFine/train/aachen/aachen_000001_000019_gtFine_trainIds.png\n```\nthen the first image path is expected at\n```\n/foo/bar/leftImg8bit/train/aachen/aachen_000000_000019_leftImg8bit.png\n```\nand its label map is at\n```\n/foo/bar/gtFine/train/aachen/aachen_000000_000019_gtFine_trainIds.png\n```\n\nIn the training phase, both train_\\* and val_\\* are assumed to be in the data folder. In the validation phase, only val_images.txt and val_labels.txt are needed. In the testing phase, when there are no available labels, only test_images.txt is needed. `segment.py` has a command line option `--phase` and the corresponding acceptable arguments are `train`, `val`, and `test`.\n\nTo set up Cityscapes data, please check this [document](datasets/cityscapes).\n\n### Optimization Setup\n\nThe current segmentation models are trained on basic data augmentation (random crops + flips). 
The learning rate is changed by steps, where it is decreased by a factor of 10 at each step.\n\n### Training\n\nTo train a new model, use\n```\npython3 segment.py train -d \u003cdata_folder\u003e -c \u003ccategory_number\u003e -s 896 \\\n    --arch drn_d_22 --batch-size 32 --epochs 250 --lr 0.01 --momentum 0.9 \\\n    --step 100\n```\n\n`category_number` is the number of categories in segmentation. It is 19 for Cityscapes and 11 for Camvid. The actual label maps should contain values in the range of `[0, category_number)`. Invalid pixels can be labeled as 255 and they will be ignored in training and evaluation. Depending on the batch size, lr and momentum can be 0.01/0.9 or 0.001/0.99.\n\nIf you want to train drn_d_105 to achieve the best results on the Cityscapes dataset, you need to turn on data augmentation and use poly learning rate:\n\n```\npython3 segment.py train -d \u003cdata_folder\u003e -c 19 -s 840 --arch drn_d_105 --random-scale 2 --random-rotate 10 --batch-size 16 --epochs 500 --lr 0.01 --momentum 0.9 -j 16 --lr-mode poly --bn-sync\n\n```\n\nOur case:\n```\npython segment.py train\n```\n\nNote:\n\n - If you use 8 GPUs for 16 crops per batch, the memory used on each GPU is more than 12GB. If you don't have enough GPU memory, you can try a smaller batch size or crop size. A smaller crop size usually hurts the performance more.\n - Batch normalization synchronization across multiple GPUs is necessary to train very deep convolutional networks for semantic segmentation. We provide an implementation as a pytorch extension in `lib/`. However, it is not for the faint-hearted to build from scratch, although a Makefile is provided. So a built binary library for 64-bit Ubuntu is provided. It is tested on Ubuntu 16.04. 
Also remember to add `lib/` to your `PYTHONPATH`.\n\n### Testing\n\nEvaluate models on testing set or any images without ground truth labels using our related pretrained model:\n```\npython3 segment.py test -d \u003cdata_folder\u003e -c \u003ccategory_number\u003e --arch drn_d_22 \\\n    --pretrained \u003cmodel_path\u003e --phase test --batch-size 1\n```\n\nYou can download the pretrained DRN models on Cityscapes here: http://go.yf.io/drn-cityscapes-models.\n\nIf you want to evaluate a checkpoint from your own training, use `--resume` instead of `--pretrained`:\n```\npython3 segment.py test -d \u003cdata_folder\u003e -c \u003ccategory_number\u003e --arch drn_d_22 \\\n    --resume \u003cmodel_path\u003e --phase test --batch-size 1\n```\n\nYou can also turn on multi-scale testing for better results by adding `--ms`:\n\n```\npython3 segment.py test -d \u003cdata_folder\u003e -c \u003ccategory_number\u003e --arch drn_d_105 \\\n    --resume \u003cmodel_path\u003e --phase val --batch-size 1 --ms\n```\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fminar09%2Fdrn-pytorch","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fminar09%2Fdrn-pytorch","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fminar09%2Fdrn-pytorch/lists"}