{"id":13578065,"url":"https://github.com/visinf/1-stage-wseg","last_synced_at":"2025-04-05T15:32:20.715Z","repository":{"id":43913716,"uuid":"249653582","full_name":"visinf/1-stage-wseg","owner":"visinf","description":"Single-Stage Semantic Segmentation from Image Labels (CVPR 2020)","archived":false,"fork":false,"pushed_at":"2021-11-10T17:17:34.000Z","size":7086,"stargazers_count":379,"open_issues_count":6,"forks_count":43,"subscribers_count":21,"default_branch":"master","last_synced_at":"2024-11-05T15:49:07.216Z","etag":null,"topics":["cvpr2020","semantic-segmentation","weakly-supervised-learning"],"latest_commit_sha":null,"homepage":"","language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"apache-2.0","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/visinf.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null}},"created_at":"2020-03-24T08:33:52.000Z","updated_at":"2024-09-26T08:22:21.000Z","dependencies_parsed_at":"2022-09-08T23:11:18.352Z","dependency_job_id":null,"html_url":"https://github.com/visinf/1-stage-wseg","commit_stats":null,"previous_names":[],"tags_count":0,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/visinf%2F1-stage-wseg","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/visinf%2F1-stage-wseg/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/visinf%2F1-stage-wseg/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/visinf%2F1-stage-wseg/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/visinf","download_url":"https://codeload.github.com/visinf/1-stage-wseg/tar.gz/refs/heads/master","
host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":247359385,"owners_count":20926413,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["cvpr2020","semantic-segmentation","weakly-supervised-learning"],"created_at":"2024-08-01T15:01:26.897Z","updated_at":"2025-04-05T15:32:15.707Z","avatar_url":"https://github.com/visinf.png","language":"Python","funding_links":[],"categories":["Python"],"sub_categories":[],"readme":"# Single-Stage Semantic Segmentation from Image Labels\n\n[![License](https://img.shields.io/badge/License-Apache%202.0-blue.svg)](https://opensource.org/licenses/Apache-2.0)\n[![Framework](https://img.shields.io/badge/PyTorch-%23EE4C2C.svg?\u0026logo=PyTorch\u0026logoColor=white)](https://pytorch.org/)\n\nThis repository contains the original implementation of our paper:\n\n\n**Single-stage Semantic Segmentation from Image Labels**\u003cbr\u003e\n*[Nikita Araslanov](https://arnike.github.io) and [Stefan Roth](https://www.visinf.tu-darmstadt.de/team_members/sroth/sroth.en.jsp)*\u003cbr\u003e\nCVPR 2020. 
[[pdf](https://openaccess.thecvf.com/content_CVPR_2020/papers/Araslanov_Single-Stage_Semantic_Segmentation_From_Image_Labels_CVPR_2020_paper.pdf)] [[supp](https://openaccess.thecvf.com/content_CVPR_2020/supplemental/Araslanov_Single-Stage_Semantic_Segmentation_CVPR_2020_supplemental.pdf)]\n[[arXiv](https://arxiv.org/abs/2005.08104)]\n\nContact: Nikita Araslanov \u003cfname.lname@visinf.tu-darmstadt.de\u003e\n\n\n| \u003cimg src=\"figures/results.gif\" alt=\"drawing\" width=\"480\"/\u003e\u003cbr\u003e |\n|:---|\n| We attain competitive results by training a single network model \u003cbr\u003e for segmentation in a self-supervised fashion using only \u003cbr\u003e image-level annotations (one run of 20 epochs on Pascal VOC). |\n\n### Setup\n0. **Minimum requirements.** This project was originally developed with Python 3.6, PyTorch 1.0 and CUDA 9.0. Training requires at least two Titan X GPUs (12 GB memory each).\n1. **Set up your Python environment.** Please clone the repository and install the dependencies. We recommend using the Anaconda 3 distribution:\n    ```\n    conda create -n \u003cenvironment_name\u003e --file requirements.txt\n    ```\n2. **Download and link to the dataset.** We train our model on the original Pascal VOC 2012 augmented with the SBD data (10K images in total). Download the data from:\n    - VOC: [Training/Validation (2GB .tar file)](http://host.robots.ox.ac.uk/pascal/VOC/voc2012/VOCtrainval_11-May-2012.tar)\n    - SBD: [Training (1.4GB .tgz file)](http://www.eecs.berkeley.edu/Research/Projects/CS/vision/grouping/semantic_contours/benchmark.tgz)\n\n    Link to the data:\n    ```\n    ln -s \u003cyour_path_to_voc\u003e \u003cproject\u003e/data/voc\n    ln -s \u003cyour_path_to_sbd\u003e \u003cproject\u003e/data/sbd\n    ```\n    Make sure that the first directory in `data/voc` is `VOCdevkit` and the first directory in `data/sbd` is `benchmark_RELEASE`.\n3. 
**Download pre-trained models.** Download the initial weights (pre-trained on ImageNet) for the backbones you are planning to use and place them into `\u003cproject\u003e/models/weights/`.\n\n    | Backbone | Initial Weights | Comment |\n    |:---:|:---:|:---:|\n    | WideResNet38 | [ilsvrc-cls_rna-a1_cls1000_ep-0001.pth (402M)](https://download.visinf.tu-darmstadt.de/data/2020-cvpr-araslanov-1-stage-wseg/models/ilsvrc-cls_rna-a1_cls1000_ep-0001.pth) | Converted from [mxnet](https://github.com/itijyou/ademxapp) |\n    | VGG16 | [vgg16_20M.pth (79M)](https://download.visinf.tu-darmstadt.de/data/2020-cvpr-araslanov-1-stage-wseg/models/vgg16_20M.pth) | Converted from [Caffe](http://liangchiehchen.com/projects/Init%20Models.html) |\n    | ResNet50 | [resnet50-19c8e357.pth](https://download.pytorch.org/models/resnet50-19c8e357.pth) | PyTorch official |\n    | ResNet101 | [resnet101-5d3b4d8f.pth](https://download.pytorch.org/models/resnet101-5d3b4d8f.pth) | PyTorch official |\n\n\n### Training, Inference and Evaluation\nThe directory `launch` contains template bash scripts for training, inference and evaluation.\n\n**Training.** For each run, you need to specify the names of two variables, for example:\n```bash\nEXP=baselines\nRUN_ID=v01\n```\nRunning `bash ./launch/run_voc_resnet38.sh` will create a directory `./logs/pascal_voc/baselines/v01` with tensorboard events and will save snapshots into `./snapshots/pascal_voc/baselines/v01`.\n\n**Inference.** To generate final masks, please use the script `./launch/infer_val.sh`. You will need to specify:\n* `EXP` and `RUN_ID`, the values you used for training;\n* `OUTPUT_DIR`, the path where the masks will be saved;\n* `FILELIST`, the file listing the data split;\n* `SNAPSHOT`, the model suffix in the format `e000Xs0.000`. For example, `e020Xs0.928`;\n* (optionally) `EXTRA_ARGS`, additional arguments to the inference script.\n\n**Evaluation.** To compute the IoU of the masks, please run `./launch/eval_seg.sh`. 
You will need to specify `SAVE_DIR`, the directory that contains the masks, and `FILELIST`, which specifies the split for evaluation.\n\n### Pre-trained model\nFor testing, we provide our pre-trained WideResNet38 model:\n\n| Backbone | Val | Val (+ CRF) | Link |\n|:---:|:---:|:---:|---:|\n| WideResNet38 | 59.7 | 62.7 | [model_enc_e020Xs0.928.pth (527M)](https://download.visinf.tu-darmstadt.de/data/2020-cvpr-araslanov-1-stage-wseg/models/model_enc_e020Xs0.928.pth) |\n\nWe also release the masks predicted by this model:\n\n| Split | IoU | IoU (+ CRF) | Link | Comment |\n|:---:|:---:|:---:|:---:|:---:|\n| train-clean (VOC+SBD) | 64.7 | 66.9 | [train_results_clean.tgz (2.9G)](https://download.visinf.tu-darmstadt.de/data/2020-cvpr-araslanov-1-stage-wseg/results/train_results_clean.tgz) | Reported IoU is for VOC |\n| val-clean | 63.4 | 65.3 | [val_results_clean.tgz (423M)](https://download.visinf.tu-darmstadt.de/data/2020-cvpr-araslanov-1-stage-wseg/results/val_results_clean.tgz) | |\n| val | 59.7 | 62.7 | [val_results.tgz (427M)](https://download.visinf.tu-darmstadt.de/data/2020-cvpr-araslanov-1-stage-wseg/results/val_results.tgz) | |\n| test | 62.7 | 64.3 | [test_results.tgz (368M)](https://download.visinf.tu-darmstadt.de/data/2020-cvpr-araslanov-1-stage-wseg/results/test_results.tgz) | |\n\nThe suffix `-clean` means we used ground-truth image-level labels to remove masks of the categories not present in the image.\nThese masks are commonly used as pseudo ground truth to train another segmentation model in a fully supervised regime.\n\n## Acknowledgements\nWe thank the PyTorch team and Jiwoon Ahn for releasing his [code](https://github.com/jiwoon-ahn/psa) that helped in the early stages of this project.\n\n## Citation\nWe hope that you find this work useful. 
If you would like to acknowledge us, please use the following citation:\n```\n@InProceedings{Araslanov:2020:SSS,\nauthor = {Araslanov, Nikita and Roth, Stefan},\ntitle = {Single-Stage Semantic Segmentation From Image Labels},\nbooktitle = {IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)},\nmonth = {June},\npages = {4253--4262},\nyear = {2020}\n}\n```\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fvisinf%2F1-stage-wseg","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fvisinf%2F1-stage-wseg","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fvisinf%2F1-stage-wseg/lists"}