https://github.com/visinf/1-stage-wseg

Single-Stage Semantic Segmentation from Image Labels (CVPR 2020)

Host: GitHub
URL: https://github.com/visinf/1-stage-wseg
Owner: visinf
License: apache-2.0
Created: 2020-03-24T08:33:52.000Z (about 5 years ago)
Default Branch: master
Last Pushed: 2021-11-10T17:17:34.000Z (over 3 years ago)
Last Synced: 2024-11-05T15:49:07.216Z (6 months ago)
Topics: cvpr2020, semantic-segmentation, weakly-supervised-learning
Language: Python
Homepage:
Size: 6.76 MB
Stars: 379
Watchers: 21
Forks: 43
Open Issues: 6
Metadata Files:
- Readme: README.md
- License: LICENSE


# Single-Stage Semantic Segmentation from Image Labels

[![License](https://img.shields.io/badge/License-Apache%202.0-blue.svg)](https://opensource.org/licenses/Apache-2.0)

[![Framework](https://img.shields.io/badge/PyTorch-%23EE4C2C.svg?&logo=PyTorch&logoColor=white)](https://pytorch.org/)

This repository contains the original implementation of our paper:

**Single-stage Semantic Segmentation from Image Labels**


*[Nikita Araslanov](https://arnike.github.io) and [Stefan Roth](https://www.visinf.tu-darmstadt.de/team_members/sroth/sroth.en.jsp)*


CVPR 2020. [[pdf](https://openaccess.thecvf.com/content_CVPR_2020/papers/Araslanov_Single-Stage_Semantic_Segmentation_From_Image_Labels_CVPR_2020_paper.pdf)] [[supp](https://openaccess.thecvf.com/content_CVPR_2020/supplemental/Araslanov_Single-Stage_Semantic_Segmentation_CVPR_2020_supplemental.pdf)]

[[arXiv](https://arxiv.org/abs/2005.08104)]

Contact: Nikita Araslanov 

We attain competitive results by training a single network model for segmentation in a self-supervised fashion using only image-level annotations (one run of 20 epochs on Pascal VOC).

### Setup

0. **Minimum requirements.** This project was originally developed with Python 3.6, PyTorch 1.0 and CUDA 9.0. Training requires at least two Titan X GPUs (12 GB of memory each).

1. **Set up your Python environment.** Please clone the repository and install the dependencies. We recommend using the Anaconda 3 distribution:

    ```
    conda create -n  --file requirements.txt
    ```

2. **Download and link to the dataset.** We train our model on the original Pascal VOC 2012 augmented with the SBD data (10K images in total). Download the data from:

    - VOC: [Training/Validation (2GB .tar file)](http://host.robots.ox.ac.uk/pascal/VOC/voc2012/VOCtrainval_11-May-2012.tar)

    - SBD: [Training (1.4GB .tgz file)](http://www.eecs.berkeley.edu/Research/Projects/CS/vision/grouping/semantic_contours/benchmark.tgz)

    Link to the data:

    ```
    ln -s  /data/voc
    ln -s  /data/sbd
    ```

    Make sure that the first directory in `data/voc` is `VOCdevkit`; the first directory in `data/sbd` is `benchmark_RELEASE`.

3. **Download pre-trained models.** Download the initial weights (pre-trained on ImageNet) for the backbones you are planning to use and place them into `/models/weights/`.

    | Backbone | Initial Weights | Comment |
    |:---:|:---:|:---:|
    | WideResNet38 | [ilsvrc-cls_rna-a1_cls1000_ep-0001.pth (402M)](https://download.visinf.tu-darmstadt.de/data/2020-cvpr-araslanov-1-stage-wseg/models/ilsvrc-cls_rna-a1_cls1000_ep-0001.pth) | Converted from [mxnet](https://github.com/itijyou/ademxapp) |
    | VGG16 | [vgg16_20M.pth (79M)](https://download.visinf.tu-darmstadt.de/data/2020-cvpr-araslanov-1-stage-wseg/models/vgg16_20M.pth) | Converted from [Caffe](http://liangchiehchen.com/projects/Init%20Models.html) |
    | ResNet50 | [resnet50-19c8e357.pth](https://download.pytorch.org/models/resnet50-19c8e357.pth) | PyTorch official |
    | ResNet101 | [resnet101-5d3b4d8f.pth](https://download.pytorch.org/models/resnet101-5d3b4d8f.pth) | PyTorch official |
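
The directory layout expected in step 2 can be sanity-checked before training. The following is a minimal sketch, not part of the repository: the helper `check_data_layout` and its `project_root` argument are hypothetical names, and the check only tests that `VOCdevkit` and `benchmark_RELEASE` exist under `data/voc` and `data/sbd`.

```python
from pathlib import Path

# Expected first-level directory inside each dataset link (from step 2).
EXPECTED = {
    "voc": "VOCdevkit",
    "sbd": "benchmark_RELEASE",
}


def check_data_layout(project_root):
    """Return the expected dataset directories that are missing.

    `project_root` is a hypothetical argument: the directory holding `data/`.
    """
    data_dir = Path(project_root) / "data"
    missing = []
    for link_name, first_dir in EXPECTED.items():
        candidate = data_dir / link_name / first_dir
        # is_dir() follows symlinks, so linked datasets are accepted.
        if not candidate.is_dir():
            missing.append(str(candidate))
    return missing


if __name__ == "__main__":
    problems = check_data_layout(".")
    if problems:
        print("Missing:", *problems, sep="\n  ")
    else:
        print("Dataset layout looks as expected.")
```

An empty return value means both directories were found; anything else lists the paths that still need to be linked.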

### Training, Inference and Evaluation

The directory `launch` contains template bash scripts for training, inference and evaluation. 

**Training.** For each run, you need to set two variables, for example:

```bash
EXP=baselines
RUN_ID=v01
```

Running `bash ./launch/run_voc_resnet38.sh` will create a directory `./logs/pascal_voc/baselines/v01` with tensorboard events and will save snapshots into `./snapshots/pascal_voc/baselines/v01`.
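
To make the naming scheme explicit, here is a small sketch of how `EXP` and `RUN_ID` map to the two directories mentioned above. The function `run_directories` and its `dataset` and `root` parameters are illustrative assumptions; the training script itself may assemble these paths differently.

```python
from pathlib import PurePosixPath


def run_directories(exp, run_id, dataset="pascal_voc", root="."):
    """Return (log_dir, snapshot_dir) following the pattern in the text."""
    root = PurePosixPath(root)
    log_dir = root / "logs" / dataset / exp / run_id
    snapshot_dir = root / "snapshots" / dataset / exp / run_id
    return str(log_dir), str(snapshot_dir)


log_dir, snapshot_dir = run_directories("baselines", "v01")
print(log_dir)       # logs/pascal_voc/baselines/v01
print(snapshot_dir)  # snapshots/pascal_voc/baselines/v01
```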

**Inference.** To generate the final masks, please use the script `./launch/infer_val.sh`. You will need to specify:

* `EXP` and `RUN_ID`: the values you used for training;

* `OUTPUT_DIR`: the path where the masks will be saved;

* `FILELIST`: the file that lists the data split;

* `SNAPSHOT`: the model suffix in the format `e000Xs0.000`, for example `e020Xs0.928`;

* (optionally) `EXTRA_ARGS`: additional arguments to the inference script.
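
The `SNAPSHOT` format can be validated or generated programmatically. The sketch below is only an illustration: `parse_snapshot` and `format_snapshot` are hypothetical helpers, and reading the first field as an epoch counter and the second as a score is an assumption, since the text gives the format but not its meaning.

```python
import re

# Matches suffixes such as "e020Xs0.928" (format "e000Xs0.000" in the text).
SNAPSHOT_RE = re.compile(r"^e(\d{3})Xs(\d\.\d{3})$")


def parse_snapshot(suffix):
    """Split a snapshot suffix into its two numeric fields.

    Assumption: the first field is an epoch counter, the second a score.
    """
    match = SNAPSHOT_RE.match(suffix)
    if match is None:
        raise ValueError(f"unexpected snapshot suffix: {suffix!r}")
    return int(match.group(1)), float(match.group(2))


def format_snapshot(epoch, score):
    """Inverse of parse_snapshot: build a suffix from its two fields."""
    return f"e{epoch:03d}Xs{score:.3f}"


print(parse_snapshot("e020Xs0.928"))  # (20, 0.928)
print(format_snapshot(20, 0.928))     # e020Xs0.928
```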

**Evaluation.** To compute the IoU of the masks, please run `./launch/eval_seg.sh`. You will need to specify `SAVE_DIR`, the directory that contains the masks, and `FILELIST`, the split to evaluate.
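
For readers unfamiliar with the metric, the following is a minimal pure-Python sketch of mean IoU over a set of masks. It is not the repository's evaluation script: the function `mean_iou`, the nested-list mask representation, and the handling of `ignore_label=255` (the void label in Pascal VOC annotations) are assumptions made for illustration. A practical implementation would operate on arrays.

```python
from collections import Counter


def mean_iou(pred_masks, gt_masks, num_classes, ignore_label=255):
    """Mean intersection-over-union over classes present in pred or gt.

    Each mask is a nested list of integer class ids (one 2D mask per image).
    """
    intersection = Counter()
    pred_area = Counter()
    gt_area = Counter()
    for pred, gt in zip(pred_masks, gt_masks):
        for pred_row, gt_row in zip(pred, gt):
            for p, g in zip(pred_row, gt_row):
                if g == ignore_label:
                    continue  # skip pixels without a valid label
                pred_area[p] += 1
                gt_area[g] += 1
                if p == g:
                    intersection[p] += 1

    ious = []
    for c in range(num_classes):
        union = pred_area[c] + gt_area[c] - intersection[c]
        if union > 0:  # classes absent from both pred and gt are skipped
            ious.append(intersection[c] / union)
    return sum(ious) / len(ious) if ious else 0.0


pred = [[[0, 1], [1, 1]]]
gt = [[[0, 1], [0, 1]]]
# Class 0: IoU = 1/2, class 1: IoU = 2/3, mean = 7/12.
print(round(mean_iou(pred, gt, num_classes=2), 4))  # 0.5833
```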

### Pre-trained model

For testing, we provide our pre-trained WideResNet38 model:

| Backbone | Val | Val (+ CRF) | Link |
|:---:|:---:|:---:|---:|
| WideResNet38 | 59.7 | 62.7 | [model_enc_e020Xs0.928.pth (527M)](https://download.visinf.tu-darmstadt.de/data/2020-cvpr-araslanov-1-stage-wseg/models/model_enc_e020Xs0.928.pth) |

We also release the masks predicted by this model:

| Split | IoU | IoU (+ CRF) | Link | Comment |
|:---:|:---:|:---:|:---:|:---:|
| train-clean (VOC+SBD) | 64.7 | 66.9 | [train_results_clean.tgz (2.9G)](https://download.visinf.tu-darmstadt.de/data/2020-cvpr-araslanov-1-stage-wseg/results/train_results_clean.tgz) | Reported IoU is for VOC |
| val-clean | 63.4 | 65.3 | [val_results_clean.tgz (423M)](https://download.visinf.tu-darmstadt.de/data/2020-cvpr-araslanov-1-stage-wseg/results/val_results_clean.tgz) | |
| val | 59.7 | 62.7 | [val_results.tgz (427M)](https://download.visinf.tu-darmstadt.de/data/2020-cvpr-araslanov-1-stage-wseg/results/val_results.tgz) | |
| test | 62.7 | 64.3 | [test_results.tgz (368M)](https://download.visinf.tu-darmstadt.de/data/2020-cvpr-araslanov-1-stage-wseg/results/test_results.tgz) | |

The suffix `-clean` means we used ground-truth image-level labels to remove masks of the categories not present in the image.

These masks are commonly used as pseudo ground truth to train another segmentation model in a fully supervised regime.
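
The `-clean` post-processing can be illustrated with a short sketch. This is not the authors' code: `clean_mask` is a hypothetical helper, and reassigning the removed pixels to the background class (id 0) is an assumption, since the text only states that masks of absent categories are removed.

```python
def clean_mask(mask, image_labels, background=0):
    """Reassign pixels of categories absent from `image_labels` to background.

    Assumption: removed pixels become background; the text does not say
    how they are relabelled.
    """
    allowed = set(image_labels) | {background}
    return [[p if p in allowed else background for p in row] for row in mask]


mask = [[0, 15, 15],
        [8, 15, 0]]
# Only category 15 is present according to the image-level labels,
# so the stray pixel of category 8 is dropped.
print(clean_mask(mask, image_labels={15}))  # [[0, 15, 15], [0, 15, 0]]
```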

## Acknowledgements

We thank the PyTorch team and Jiwoon Ahn for releasing his [code](https://github.com/jiwoon-ahn/psa), which helped in the early stages of this project.

## Citation

We hope that you find this work useful. If you would like to acknowledge us, please use the following citation:

```
@InProceedings{Araslanov:2020:SSS,
  author    = {Araslanov, Nikita and Roth, Stefan},
  title     = {Single-Stage Semantic Segmentation From Image Labels},
  booktitle = {IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)},
  month     = {June},
  pages     = {4253--4262},
  year      = {2020}
}
```
