https://github.com/ehsanik/segan

SeGAN: Segmenting and Generating the Invisible (https://arxiv.org/pdf/1703.10239.pdf)

computer-vision deep-learning generative-adversarial-network image-generation segan segmentation

Host: GitHub
URL: https://github.com/ehsanik/segan
Owner: ehsanik
License: other
Created: 2018-05-04T21:08:45.000Z
Default Branch: master
Last Pushed: 2021-11-17T20:47:23.000Z
Last Synced: 2023-10-20T19:38:34.076Z
Topics: computer-vision, deep-learning, generative-adversarial-network, image-generation, segan, segmentation
Language: Lua
Homepage:
Size: 1020 KB
Stars: 62
Watchers: 5
Forks: 12
Open Issues: 0
Metadata Files:
- Readme: README.md
- License: LICENSE

# [SeGAN: Segmenting and Generating the Invisible](https://arxiv.org/abs/1703.10239)
This project was presented as a spotlight at CVPR 2018.

### Abstract

Humans have a strong ability to make inferences about the appearance of the invisible and occluded parts of scenes. For example, when we look at the scene on the left, we can make predictions about what is behind the coffee table, and can even complete the sofa based on the visible parts of the sofa, the coffee table, and what we know in general about sofas and coffee tables and how they occlude each other.

SeGAN can learn to:

- generate the appearance of the occluded parts of objects,
- segment the invisible parts of objects,
- reliably segment natural images, although it is trained on synthetic photo-realistic images,
- infer depth layering by reasoning about occluder-occludee relations.

### Citation

If you find this project useful in your research, please consider citing:

    @inproceedings{ehsani2018segan,
      title={Segan: Segmenting and generating the invisible},
      author={Ehsani, Kiana and Mottaghi, Roozbeh and Farhadi, Ali},
      booktitle={CVPR},
      year={2018}
    }

### Prerequisites

- Torch 7 and its dependencies from [this repository](https://github.com/torch/distro)
- Linux OS
- NVIDIA GPU + CUDA + CuDNN

### Installation

1. Clone the repository using the command:

       git clone https://github.com/ehsanik/SeGAN
       cd SeGAN

2. Download the dataset from [here](https://drive.google.com/file/d/1TfrP4Sptm6wPMdrn9MrWghfTNAMTCtlY/view?usp=sharing) and extract it.
3. Make a link to the dataset:

       ln -s /PATH/TO/DATASET dyce_data

4. Download the pretrained weights from [here](https://drive.google.com/file/d/1cGXaO8rHLOVwuVZOXw3tuDDfNxw2eGbL/view?usp=sharing) and extract them.
5. Make a link to the weights folder:

       ln -s /PATH/TO/WEIGHTS weights
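Before training or testing, it can help to confirm that both links resolve. The following is a minimal sketch and not part of the repository: the helper name `check_link` is hypothetical, and it assumes only the link names `dyce_data` and `weights` used in the steps above.

```shell
#!/bin/sh
# Hypothetical helper (not part of SeGAN): succeed only if $1 is a
# symbolic link that resolves to an existing directory.
check_link() {
    if [ -L "$1" ] && [ -d "$1" ]; then
        echo "ok: $1 -> $(readlink "$1")"
    else
        echo "missing or broken link: $1" >&2
        return 1
    fi
}

# Run from the repository root after steps 3 and 5.
check_link dyce_data || echo "re-run: ln -s /PATH/TO/DATASET dyce_data"
check_link weights   || echo "re-run: ln -s /PATH/TO/WEIGHTS weights"
```

A dangling link (for example, after the extracted folder has been moved) fails the `-d` test, so the check reports it rather than letting a later `th main.lua` run fail on a missing path.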

### Dataset

We introduce DYCE, a dataset of synthetic occluded objects. It consists of photo-realistic images with natural configurations of objects, all taken in indoor scenes. The annotations for each image contain the segmentation masks for the visible and invisible regions of objects. The images are obtained by taking snapshots of our 3D synthetic scenes.

##### Statistics

We use 11 synthetic scenes: 7 for training and validation and 4 for testing. Overall there are 5 living rooms and 6 kitchens, of which 2 living rooms and 2 kitchens are used for testing, leaving 3 living rooms and 4 kitchens for training and validation. On average, each scene contains 60 objects, and the number of visible objects per image is 17.5 (an object counts as visible if it has at least 10 visible pixels). No object instance appears in both the train and test scenes.

The dataset can be downloaded from [here](https://drive.google.com/file/d/1TfrP4Sptm6wPMdrn9MrWghfTNAMTCtlY/view?usp=sharing).

### Train

To train your own model:

```
th main.lua -baseLR 1e-3 -end2end -istrain "train"
```

See `data_settings.lua` for additional command-line options.

### Test

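Results reported in the paper:

| Model | Segmentation: Visible ∪ Invisible | Segmentation: Visible | Segmentation: Invisible | Texture: L1 | Texture: L2 |
| --- | --- | --- | --- | --- | --- |
| Multipath | 47.51 | 48.58 | 6.01 | - | - |
| SeGAN (ours) w/ SV_predicted | 68.78 | 64.76 | 15.59 | 0.070 | 0.023 |
| SeGAN (ours) w/ SV_gt | 75.71 | 68.05 | 23.26 | 0.026 | 0.008 |

To test using the pretrained model and reproduce the results in the paper: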

```
th main.lua -weights_segmentation "weights/segment" -end2end -weights_texture "weights/texture" -istrain "test" -predictedSV
```

To test using the ground-truth visible mask as input instead of the predicted mask:

```
th main.lua -weights_segmentation "weights/segment_gt_sv" -end2end -weights_texture "weights/texture_gt_sv" -istrain "test"
```
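The two test modes differ only in the weight folders and the `-predictedSV` flag. As a convenience, the sketch below prints the command for either mode without running it. The function name `segan_test_cmd` and its `predicted`/`gt` arguments are hypothetical and not part of the repository; the commands themselves are copied from above.

```shell
#!/bin/sh
# Hypothetical wrapper (not part of SeGAN): print the test command for
# one of the two modes described above.
#   predicted -> visible mask predicted by the network (-predictedSV)
#   gt        -> ground-truth visible mask
segan_test_cmd() {
    case "$1" in
        predicted)
            echo 'th main.lua -weights_segmentation "weights/segment" -end2end -weights_texture "weights/texture" -istrain "test" -predictedSV'
            ;;
        gt)
            echo 'th main.lua -weights_segmentation "weights/segment_gt_sv" -end2end -weights_texture "weights/texture_gt_sv" -istrain "test"'
            ;;
        *)
            echo "usage: segan_test_cmd predicted|gt" >&2
            return 2
            ;;
    esac
}

# Dry run: show both commands without executing them.
segan_test_cmd predicted
segan_test_cmd gt
```

To actually run a mode, one option is `eval "$(segan_test_cmd gt)"` from the repository root, with Torch 7 installed and the `weights` link in place.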

## Acknowledgments
The code for the GAN network borrows heavily from [pix2pix](https://github.com/phillipi/pix2pix).
