# HorizonNet

This is the implementation of our CVPR'19 paper "[HorizonNet: Learning Room Layout with 1D Representation and Pano Stretch Data Augmentation](https://arxiv.org/abs/1901.03861)" ([project page](https://sunset1995.github.io/HorizonNet/)).

![](assets/teaser.jpg)

### Update
- **2021.11.04: Report results on the [Zillow Indoor dataset](https://github.com/zillow/zind). (See [the report :clipboard: on ZInD](README_ZInD.md).)**
- 2021.04.03: Check out our new project [HoHoNet](https://github.com/sunset1995/HoHoNet) on this task and more!
- 2021.03.14: (1) Use a mesh instead of a point cloud in the layout viewer. (2) Update the LSD detector dependency.
- 2019.08.19: Report results on the [Structured3D dataset](https://structured3d-dataset.org/). (See [the report :clipboard: on ST3D](README_ST3D.md).)
- 2019.06.15: Bug fixes for general layouts (`dataset.py`, `inference.py` and `misc/post_proc.py`).

### Feature
This repo is a **pure Python** implementation with which you can:
- **Run inference on your images** to get a cuboid or general-shaped room layout
- **View the layout in 3D**
- **Correct the rotation pose** to ensure Manhattan alignment
- **Apply pano stretch augmentation** to your own task by copy and paste (see the sketch after this list)
- **Quantitatively evaluate** 2D IoU, 3D IoU, corner error, and pixel error for cuboid/general shapes
- **Prepare and train on your own dataset**
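
For reference, here is a minimal self-contained sketch of the pano-stretch idea (illustrative only, not the repo's own implementation; `pano_stretch_sketch` is a made-up name). Scaling the room by `kx` along x and `kz` along z is equivalent to an inverse warp of viewing directions on the sphere:

```python
import numpy as np
import cv2

def pano_stretch_sketch(img, kx, kz):
    """Stretch an equirectangular panorama as if the scene were scaled
    by (kx, 1, kz) about the camera. Illustrative sketch only."""
    H, W = img.shape[:2]
    # Longitude/latitude of every target pixel's viewing direction.
    u = (np.arange(W) + 0.5) / W * 2 * np.pi - np.pi        # [-pi, pi)
    v = (np.arange(H) + 0.5) / H * np.pi - np.pi / 2        # [-pi/2, pi/2)
    lon, lat = np.meshgrid(u, v)
    # Unit direction for each target pixel.
    x = np.cos(lat) * np.sin(lon)
    y = np.sin(lat)
    z = np.cos(lat) * np.cos(lon)
    # A ray hitting the scaled scene corresponds to the source direction
    # obtained by undoing the scaling (atan2 normalizes implicitly).
    xs, ys, zs = x / kx, y, z / kz
    lon_s = np.arctan2(xs, zs)
    lat_s = np.arctan2(ys, np.sqrt(xs ** 2 + zs ** 2))
    # Source lon/lat -> source pixel coordinates, wrapped horizontally.
    map_x = (((lon_s + np.pi) / (2 * np.pi)) * W - 0.5) % W
    map_y = ((lat_s + np.pi / 2) / np.pi) * H - 0.5
    return cv2.remap(img, map_x.astype(np.float32), map_y.astype(np.float32),
                     cv2.INTER_LINEAR)
```

Layout corner annotations must be warped with the same mapping; the repo's implementation takes care of that for its training data.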

### Method overview
![](assets/pipeline.jpg)

### Installation
PyTorch installation is machine dependent; please install the correct version for your machine. The tested configuration is PyTorch 1.8.1 with Python 3.7.6.

Dependencies:

- numpy
- scipy
- sklearn
- Pillow
- tqdm
- tensorboardX
- opencv-python>=3.1 (for pre-processing)
- pylsd-nova
- open3d>=0.7 (for layout 3D viewer)
- shapely
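
Before running any of the scripts below, a quick sanity check of the environment can save some debugging (a small snippet using only standard version attributes):

```python
# Verify the core dependencies import and report their versions.
import torch, cv2, open3d, shapely

print('torch', torch.__version__, '| CUDA available:', torch.cuda.is_available())
print('opencv', cv2.__version__)
print('open3d', open3d.__version__, '| shapely', shapely.__version__)
```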

### Download
#### Dataset
- PanoContext/Stanford2D3D Dataset
  - [Download preprocessed pano/s2d3d](https://drive.google.com/open?id=1e-MuWRx3T4LJ8Bu4Dc0tKcSHF9Lk_66C) for training/validation/testing
  - Put all of them under the `data` directory so you get:
    ```
    HorizonNet/
    ├── data/
    │   └── layoutnet_dataset/
    │       ├── finetune_general/
    │       ├── test/
    │       ├── train/
    │       └── valid/
    ```
  - `test`, `train`, `valid` are processed from [LayoutNet's cuboid dataset](https://github.com/zouchuhang/LayoutNet).
  - `finetune_general` is re-annotated by us from `train` and `valid`. It contains 65 general-shaped rooms.
- Structured3D Dataset
  - See [the tutorial](https://github.com/sunset1995/HorizonNet/blob/master/README_ST3D.md#dataset-preparation) to prepare training/validation/testing for HorizonNet.
- Zillow Indoor Dataset
  - See [the tutorial](https://github.com/sunset1995/HorizonNet/blob/master/README_ZInD.md#dataset-preparation) to prepare training/validation/testing for HorizonNet.

#### Pretrained Models
Please download the pre-trained models [here](https://drive.google.com/drive/folders/1bgJspDogOHGdwXxCB8o3irU3_Gz9rTpK?usp=drive_link).
- `resnet50_rnn__panos2d3d.pth`
  - Trained on 817 PanoContext/Stanford2d3d pano images.
  - Trained for 300 epochs.
- `resnet50_rnn__st3d.pth`
  - Trained on 18362 Structured3D pano images.
  - Data setup: original furniture and lighting.
  - Trained for 50 epochs.
- `resnet50_rnn__zind.pth`
  - Trained on 20077 Zillow Indoor pano images.
  - Data setup: `layout_visible`, `is_primary`, `is_inside`, `is_ceiling_flat`.
  - Trained for 50 epochs.
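
If you want to peek inside a downloaded checkpoint before wiring it into `inference.py`, a generic PyTorch snippet is enough (the exact key layout of these files is not documented here, so treat the printout as exploratory):

```python
import torch

# Load on CPU so no GPU is needed just to inspect the file.
ckpt = torch.load('ckpt/resnet50_rnn__panos2d3d.pth', map_location='cpu')
# The layout is repo-specific; listing top-level keys shows whether the
# file bundles the state dict together with training arguments.
if isinstance(ckpt, dict):
    print(list(ckpt.keys()))
```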

## Inference on your images

In the explanation below, I will use `assets/demo.png` as the example.
- ![](assets/demo.png) (modified from the PanoContext dataset)

### 1. Pre-processing (align camera rotation pose)
- **Execution**: Pre-process the above `assets/demo.png` by running the command below.
```bash
python preprocess.py --img_glob assets/demo.png --output_dir assets/preprocessed/
```
  - `--img_glob`: path to your 360 room image(s).
    - Shell-style wildcards are supported when quoted (e.g. `"my_fasinated_img_dir/*png"`).
  - `--output_dir`: path to the directory for dumping the results.
  - See `python preprocess.py -h` for more detailed usage help.
- **Outputs**: Under the given `--output_dir`, you will get results like below, prefixed with the source image basename.
  - The aligned RGB image `[SOURCE BASENAME]_aligned_rgb.png` and line-segment image `[SOURCE BASENAME]_aligned_line.png`

    `demo_aligned_rgb.png` | `demo_aligned_line.png`
    :--------------------: | :---------------------:
    ![](assets/preprocessed/demo_aligned_rgb.png) | ![](assets/preprocessed/demo_aligned_line.png)
  - The detected vanishing points `[SOURCE BASENAME]_VP.txt` (here `demo_VP.txt`)
```
-0.002278 -0.500449 0.865763
0.000895 0.865764 0.500452
0.999999 -0.001137 0.000178
```
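
Each row of `demo_VP.txt` is one vanishing direction as a 3D unit vector, so it loads directly with numpy (illustrative snippet):

```python
import numpy as np

# Three rows, one unit-length vanishing direction (x, y, z) per row.
vp = np.loadtxt('assets/preprocessed/demo_VP.txt')
print(vp.shape)                    # (3, 3)
print(np.linalg.norm(vp, axis=1))  # each row has norm ~1.0
print(vp @ vp.T)                   # rows are mutually near-orthogonal
```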

### 2. Estimating layout with HorizonNet
- **Execution**: Predict the layout from the above aligned image and line segments by running the command below.
```bash
python inference.py --pth ckpt/resnet50_rnn__mp3d.pth --img_glob assets/preprocessed/demo_aligned_rgb.png --output_dir assets/inferenced --visualize
```
  - `--pth`: path to the trained model.
  - `--img_glob`: path to the preprocessed image.
  - `--output_dir`: path to the directory to dump results.
  - `--visualize`: optional; visualizes the model's raw outputs.
  - `--force_cuboid`: add this option to force a cuboid layout estimate (4 walls).
- **Outputs**: You will get results like below, prefixed with the source image basename.
  - The 1D representation is visualized under the file name `[SOURCE BASENAME].raw.png`.
  - The extracted layout corners `[SOURCE BASENAME].json`
```
{"z0": 50.0, "z1": -59.03114700317383, "uv": [[0.029913906008005142, 0.2996523082256317], [0.029913906008005142, 0.7240479588508606], [0.015625, 0.3819984495639801], [0.015625, 0.6348703503608704], [0.056027885526418686, 0.3881891965866089], [0.056027885526418686, 0.6278984546661377], [0.4480381906032562, 0.3970482349395752], [0.4480381906032562, 0.6178648471832275], [0.5995567440986633, 0.41122356057167053], [0.5995567440986633, 0.601679801940918], [0.8094607591629028, 0.36505699157714844], [0.8094607591629028, 0.6537724137306213], [0.8815288543701172, 0.2661873996257782], [0.8815288543701172, 0.7582473754882812], [0.9189453125, 0.31678876280784607], [0.9189453125, 0.7060701847076416]]}
```
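
The `uv` entries are normalized image coordinates in [0, 1]; judging from the sample above, corners come in ceiling/floor pairs that share the same `u`. A small parsing sketch under that reading (the image resolution here is an assumption):

```python
import json
import numpy as np

with open('assets/inferenced/demo_aligned_rgb.json') as f:
    layout = json.load(f)

uv = np.array(layout['uv'])   # (2N, 2) normalized coordinates
W, H = 1024, 512              # assumed equirectangular resolution
xy = uv * [W, H]              # convert to pixel coordinates
# Consecutive entries share the same u: even rows look like ceiling
# corners, odd rows the floor corners directly below them.
ceiling, floor = xy[0::2], xy[1::2]
print(len(floor), 'wall corners')
```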

### 3. Layout 3D Viewer
- **Execution**: Visualize the predicted layout in 3D with the command below.
```bash
python layout_viewer.py --img assets/preprocessed/demo_aligned_rgb.png --layout assets/inferenced/demo_aligned_rgb.json --ignore_ceiling
```
  - `--img`: path to the preprocessed image.
  - `--layout`: path to the JSON output from `inference.py`.
  - `--ignore_ceiling`: do not render the ceiling.
  - See `python layout_viewer.py -h` for usage help.
- **Outputs**: In the window, you can use the mouse and scroll wheel to change the viewport.
- ![](assets/demo_3d_layout.jpg)
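
If you would rather get the floor corners in metric 3D than go through the viewer, basic equirectangular geometry suffices. A sketch assuming the camera sits at a known height above the floor and that `v` grows downward (`floor_uv_to_xz` and the 1.6 m default are assumptions, not repo API):

```python
import numpy as np

def floor_uv_to_xz(uv, camera_height=1.6):
    """Project normalized floor-corner (u, v) onto the floor plane.

    Assumes an equirectangular image with u in [0, 1] spanning 360 degrees
    and v in [0, 1] spanning 180 degrees, v growing downward.
    """
    lon = (uv[:, 0] - 0.5) * 2 * np.pi     # [-pi, pi)
    lat = (uv[:, 1] - 0.5) * np.pi         # > 0 below the horizon
    dist = camera_height / np.tan(lat)     # ground distance to the wall
    return np.stack([dist * np.sin(lon), dist * np.cos(lon)], axis=1)
```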

## Your own dataset
See the [tutorial](README_PREPARE_DATASET.md) on how to prepare it.

## Training
To train on a dataset, see `python train.py -h` for a detailed explanation of the options.\
Example:
```bash
python train.py --id resnet50_rnn
```
- Important arguments:
  - `--id`: required. Experiment id used to name checkpoints and logs.
  - `--ckpt`: folder to output checkpoints (default: ./ckpt).
  - `--logs`: folder for logging (default: ./logs).
  - `--pth`: finetune mode if given; path to the saved checkpoint to load.
  - `--backbone`: backbone of the network (default: resnet50).
    - Other options: `{resnet18,resnet34,resnet50,resnet101,resnet152,resnext50_32x4d,resnext101_32x8d,densenet121,densenet169,densenet161,densenet201}`
  - `--no_rnn`: whether to remove the RNN (default: False).
  - `--train_root_dir`: root directory of the training dataset (default: `data/layoutnet_dataset/train`).
  - `--valid_root_dir`: root directory of the validation dataset (default: `data/layoutnet_dataset/valid/`).
    - If given, the epoch with the best 3D IoU on the validation set is saved as `{ckpt}/{id}/best_valid.pth`.
  - `--batch_size_train`: training mini-batch size (default: 4).
  - `--epochs`: epochs to train (default: 300).
  - `--lr`: learning rate (default: 0.0001).
  - `--device`: set the CUDA device by id (not to be used together with `--multi_gpu`).
  - `--multi_gpu`: enable parallel computing on all available GPUs.

## Quantitative Evaluation - Cuboid Layout
To evaluate on the PanoContext/Stanford2d3d dataset, first run the cuboid-trained model on all testing images:
```bash
python inference.py --pth ckpt/resnet50_rnn__panos2d3d.pth --img_glob "data/layoutnet_dataset/test/img/*" --output_dir output/panos2d3d/resnet50_rnn/ --force_cuboid
```
- `--img_glob`: shell-style wildcard matching all testing images.
- `--output_dir`: path to the directory to dump results.
- `--force_cuboid`: enforce a cuboid layout output (4 walls); otherwise pixel error (PE) and corner error (CE) cannot be evaluated.

To get the quantitative result:
```bash
python eval_cuboid.py --dt_glob "output/panos2d3d/resnet50_rnn/*json" --gt_glob "data/layoutnet_dataset/test/label_cor/*txt"
```
- `--dt_glob`: shell-style wildcard matching all model estimations.
- `--gt_glob`: shell-style wildcard matching all ground-truth files.

If you want to:
- evaluate PanoContext only: `python eval_cuboid.py --dt_glob "output/panos2d3d/resnet50_rnn/*json" --gt_glob "data/layoutnet_dataset/test/label_cor/pano*txt"`
- evaluate Stanford2d3d only: `python eval_cuboid.py --dt_glob "output/panos2d3d/resnet50_rnn/*json" --gt_glob "data/layoutnet_dataset/test/label_cor/camera*txt"`
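
Under the hood, 2D IoU reduces to polygon intersection over union on the floor plane. Since `shapely` is already a dependency, here is a minimal sketch of that metric (illustrative, not `eval_cuboid.py` itself):

```python
from shapely.geometry import Polygon

def layout_2d_iou(corners_dt, corners_gt):
    """2D IoU of two floor-plane polygons, each given as (N, 2) corners."""
    dt, gt = Polygon(corners_dt), Polygon(corners_gt)
    union = dt.union(gt).area
    return dt.intersection(gt).area / union if union > 0 else 0.0

# Two overlapping rectangles: intersection 9, union 15 -> IoU 0.6.
print(layout_2d_iou([(0, 0), (4, 0), (4, 3), (0, 3)],
                    [(1, 0), (5, 0), (5, 3), (1, 3)]))
```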

:clipboard: The quantitative result for the released `resnet50_rnn__panos2d3d.pth` is shown below:

| Testing Dataset | 3D IoU(%) | Corner error(%) | Pixel error(%) |
| :-------------: | :-------: | :------: | :--------------: |
| PanoContext | `83.39` | `0.76` | `2.13` |
| Stanford2D3D | `84.09` | `0.63` | `2.06` |
| All | `83.87` | `0.67` | `2.08` |

## Quantitative Evaluation - General Layout
- See [the report :clipboard: on ST3D](README_ST3D.md) for more detail.
- See [the report :clipboard: on MP3D](README_MP3D.md) for more detail.

## TODO
- Faster pre-processing script (top-front alignment) (maybe a Cython implementation or [fernandez2018layouts](https://github.com/cfernandezlab/Lines-and-Vanishing-Points-directly-on-Panoramas))

## Acknowledgement
- Credit of this repo is shared with [ChiWeiHsiao](https://github.com/ChiWeiHsiao).
- Thanks to [limchaos](https://github.com/limchaos) for pointing out the potential speedup from fixing the unexpected behaviour of the PyTorch dataloader. (See [Issue#4](https://github.com/sunset1995/HorizonNet/issues/4).)

## Citation
```
@inproceedings{SunHSC19,
  author    = {Cheng Sun and
               Chi{-}Wei Hsiao and
               Min Sun and
               Hwann{-}Tzong Chen},
  title     = {HorizonNet: Learning Room Layout With 1D Representation and Pano Stretch
               Data Augmentation},
  booktitle = {{IEEE} Conference on Computer Vision and Pattern Recognition, {CVPR}
               2019, Long Beach, CA, USA, June 16-20, 2019},
  pages     = {1047--1056},
  year      = {2019},
}
```