#### DiverseDepth Project
This project aims to improve the generalization ability of monocular depth estimation methods to diverse scenes. We propose a learning method and a diverse dataset, termed DiverseDepth, to solve this problem. The contents of [DiverseDepth](https://arxiv.org/abs/2002.00569) have been published in the TPAMI version of our "Virtual Normal" paper.

This repository contains the source code of our paper (the DiverseDepth part):
1. [Wei Yin, Yifan Liu, Chunhua Shen. Virtual Normal: Enforcing Geometric Constraints for Accurate and Robust Depth Prediction](https://arxiv.org/abs/2103.04216).
2. [Wei Yin, Xinlong Wang, Chunhua Shen, Yifan Liu, Zhi Tian, Songcen Xu, Changming Sun, Dou Renyin. DiverseDepth: Affine-invariant Depth Prediction Using Diverse Data](https://arxiv.org/abs/2002.00569).

Training code has been released!

## Some Results

![Any images online](./examples/any_imgs.jpg)
![Point cloud](./examples/pcd.png)

## Some Dataset Examples
![Dataset](./examples/dataset_examples.png)

****
## Highlights
- **Generalization:** Our method demonstrates strong generalization ability on several zero-shot datasets. The predicted depth is affine-invariant, i.e. correct up to an unknown scale and shift; see the alignment sketch below.
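
Affine-invariant predictions can be compared against metric ground truth by first solving for the unknown scale and shift in a least-squares sense. A minimal numpy sketch of that alignment (the function name is ours, not part of this repo):

```python
import numpy as np

def align_scale_shift(pred, gt, mask=None):
    """Solve min over (s, t) of ||s * pred + t - gt||^2 on valid pixels."""
    if mask is None:
        mask = gt > 0  # treat non-positive ground-truth depth as invalid
    p = pred[mask].ravel()
    g = gt[mask].ravel()
    A = np.stack([p, np.ones_like(p)], axis=1)  # columns: [pred, 1]
    (s, t), *_ = np.linalg.lstsq(A, g, rcond=None)
    return s * pred + t

# Sanity check: a prediction that is a scaled/shifted copy of the ground
# truth should align back to it exactly.
gt = np.random.rand(4, 4) + 0.5
pred = 2.0 * gt - 0.3
assert np.allclose(align_scale_shift(pred, gt), gt)
```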

****
## Installation
- Please refer to [Installation](./Installation.md).

## Datasets
We collect multi-source data to construct our DiverseDepth dataset. It consists of three parts:
- Part-in (collected from Taskonomy): over 100K images
- Part-out (collected from DIML; we have reprocessed its disparity): over 120K images
- Part-fore (collected from web stereo images and videos): 109,703 images

We used the [GANet](https://github.com/feihuzhang/GANet) method to recompute the disparity of the DIML data instead of using the originally provided disparity maps (a sketch of the disparity-to-depth conversion follows).
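
For stereo-derived data like DIML, depth follows from disparity via depth = focal length × baseline / disparity. A small illustrative sketch (the focal length and baseline below are placeholders, not DIML calibration values):

```python
import numpy as np

def disparity_to_depth(disparity, focal_px, baseline_m, eps=1e-6):
    """Convert a disparity map (pixels) to depth (meters).

    Pixels with non-positive disparity are marked invalid (depth 0).
    """
    valid = disparity > eps
    depth = np.zeros_like(disparity, dtype=np.float64)
    depth[valid] = focal_px * baseline_m / disparity[valid]
    return depth

# Placeholder calibration, for illustration only
disp = np.array([[32.0, 16.0], [8.0, 0.0]])
print(disparity_to_depth(disp, focal_px=1000.0, baseline_m=0.1))
# depths: 3.125, 6.25, 12.5, and 0.0 (invalid pixel)
```
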
We provide two ways to download the data.
1) Download from Cloudstor by running:

```bash
sh download_data.sh
```
2) Download from Google Drive. See [here](./download_data_google_drive.md) for details.

## Quick Start (Inference)

1. Download the model weights
* [ResNeXt50 backbone](https://cloudstor.aarnet.edu.au/plus/s/ixWf3nTJFZ0YE4q)
2. Prepare data.
* Move the downloaded weights to the repository root (`/`).
* Put the testing RGB images in `/Minist_Test/test_images/`. Predicted depths and the reconstructed point clouds are saved under `/Minist_Test/test_images/outputs` (the back-projection behind the point cloud is sketched after step 3).

3. Test monocular depth prediction. Note that the predicted depths are affine-invariant.
```bash
export PYTHONPATH=""
# run with the ResNeXt-50 backbone
python ./Minist_Test/tools/test_depth.py --load_ckpt model.pth
```
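
The reconstructed point cloud comes from back-projecting the predicted depth map with pinhole camera intrinsics. A minimal sketch of that back-projection, with assumed placeholder intrinsics (not the values `test_depth.py` actually uses):

```python
import numpy as np

def depth_to_point_cloud(depth, fx, fy, cx, cy):
    """Back-project an HxW depth map into an (N, 3) point cloud.

    Standard pinhole model: X = (u - cx) * Z / fx, Y = (v - cy) * Z / fy.
    """
    h, w = depth.shape
    u, v = np.meshgrid(np.arange(w), np.arange(h))  # pixel coordinates
    z = depth
    x = (u - cx) * z / fx
    y = (v - cy) * z / fy
    pts = np.stack([x, y, z], axis=-1).reshape(-1, 3)
    return pts[pts[:, 2] > 0]  # drop invalid (zero-depth) pixels

# Assumed intrinsics for a 640x480 image, for illustration only
depth = np.full((480, 640), 2.0)
cloud = depth_to_point_cloud(depth, fx=500.0, fy=500.0, cx=320.0, cy=240.0)
print(cloud.shape)  # (307200, 3)
```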

## Training

1. Download the ResNeXt pretrained weight and put it under `Train/datasets/resnext_pretrain`
* [ResNeXt50](https://cloudstor.aarnet.edu.au/plus/s/J87DYsTlOjD83LR)
2. Download the training data. Refer to `download_data.sh`. All data are organized under `Train/datasets`. The directory structure is as follows.
```
|--Train
|--data
|--tools
|--scripts
|--datasets
| |--DiverseDepth
| | |--annotations
| | |--depths
| | |--rgbs
| |--taskonomy
| | |--annotations
| | |--depths
| | |--rgbs
| | |--ins_planes
| |--DIML_GANet
| | |--annotations
| | |--depth
| | |--rgb
| | |--sky_mask
| |--resnext_pretrain
| | |--resnext50_32x4d.pth
```
3. Train the network. The default setting uses 4 GPUs. If you want to use more GPUs, set `$CUDA_VISIBLE_DEVICES`, e.g. `export CUDA_VISIBLE_DEVICES=0,1,2,3,4,5,6,7`.
The `--batchsize` flag sets the number of samples per GPU, so the effective batch size scales with the number of visible devices (see the sketch after the command).
```bash
cd Train/scripts
sh train.sh
```
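
Because `--batchsize` is per GPU, the effective batch size is `batchsize × number of visible GPUs`. A quick PyTorch check of this relationship (the per-GPU value of 8 is only an example):

```python
import os
os.environ.setdefault("CUDA_VISIBLE_DEVICES", "0,1,2,3")  # set before CUDA init

import torch

num_gpus = torch.cuda.device_count()  # GPUs visible to this process
per_gpu_batch = 8                     # example value passed as --batchsize
print("effective batch size:", per_gpu_batch * max(num_gpus, 1))
```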

4. Test the network on a benchmark. We provide sample code for testing on NYU. Please download the NYU testing data `test.mat` for evaluation. To test on other benchmarks, follow this sample code; the standard metrics are sketched after the command.
```bash
cd Train/scripts
sh test.sh
```
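
For reference, monocular-depth benchmarks such as NYU are commonly evaluated with absolute relative error (AbsRel) and the δ < 1.25 threshold accuracy. A minimal numpy sketch of these standard metrics (our helper, not the repo's evaluation code; affine-invariant predictions should be scale-and-shift aligned first, as in the Highlights sketch):

```python
import numpy as np

def depth_metrics(pred, gt):
    """AbsRel and delta<1.25 accuracy over valid (positive-depth) pixels."""
    mask = (gt > 0) & (pred > 0)
    p, g = pred[mask], gt[mask]
    abs_rel = np.mean(np.abs(p - g) / g)          # mean relative error
    ratio = np.maximum(p / g, g / p)              # symmetric ratio
    delta1 = np.mean(ratio < 1.25)                # fraction within 25%
    return abs_rel, delta1

# Toy example: predictions close to ground truth
pred = np.array([[1.0, 2.2], [3.1, 4.0]])
gt = np.array([[1.0, 2.0], [3.0, 4.2]])
print(depth_metrics(pred, gt))  # small AbsRel, delta1 == 1.0
```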

### Citation
```
@article{yin2021virtual,
title={Virtual Normal: Enforcing Geometric Constraints for Accurate and Robust Depth Prediction},
author={Yin, Wei and Liu, Yifan and Shen, Chunhua},
journal={IEEE Transactions on Pattern Analysis and Machine Intelligence (TPAMI)},
year={2021}
}
```
### Contact
Wei Yin: [email protected]