Ecosyste.ms: Awesome
An open API service indexing awesome lists of open source software.
https://github.com/baowenbo/DAIN
Depth-Aware Video Frame Interpolation (CVPR 2019)
https://github.com/baowenbo/DAIN
Last synced: 3 months ago
JSON representation
Depth-Aware Video Frame Interpolation (CVPR 2019)
- Host: GitHub
- URL: https://github.com/baowenbo/DAIN
- Owner: baowenbo
- License: mit
- Created: 2019-03-22T02:37:19.000Z (almost 6 years ago)
- Default Branch: master
- Last Pushed: 2023-02-13T12:40:12.000Z (almost 2 years ago)
- Last Synced: 2024-10-29T15:29:30.246Z (3 months ago)
- Language: Python
- Homepage: https://sites.google.com/view/wenbobao/dain
- Size: 281 KB
- Stars: 8,222
- Watchers: 187
- Forks: 841
- Open Issues: 76
-
Metadata Files:
- Readme: README.md
- License: LICENSE
Awesome Lists containing this project
- awesome-mad - DAIN - Depth-Aware Video Frame Interpolation (CVPR 2019). (插帧 / 特效/实用工具)
- awesome-generative-ai - baowenbo/DAIN - Aware Video Frame Interpolation (CVPR 2019) (Image Segmentation / Creative Uses of Generative AI Image Synthesis Tools)
- StarryDivineSky - baowenbo/DAIN
README
# DAIN (Depth-Aware Video Frame Interpolation)
[Project](https://sites.google.com/view/wenbobao/dain) **|** [Paper](http://arxiv.org/abs/1904.00830)[Wenbo Bao](https://sites.google.com/view/wenbobao/home),
[Wei-Sheng Lai](http://graduatestudents.ucmerced.edu/wlai24/),
[Chao Ma](https://sites.google.com/site/chaoma99/),
Xiaoyun Zhang,
Zhiyong Gao,
and [Ming-Hsuan Yang](http://faculty.ucmerced.edu/mhyang/)IEEE Conference on Computer Vision and Pattern Recognition, Long Beach, CVPR 2019
This work is developed based on our TPAMI work [MEMC-Net](https://github.com/baowenbo/MEMC-Net), where we propose the adaptive warping layer. Please also consider referring to it.
### Table of Contents
1. [Introduction](#introduction)
1. [Citation](#citation)
1. [Requirements and Dependencies](#requirements-and-dependencies)
1. [Installation](#installation)
1. [Testing Pre-trained Models](#testing-pre-trained-models)
1. [Downloading Results](#downloading-results)
1. [Slow-motion Generation](#slow-motion-generation)
1. [Training New Models](#training-new-models)
1. [Google Colab Demo](#google-colab-demo)### Introduction
We propose the **D**epth-**A**ware video frame **IN**terpolation (**DAIN**) model to explicitly detect the occlusion by exploring the depth cue.
We develop a depth-aware flow projection layer to synthesize intermediate flows that preferably sample closer objects than farther ones.
Our method achieves state-of-the-art performance on the Middlebury dataset.
We provide videos [here](https://www.youtube.com/watch?v=-f8f0igQi5I&t=5s).
### Citation
If you find the code and datasets useful in your research, please cite:@inproceedings{DAIN,
author = {Bao, Wenbo and Lai, Wei-Sheng and Ma, Chao and Zhang, Xiaoyun and Gao, Zhiyong and Yang, Ming-Hsuan},
title = {Depth-Aware Video Frame Interpolation},
booktitle = {IEEE Conference on Computer Vision and Pattern Recognition},
year = {2019}
}
@article{MEMC-Net,
title={MEMC-Net: Motion Estimation and Motion Compensation Driven Neural Network for Video Interpolation and Enhancement},
author={Bao, Wenbo and Lai, Wei-Sheng, and Zhang, Xiaoyun and Gao, Zhiyong and Yang, Ming-Hsuan},
journal={IEEE Transactions on Pattern Analysis and Machine Intelligence},
doi={10.1109/TPAMI.2019.2941941},
year={2018}
}### Requirements and Dependencies
- Ubuntu (We test with Ubuntu = 16.04.5 LTS)
- Python (We test with Python = 3.6.8 in Anaconda3 = 4.1.1)
- Cuda & Cudnn (We test with Cuda = 9.0 and Cudnn = 7.0)
- PyTorch (The customized depth-aware flow projection and other layers require ATen API in PyTorch = 1.0.0)
- GCC (Compiling PyTorch 1.0.0 extension files (.c/.cu) requires gcc = 4.9.1 and nvcc = 9.0 compilers)
- NVIDIA GPU (We use Titan X (Pascal) with compute = 6.1, but we support compute_50/52/60/61 devices, should you have devices with higher compute capability, please revise [this](https://github.com/baowenbo/DAIN/blob/master/my_package/DepthFlowProjection/setup.py))### Installation
Download repository:$ git clone https://github.com/baowenbo/DAIN.git
Before building Pytorch extensions, be sure you have `pytorch >= 1.0.0`:
$ python -c "import torch; print(torch.__version__)"
Generate our PyTorch extensions:
$ cd DAIN
$ cd my_package
$ ./build.shGenerate the Correlation package required by [PWCNet](https://github.com/NVlabs/PWC-Net/tree/master/PyTorch/external_packages/correlation-pytorch-master):
$ cd ../PWCNet/correlation_package_pytorch1_0
$ ./build.sh### Testing Pre-trained Models
Make model weights dir and Middlebury dataset dir:$ cd DAIN
$ mkdir model_weights
$ mkdir MiddleBurySet
Download pretrained models,$ cd model_weights
$ wget http://vllab1.ucmerced.edu/~wenbobao/DAIN/best.pth
and Middlebury dataset:
$ cd ../MiddleBurySet
$ wget http://vision.middlebury.edu/flow/data/comp/zip/other-color-allframes.zip
$ unzip other-color-allframes.zip
$ wget http://vision.middlebury.edu/flow/data/comp/zip/other-gt-interp.zip
$ unzip other-gt-interp.zip
$ cd ..preinstallations:
$ cd PWCNet/correlation_package_pytorch1_0
$ sh build.sh
$ cd ../my_package
$ sh build.sh
$ cd ..We are good to go by:
$ CUDA_VISIBLE_DEVICES=0 python demo_MiddleBury.py
The interpolated results are under `MiddleBurySet/other-result-author/[random number]/`, where the `random number` is used to distinguish different runnings.
### Downloading Results
Our DAIN model achieves the state-of-the-art performance on the UCF101, Vimeo90K, and Middlebury ([*eval*](http://vision.middlebury.edu/flow/eval/results/results-n1.php) and *other*).
Download our interpolated results with:
$ wget http://vllab1.ucmerced.edu/~wenbobao/DAIN/UCF101_DAIN.zip
$ wget http://vllab1.ucmerced.edu/~wenbobao/DAIN/Vimeo90K_interp_DAIN.zip
$ wget http://vllab1.ucmerced.edu/~wenbobao/DAIN/Middlebury_eval_DAIN.zip
$ wget http://vllab1.ucmerced.edu/~wenbobao/DAIN/Middlebury_other_DAIN.zip
### Slow-motion Generation
Our model is fully capable of generating slow-motion effect with minor modification on the network architecture.
Run the following code by specifying `time_step = 0.25` to generate x4 slow-motion effect:$ CUDA_VISIBLE_DEVICES=0 python demo_MiddleBury_slowmotion.py --netName DAIN_slowmotion --time_step 0.25
or set `time_step` to `0.125` or `0.1` as follows
$ CUDA_VISIBLE_DEVICES=0 python demo_MiddleBury_slowmotion.py --netName DAIN_slowmotion --time_step 0.125
$ CUDA_VISIBLE_DEVICES=0 python demo_MiddleBury_slowmotion.py --netName DAIN_slowmotion --time_step 0.1
to generate x8 and x10 slow-motion respectively. Or if you would like to have x100 slow-motion for a little fun.
$ CUDA_VISIBLE_DEVICES=0 python demo_MiddleBury_slowmotion.py --netName DAIN_slowmotion --time_step 0.01You may also want to create gif animations by:
$ cd MiddleBurySet/other-result-author/[random number]/Beanbags
$ convert -delay 1 *.png -loop 0 Beanbags.gif //1*10ms delayHave fun and enjoy yourself!
### Training New Models
Download the Vimeo90K triplet dataset for video frame interpolation task, also see [here](https://github.com/anchen1011/toflow/blob/master/download_dataset.sh) by [Xue et al., IJCV19](https://arxiv.org/abs/1711.09078).
$ cd DAIN
$ mkdir /path/to/your/dataset & cd /path/to/your/dataset
$ wget http://data.csail.mit.edu/tofu/dataset/vimeo_triplet.zip
$ unzip vimeo_triplet.zip
$ rm vimeo_triplet.zipDownload the pretrained MegaDepth and PWCNet models
$ cd MegaDepth/checkpoints/test_local
$ wget http://vllab1.ucmerced.edu/~wenbobao/DAIN/best_generalization_net_G.pth
$ cd ../../../PWCNet
$ wget http://vllab1.ucmerced.edu/~wenbobao/DAIN/pwc_net.pth.tar
$ cd ..
Run the training script:$ CUDA_VISIBLE_DEVICES=0 python train.py --datasetPath /path/to/your/dataset --batch_size 1 --save_which 1 --lr 0.0005 --rectify_lr 0.0005 --flow_lr_coe 0.01 --occ_lr_coe 0.0 --filter_lr_coe 1.0 --ctx_lr_coe 1.0 --alpha 0.0 1.0 --patience 4 --factor 0.2
The optimized models will be saved to the `model_weights/[random number]` directory, where [random number] is generated for different runs.Replace the pre-trained `model_weights/best.pth` model with the newly trained `model_weights/[random number]/best.pth` model.
Then test the new model by executing:$ CUDA_VISIBLE_DEVICES=0 python demo_MiddleBury.py
### Google Colab Demo
This is a modification of DAIN that allows the usage of Google Colab and is able to do a full demo interpolation from a source video to a target video.Original Notebook File by btahir can be found [here](https://github.com/baowenbo/DAIN/issues/44).
To use the Colab, follow these steps:
- Download the `Colab_DAIN.ipynb` file ([link](https://raw.githubusercontent.com/baowenbo/DAIN/master/Colab_DAIN.ipynb)).
- Visit Google Colaboratory ([link](https://colab.research.google.com/))
- Select the "Upload" option, and upload the `.ipynb` file
- Start running the cells one by one, following the instructions.Colab file authors: [Styler00Dollar](https://github.com/styler00dollar) and [Alpha](https://github.com/AlphaGit).
### Contact
[Wenbo Bao](mailto:[email protected]); [Wei-Sheng (Jason) Lai](mailto:[email protected])### License
See [MIT License](https://github.com/baowenbo/DAIN/blob/master/LICENSE)