{"id":13512263,"url":"https://github.com/baowenbo/DAIN","last_synced_at":"2025-03-30T22:32:18.900Z","repository":{"id":41086471,"uuid":"177059049","full_name":"baowenbo/DAIN","owner":"baowenbo","description":"Depth-Aware Video Frame Interpolation (CVPR 2019)","archived":false,"fork":false,"pushed_at":"2023-02-13T12:40:12.000Z","size":288,"stargazers_count":8292,"open_issues_count":76,"forks_count":841,"subscribers_count":184,"default_branch":"master","last_synced_at":"2025-03-27T12:08:26.607Z","etag":null,"topics":[],"latest_commit_sha":null,"homepage":"https://sites.google.com/view/wenbobao/dain","language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/baowenbo.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null}},"created_at":"2019-03-22T02:37:19.000Z","updated_at":"2025-03-27T03:11:59.000Z","dependencies_parsed_at":"2023-10-20T18:17:51.543Z","dependency_job_id":null,"html_url":"https://github.com/baowenbo/DAIN","commit_stats":null,"previous_names":[],"tags_count":0,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/baowenbo%2FDAIN","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/baowenbo%2FDAIN/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/baowenbo%2FDAIN/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/baowenbo%2FDAIN/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/baowenbo","download_url":"https://codeload.github.com/baowenbo/DAIN/tar.gz/refs/heads/master","host":{"name":"GitHub","url":"https://github.com","kind":"githu
b","repositories_count":246390887,"owners_count":20769475,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":[],"created_at":"2024-08-01T03:01:39.700Z","updated_at":"2025-03-30T22:32:15.917Z","avatar_url":"https://github.com/baowenbo.png","language":"Python","funding_links":[],"categories":["Python","HarmonyOS","Uncategorized","插帧","Image Segmentation","视频生成、补帧、摘要"],"sub_categories":["Windows Manager","Uncategorized","特效/实用工具","Creative Uses of Generative AI Image Synthesis Tools","网络服务_其他"],"readme":"# DAIN (Depth-Aware Video Frame Interpolation)\n[Project](https://sites.google.com/view/wenbobao/dain) **|** [Paper](http://arxiv.org/abs/1904.00830)\n\n[Wenbo Bao](https://sites.google.com/view/wenbobao/home),\n[Wei-Sheng Lai](http://graduatestudents.ucmerced.edu/wlai24/), \n[Chao Ma](https://sites.google.com/site/chaoma99/),\nXiaoyun Zhang, \nZhiyong Gao, \nand [Ming-Hsuan Yang](http://faculty.ucmerced.edu/mhyang/)\n\nIEEE Conference on Computer Vision and Pattern Recognition, Long Beach, CVPR 2019\n\nThis work is developed based on our TPAMI work [MEMC-Net](https://github.com/baowenbo/MEMC-Net), where we propose the adaptive warping layer. Please also consider referring to it.\n\n### Table of Contents\n1. [Introduction](#introduction)\n1. [Citation](#citation)\n1. [Requirements and Dependencies](#requirements-and-dependencies)\n1. [Installation](#installation)\n1. [Testing Pre-trained Models](#testing-pre-trained-models)\n1. [Downloading Results](#downloading-results)\n1. [Slow-motion Generation](#slow-motion-generation)\n1. [Training New Models](#training-new-models)\n1. 
[Google Colab Demo](#google-colab-demo)\n\n### Introduction\nWe propose the **D**epth-**A**ware video frame **IN**terpolation (**DAIN**) model to explicitly detect occlusion by exploring depth cues.\nWe develop a depth-aware flow projection layer to synthesize intermediate flows that preferentially sample closer objects over farther ones.\nOur method achieves state-of-the-art performance on the Middlebury dataset. \nWe provide videos [here](https://www.youtube.com/watch?v=-f8f0igQi5I\u0026t=5s).\n\n\u003c!--![teaser](http://vllab.ucmerced.edu/wlai24/LapSRN/images/emma_text.gif)--\u003e\n\n\u003c!--[![teaser](https://img.youtube.com/vi/icJ0WbPsE20/0.jpg)](https://www.youtube.com/watch?v=icJ0WbPsE20\u0026feature=youtu.be)\n\u003c!--\u003ciframe width=\"560\" height=\"315\" src=\"https://www.youtube.com/embed/icJ0WbPsE20\" frameborder=\"0\" allow=\"accelerometer; autoplay; encrypted-media; gyroscope; picture-in-picture\" allowfullscreen\u003e\u003c/iframe\u003e\n![teaser](http://vllab1.ucmerced.edu/~wenbobao/DAIN/kart-turn_compare.gif)\n\n\n\u003c!--Haha, I am a comment and will not be displayed in the browser.\nBeanbags\nhttps://drive.google.com/open?id=170vdxANGoNKO5_8MYOuiDvoIXzucv7HW\nDimetrodon\nhttps://drive.google.com/open?id=14n7xvb9hjTKqfcr7ZpEFyfMvx6E8NhD_\nDogDance\nhttps://drive.google.com/open?id=1YWAyAJ3T48fMFv2K8j8wIVcmQm39cRof\nGrove2\nhttps://drive.google.com/open?id=1sJLwdQdL6JYXSQo_Bev0aQMleWacxCsN\nGrove3\nhttps://drive.google.com/open?id=1jGj3UdGppoJO02Of8ZaNXqDH4fnXuQ8O\nHydrangea\nhttps://drive.google.com/open?id=1_4kVlhvrmCv54aXi7vZMk3-FtRQF7s0s\nMiniCooper\nhttps://drive.google.com/open?id=1pWHtyBSZsOTC7NTVdHTrv1W-dxa95BLo\nRubberWhale\nhttps://drive.google.com/open?id=1korbXsGpSgJn7THBHkLRVrJMtCt5YZPB\nUrban2\nhttps://drive.google.com/open?id=1v57RMm9x5vM36mCgPy5hresXDZWtw3Vs\nUrban3\nhttps://drive.google.com/open?id=1LMwSU0PrG4_GaDjWRI2v9hvWpYwzRKca\nVenus\nhttps://drive.google.com/open?id=1piPnEexuHaiAr4ZzWSAxGi1u1Xo_6vPp\nWalking\nhttps://drive.google.com/open?id=1CgCLmVC_WTVTAcA_
IdWbLqR8MS18zHoa\n--\u003e\n\n\u003cp float=\"middle\"\u003e\n\u003cimg src=\"https://drive.google.com/uc?export=view\u0026id=1YWAyAJ3T48fMFv2K8j8wIVcmQm39cRof\" width=\"200\"/\u003e\n\u003cimg src=\"https://drive.google.com/uc?export=view\u0026id=1CgCLmVC_WTVTAcA_IdWbLqR8MS18zHoa\" width=\"200\"/\u003e\n\u003cimg src=\"https://drive.google.com/uc?export=view\u0026id=1pWHtyBSZsOTC7NTVdHTrv1W-dxa95BLo\" width=\"200\"/\u003e\n\u003cimg src=\"https://drive.google.com/uc?export=view\u0026id=170vdxANGoNKO5_8MYOuiDvoIXzucv7HW\" width=\"200\"/\u003e\n\u003c/p\u003e\n\n\u003cp float=\"middle\"\u003e\n\u003cimg src=\"https://drive.google.com/uc?export=view\u0026id=1sJLwdQdL6JYXSQo_Bev0aQMleWacxCsN\" width=\"200\"/\u003e\n\u003cimg src=\"https://drive.google.com/uc?export=view\u0026id=1jGj3UdGppoJO02Of8ZaNXqDH4fnXuQ8O\" width=\"200\"/\u003e\n\u003cimg src=\"https://drive.google.com/uc?export=view\u0026id=1v57RMm9x5vM36mCgPy5hresXDZWtw3Vs\" width=\"200\"/\u003e\n\u003cimg src=\"https://drive.google.com/uc?export=view\u0026id=1LMwSU0PrG4_GaDjWRI2v9hvWpYwzRKca\" width=\"200\"/\u003e\n\u003c/p\u003e\n\n\u003cp float=\"middle\"\u003e\n\u003cimg src=\"https://drive.google.com/uc?export=view\u0026id=1piPnEexuHaiAr4ZzWSAxGi1u1Xo_6vPp\" width=\"200\"/\u003e\n\u003cimg src=\"https://drive.google.com/uc?export=view\u0026id=1korbXsGpSgJn7THBHkLRVrJMtCt5YZPB\" width=\"200\"/\u003e\n\u003cimg src=\"https://drive.google.com/uc?export=view\u0026id=1_4kVlhvrmCv54aXi7vZMk3-FtRQF7s0s\" width=\"200\"/\u003e\n\u003cimg src=\"https://drive.google.com/uc?export=view\u0026id=14n7xvb9hjTKqfcr7ZpEFyfMvx6E8NhD_\" width=\"200\"/\u003e\n\u003c/p\u003e\n\n### Citation\nIf you find the code and datasets useful in your research, please cite:\n\n    @inproceedings{DAIN,\n        author    = {Bao, Wenbo and Lai, Wei-Sheng and Ma, Chao and Zhang, Xiaoyun and Gao, Zhiyong and Yang, Ming-Hsuan}, \n        title     = {Depth-Aware Video Frame Interpolation}, \n        booktitle = {IEEE Conference on Computer 
Vision and Pattern Recognition},\n        year      = {2019}\n    }\n    @article{MEMC-Net,\n         title={MEMC-Net: Motion Estimation and Motion Compensation Driven Neural Network for Video Interpolation and Enhancement},\n         author={Bao, Wenbo and Lai, Wei-Sheng and Zhang, Xiaoyun and Gao, Zhiyong and Yang, Ming-Hsuan},\n         journal={IEEE Transactions on Pattern Analysis and Machine Intelligence},\n         doi={10.1109/TPAMI.2019.2941941},\n         year={2018}\n    }\n\n### Requirements and Dependencies\n- Ubuntu (We test with Ubuntu = 16.04.5 LTS)\n- Python (We test with Python = 3.6.8 in Anaconda3 = 4.1.1)\n- CUDA \u0026 cuDNN (We test with CUDA = 9.0 and cuDNN = 7.0)\n- PyTorch (The customized depth-aware flow projection and other layers require the ATen API in PyTorch = 1.0.0)\n- GCC (Compiling PyTorch 1.0.0 extension files (.c/.cu) requires gcc = 4.9.1 and nvcc = 9.0 compilers)\n- NVIDIA GPU (We use Titan X (Pascal) with compute = 6.1, but we support compute_50/52/60/61 devices. If your device has a higher compute capability, please revise [this](https://github.com/baowenbo/DAIN/blob/master/my_package/DepthFlowProjection/setup.py))\n\n### Installation\nDownload the repository:\n\n    $ git clone https://github.com/baowenbo/DAIN.git\n\nBefore building the PyTorch extensions, be sure you have `pytorch \u003e= 1.0.0`:\n    \n    $ python -c \"import torch; print(torch.__version__)\"\n    \nGenerate our PyTorch extensions:\n    \n    $ cd DAIN\n    $ cd my_package \n    $ ./build.sh\n\nGenerate the Correlation package required by [PWCNet](https://github.com/NVlabs/PWC-Net/tree/master/PyTorch/external_packages/correlation-pytorch-master):\n    \n    $ cd ../PWCNet/correlation_package_pytorch1_0\n    $ ./build.sh\n\n\n### Testing Pre-trained Models\nCreate the model weights and Middlebury dataset directories:\n\n    $ cd DAIN\n    $ mkdir model_weights\n    $ mkdir MiddleBurySet\n    \nDownload pretrained models, \n\n    $ cd model_weights\n    $ wget 
http://vllab1.ucmerced.edu/~wenbobao/DAIN/best.pth\n    \nand the Middlebury dataset:\n    \n    $ cd ../MiddleBurySet\n    $ wget http://vision.middlebury.edu/flow/data/comp/zip/other-color-allframes.zip\n    $ unzip other-color-allframes.zip\n    $ wget http://vision.middlebury.edu/flow/data/comp/zip/other-gt-interp.zip\n    $ unzip other-gt-interp.zip\n    $ cd ..\n\nBuild the extension packages before running the demo:\n\n    $ cd PWCNet/correlation_package_pytorch1_0\n    $ sh build.sh\n    $ cd ../../my_package\n    $ sh build.sh\n    $ cd ..\n\nNow run the demo:\n\n    $ CUDA_VISIBLE_DEVICES=0 python demo_MiddleBury.py\n\nThe interpolated results are under `MiddleBurySet/other-result-author/[random number]/`, where the `random number` is used to distinguish different runs. \n\n### Downloading Results\nOur DAIN model achieves state-of-the-art performance on the UCF101, Vimeo90K, and Middlebury ([*eval*](http://vision.middlebury.edu/flow/eval/results/results-n1.php) and *other*) datasets.\nDownload our interpolated results with:\n    \n    $ wget http://vllab1.ucmerced.edu/~wenbobao/DAIN/UCF101_DAIN.zip\n    $ wget http://vllab1.ucmerced.edu/~wenbobao/DAIN/Vimeo90K_interp_DAIN.zip\n    $ wget http://vllab1.ucmerced.edu/~wenbobao/DAIN/Middlebury_eval_DAIN.zip\n    $ wget http://vllab1.ucmerced.edu/~wenbobao/DAIN/Middlebury_other_DAIN.zip\n    \n    \n### Slow-motion Generation\nOur model is fully capable of generating slow-motion effects with minor modifications to the network architecture.\nRun the following command with `time_step = 0.25` to generate a x4 slow-motion effect:\n\n    $ CUDA_VISIBLE_DEVICES=0 python demo_MiddleBury_slowmotion.py --netName DAIN_slowmotion --time_step 0.25\n\nor set `time_step` to `0.125` or `0.1` as follows:\n\n    $ CUDA_VISIBLE_DEVICES=0 python demo_MiddleBury_slowmotion.py --netName DAIN_slowmotion --time_step 0.125\n    $ CUDA_VISIBLE_DEVICES=0 python demo_MiddleBury_slowmotion.py --netName DAIN_slowmotion --time_step 0.1\nto generate x8 and x10 slow-motion 
respectively. Or, for a little fun, generate x100 slow-motion:\n    \n    $ CUDA_VISIBLE_DEVICES=0 python demo_MiddleBury_slowmotion.py --netName DAIN_slowmotion --time_step 0.01\n\nYou may also want to create GIF animations with:\n    \n    $ cd MiddleBurySet/other-result-author/[random number]/Beanbags\n    $ convert -delay 1 *.png -loop 0 Beanbags.gif  # 1*10ms delay \n\nHave fun and enjoy yourself! \n\n\n### Training New Models\nDownload the Vimeo90K triplet dataset for the video frame interpolation task; see also [here](https://github.com/anchen1011/toflow/blob/master/download_dataset.sh) by [Xue et al., IJCV19](https://arxiv.org/abs/1711.09078).\n    \n    $ cd DAIN\n    $ mkdir /path/to/your/dataset \u0026\u0026 cd /path/to/your/dataset \n    $ wget http://data.csail.mit.edu/tofu/dataset/vimeo_triplet.zip\n    $ unzip vimeo_triplet.zip\n    $ rm vimeo_triplet.zip\n\nDownload the pretrained MegaDepth and PWCNet models:\n    \n    $ cd MegaDepth/checkpoints/test_local\n    $ wget http://vllab1.ucmerced.edu/~wenbobao/DAIN/best_generalization_net_G.pth\n    $ cd ../../../PWCNet\n    $ wget http://vllab1.ucmerced.edu/~wenbobao/DAIN/pwc_net.pth.tar\n    $ cd ..\n    \nRun the training script:\n\n    $ CUDA_VISIBLE_DEVICES=0 python train.py --datasetPath /path/to/your/dataset --batch_size 1 --save_which 1 --lr 0.0005 --rectify_lr 0.0005 --flow_lr_coe 0.01 --occ_lr_coe 0.0 --filter_lr_coe 1.0 --ctx_lr_coe 1.0 --alpha 0.0 1.0 --patience 4 --factor 0.2\n    \nThe optimized models will be saved to the `model_weights/[random number]` directory, where `[random number]` is generated for each run.\n\nReplace the pre-trained `model_weights/best.pth` model with the newly trained `model_weights/[random number]/best.pth` model.\nThen test the new model by executing: \n\n    $ CUDA_VISIBLE_DEVICES=0 python demo_MiddleBury.py\n\n### Google Colab Demo\nThis is a modification of DAIN that runs on Google Colab and can perform a full demo interpolation from a 
source video to a target video.\n\nThe original notebook file by btahir can be found [here](https://github.com/baowenbo/DAIN/issues/44).\n\nTo use the Colab, follow these steps:\n\n- Download the `Colab_DAIN.ipynb` file ([link](https://raw.githubusercontent.com/baowenbo/DAIN/master/Colab_DAIN.ipynb)).\n- Visit Google Colaboratory ([link](https://colab.research.google.com/)).\n- Select the \"Upload\" option, and upload the `.ipynb` file.\n- Start running the cells one by one, following the instructions.\n\nColab file authors: [Styler00Dollar](https://github.com/styler00dollar) and [Alpha](https://github.com/AlphaGit).\n\n### Contact\n[Wenbo Bao](mailto:bwb0813@gmail.com); [Wei-Sheng (Jason) Lai](mailto:phoenix104104@gmail.com)\n\n### License\nSee [MIT License](https://github.com/baowenbo/DAIN/blob/master/LICENSE)\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fbaowenbo%2FDAIN","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fbaowenbo%2FDAIN","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fbaowenbo%2FDAIN/lists"}