{"id":22068926,"url":"https://github.com/zzh-tech/InterpAny-Clearer","last_synced_at":"2025-07-24T07:31:43.018Z","repository":{"id":203980626,"uuid":"710830571","full_name":"zzh-tech/InterpAny-Clearer","owner":"zzh-tech","description":"[ECCV2024 Oral] Clearer anytime frame interpolation \u0026 Manipulated interpolation of anything","archived":false,"fork":false,"pushed_at":"2024-08-13T04:56:44.000Z","size":140451,"stargazers_count":193,"open_issues_count":5,"forks_count":12,"subscribers_count":6,"default_branch":"main","last_synced_at":"2024-08-14T04:16:45.703Z","etag":null,"topics":["computer-vision","deep-learning","eccv2024","image-manipulation","slomo-filter","temporal-super-resolution","video-editing","video-frame-interpolation","video-generation","video-interpolation","webapp"],"latest_commit_sha":null,"homepage":"https://zzh-tech.github.io/InterpAny-Clearer/","language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/zzh-tech.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2023-10-27T14:22:40.000Z","updated_at":"2024-08-14T04:16:45.704Z","dependencies_parsed_at":null,"dependency_job_id":"9e7d4ecf-669c-4e2e-ab91-141de6f72611","html_url":"https://github.com/zzh-tech/InterpAny-Clearer","commit_stats":null,"previous_names":["zzh-tech/interpany-clearer"],"tags_count":0,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/zzh-tech%2FInterpAny-Clearer","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/zzh-tech%2FInterpAny-Clearer/tags",
"releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/zzh-tech%2FInterpAny-Clearer/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/zzh-tech%2FInterpAny-Clearer/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/zzh-tech","download_url":"https://codeload.github.com/zzh-tech/InterpAny-Clearer/tar.gz/refs/heads/main","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":227421374,"owners_count":17775011,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["computer-vision","deep-learning","eccv2024","image-manipulation","slomo-filter","temporal-super-resolution","video-editing","video-frame-interpolation","video-generation","video-interpolation","webapp"],"created_at":"2024-11-30T20:04:29.949Z","updated_at":"2024-11-30T20:07:25.085Z","avatar_url":"https://github.com/zzh-tech.png","language":"Python","funding_links":[],"categories":["Paper List"],"sub_categories":["Follow-up Papers"],"readme":"# InterpAny-Clearer\n\n#### :rocket: \u003cu style=\"color: hotpink; text-decoration: underline dotted hotpink;\"\u003e[ECCV2024 Oral] Clearer Frames,\u003c/u\u003e \u003cu style=\"color: dodgerblue; text-decoration: underline dotted dodgerblue;\"\u003eAnytime\u003c/u\u003e: Resolving Velocity Ambiguity in Video Frame Interpolation\n\nby [Zhihang Zhong](https://zzh-tech.github.io/)\u003csup\u003e\n1,*\u003c/sup\u003e, [Gurunandan Krishnan](https://scholar.google.com/citations?user=BKYVv4MAAAAJ\u0026hl=en)\u003csup\u003e\n2\u003c/sup\u003e, [Xiao 
Sun](https://jimmysuen.github.io/)\u003csup\u003e\n1\u003c/sup\u003e, [Yu Qiao](https://scholar.google.com/citations?user=gFtI-8QAAAAJ\u0026hl=en)\u003csup\u003e\n1\u003c/sup\u003e, [Sizhuo Ma](https://sizhuoma.netlify.app/)\u003csup\u003e2,†\u003c/sup\u003e, and [Jian Wang](https://jianwang-cmu.github.io/)\u003csup\u003e\n2,†\u003c/sup\u003e\n\n\u003csup\u003e*\u003c/sup\u003eFirst author, \u003csup\u003e†\u003c/sup\u003eCo-corresponding authors\n\n\u003csup\u003e1\u003c/sup\u003e[Shanghai AI Laboratory, OpenGVLab](https://github.com/OpenGVLab), \u003csup\u003e\n2\u003c/sup\u003e[Snap Inc.](https://snap.com/en-US)\n\n\u003cbr\u003e\nWe strongly recommend referring to the project page and interactive demo for a better understanding:\n\n:point_right: [**project page**](https://zzh-tech.github.io/InterpAny-Clearer/)  \n:point_right: [**interactive demo**](http://ai4sports.opengvlab.com/interpany-clearer/)  \n:point_right: [**OpenXLab demo**](https://openxlab.org.cn/apps/detail/ZhihangZhong/InterpAny-Clearer)  \n:point_right: [arXiv](http://arxiv.org/abs/2311.08007)  \n:point_right: [slides](https://docs.google.com/presentation/d/1_aIkH_iZUZ2sdSRO9eict1HNAJbX-vQs/edit?usp=sharing\u0026ouid=116575787119851482947\u0026rtpof=true\u0026sd=true)\n\nPlease leave a 🌟 if you like this project! 🔥🔥🔥\n\n#### News\n- :tada: **2024-08-12**: Luckily, this work is recognized as **Oral** by ECCV2024! 🏁\n- :tada: **2024-07-01**: This work is accepted to ECCV2024! 🎆\n- :tada: **2024-05-02**: Our technology is used by [CCTV5 and CCTV5+](./demo/cctv5_interpany-clearer.mp4) for slow motion demonstrations of athletes jumping in the 2024 Thomas \u0026 Uber Cup! 
🔥\n- :tada: **2023-11-28**: We have added an interface for video inference to\n  the [interactive demo](http://ai4sports.opengvlab.com/interpany-clearer/), and\n  uploaded [checkpoints](https://drive.google.com/drive/folders/1zCyySQT7Or9P2Q2qOhG116RRdcaDsjr5?usp=sharing) trained\n  with the LPIPS loss.\n\n#### Application in CCTV\n\nhttps://github.com/zzh-tech/InterpAny-Clearer/assets/68437458/a4b0bc95-d051-45ac-aaf1-400266a290d2\n\n#### TL;DR:\n\nWe addressed velocity ambiguity in video frame interpolation through innovative distance indexing and iterative\nreference-based\nestimation strategies, resulting in:  \n\u003cb style=\"color: orangered\"\u003eClearer anytime frame interpolation\u003c/b\u003e \u0026 \u003cb style=\"color: orangered\"\u003eManipulated\ninterpolation of anything\u003c/b\u003e\n\n\u003cimg src=\"./demo/teaser.jpg\"\u003e\n\n#### Time indexing vs. Distance indexing\n\nComparison of x128 interpolation using only 2 frames as inputs:\n\n\u003ctable style=\"width: 1200px\"\u003e\n  \u003ctr\u003e\n    \u003ctd align=\"center\" style=\"font-size:18px; border: none;\"\u003e[T] RIFE\u003c/td\u003e\n    \u003ctd align=\"center\" style=\"font-size:18px; border: none;\"\u003e[D,R] RIFE (Ours)\u003c/td\u003e\n    \u003ctd align=\"center\" style=\"font-size:18px; border: none;\"\u003e[D,R] RIFE-vgg (Ours)\u003c/td\u003e\n  \u003c/tr\u003e\n  \u003ctr\u003e\n    \u003ctd valign=\"top\" style=\"border: none;\"\u003e\u003cimg src=\"demo/T-RIFE_0.gif\"\u003e\u003c/td\u003e\n    \u003ctd valign=\"top\" style=\"border: none;\"\u003e\u003cimg src=\"demo/DR-RIFE_0.gif\"\u003e\u003c/td\u003e\n    \u003ctd valign=\"top\" style=\"border: none;\"\u003e\u003cimg src=\"demo/DR-RIFE-vgg_0.gif\"\u003e\u003c/td\u003e\n  \u003c/tr\u003e\n  \u003ctr\u003e\n    \u003ctd valign=\"top\" style=\"border: none;\"\u003e\u003cimg src=\"demo/T-RIFE_1.gif\"\u003e\u003c/td\u003e\n    \u003ctd valign=\"top\" style=\"border: none;\"\u003e\u003cimg 
src=\"demo/DR-RIFE_1.gif\"\u003e\u003c/td\u003e\n    \u003ctd valign=\"top\" style=\"border: none;\"\u003e\u003cimg src=\"demo/DR-RIFE-vgg_1.gif\"\u003e\u003c/td\u003e\n  \u003c/tr\u003e\n\u003c/table\u003e\n\n[D]: distance indexing   \n[R]: iterative reference-based estimation   \n\nThe results of [D,R] RIFE-vgg are perceptually clearer, but may suffer from undesirable distortions (see second row). We\nrecommend using [D,R] RIFE for more stable results.\n\n## Preparation\n\n### Environment installation:\n\nYou can use Anaconda or Docker to set up the environment.\n\n#### Anaconda\n\n```shell\nconda create -n InterpAny python=3.8\nconda activate InterpAny\npip install torch==1.12.1+cu116 torchvision==0.13.1+cu116 torchaudio==0.12.1 --extra-index-url https://download.pytorch.org/whl/cu116\npip install -r requirements.txt\n```\n\n#### Docker\n\nYou can build a Docker image with all dependencies installed.  \nSee [docker/README.md](./docker/README.md) for more details.\n\n### Download checkpoints\nWe provide checkpoints for four different models: RIFE, IFRNet, AMT-S, and EMA-VFI.\n\nDownload checkpoints\nfrom [here (full)](https://drive.google.com/file/d/14GJSqsX4H5EcQjd-tLb5CM_jzD-577bl/view?usp=sharing) / [here (separate)](https://drive.google.com/drive/folders/11MY60fpDk5oAlGasQRZ3ss3xVdrOBIiE?usp=sharing).\n\n*P.S., RIFE-pro denotes the RIFE model trained with more data and epochs; RIFE-vgg denotes the RIFE model trained with\nthe LPIPS loss.*\n\n## Inference\n\n### Two images\n\n```shell\npython inference_img.py --img0 [IMG_0] --img1 [IMG_1] --output_dir [OUTPUT_DIR] --model [MODEL_NAME] --variant [VARIANT] --num [NUM] --gif\n```\n\nExamples:\n\n`python inference_img.py --img0 ./demo/I0_0.png --img1 ./demo/I0_1.png --model RIFE --variant DR --checkpoint ./checkpoints/RIFE/DR-RIFE-pro --save_dir ./results/I0_results_DR-RIFE-pro --num 1 1 1 1 1 1 1 --gif`\n\n`python inference_img.py --img0 ./demo/I0_0.png --img1 ./demo/I0_1.png --model RIFE --variant DR 
--checkpoint ./checkpoints/RIFE/DR-RIFE-vgg --save_dir ./results/I0_results_DR-RIFE-vgg --num 1 1 1 1 1 1 1 --gif`\n\n`python inference_img.py --img0 ./demo/I0_0.png --img1 ./demo/I0_1.png --model EMA-VFI --variant DR --checkpoint ./checkpoints/EMA-VFI/DR-EMA-VFI --save_dir ./results/I0_results_DR-EMA-VFI/ --num 1 1 1 1 1 1 1 --gif`\n\n`--num NUM` interpolates `NUM` frames between every two frames.   \n`--num NUM1 NUM2 ...` means that `NUM1` frames are interpolated between every two frames, then `NUM2` frames are interpolated between every two frames of that result, and so on.\n\n### Video\n\n```shell\npython inference_video.py --video [VIDEO] --output_dir [OUTPUT_DIR] --model [MODEL_NAME] --variant [VARIANT] --num [NUM]\n```\n\nExamples:\n\n`python inference_video.py --video ./demo/demo.mp4 --model RIFE --variant DR --checkpoint ./checkpoints/RIFE/DR-RIFE-pro --save_dir ./results/demo_results_DR-RIFE-pro --num 3 --fps 15`\n\nP.S., without `--fps`, the output video has the same fps as the input video.\n\n## Manipulation\n\n### Manipulated interpolation of anything\n\n\u003cimg src=\"./demo/manipulation.jpg\"/\u003e\n\n### Demos\n\n\u003ctable\u003e\n  \u003ctr\u003e\n    \u003ctd\u003e\u003cimg src=\"./demo/manipulation1.gif\"\u003e\u003c/td\u003e\n    \u003ctd\u003e\u003cimg src=\"./demo/manipulation2.gif\"\u003e\u003c/td\u003e\n    \u003ctd\u003e\u003cimg src=\"./demo/manipulation3.gif\"\u003e\u003c/td\u003e\n  \u003c/tr\u003e\n\u003c/table\u003e\n\n### Webapp\n\nYou can play with the [interactive demo](http://ai4sports.opengvlab.com/interpany-clearer/) or install the webapp\nlocally.\n\n#### Install the webapp locally\n\nP.S., not required if you use Docker\n\nFollow [./webapp/backend/README.md](./webapp/backend/README.md) to set up the environment for Segment Anything.  
\nFollow [./webapp/webapp/README.md](./webapp/webapp/README.md) to set up the environment for the webapp.\n\n#### Run the app\n\n```shell\ncd ./webapp/backend/\npython app.py\n\n# open a new terminal\ncd ./webapp/webapp/\nyarn \u0026\u0026 yarn start\n```\n\n## Dataset\n\nYou can download the split Vimeo90K dataset with our distance indexing maps\nfrom [here](https://drive.google.com/drive/folders/1mYPjleTX3P069hghOad3plGDrUp4d7xJ?usp=sharing) (\nor [full dataset](https://drive.google.com/file/d/1qImY1rLNIcgOu4sX6cQi02br-p-HNs9H/view?usp=sharing)), and then merge\nthem:\n\n```shell\ncat vimeo_septuplet_split.zipa* \u003e vimeo_septuplet_split.zip\n```\n\nAlternatively, you can download the original Vimeo90K dataset from [here](http://toflow.csail.mit.edu/), and then generate\ndistance indexing (P.S.\nDownload [checkpoints](https://drive.google.com/drive/folders/1sWDsfuZ3Up38EUQt7-JDTT1HcGHuJgvT?usp=sharing) for RAFT\nand put them under `./RAFT/models/` in advance):\n\n```shell\npython multiprocess_create_dis_index.py\n```\n\n## Train\n\nTraining command:\n\n```shell\npython train.py --model [MODEL_NAME] --variant [VARIANT]\n```\n\nExamples:\n\n`python train.py --model RIFE --variant D`\n\n`python train.py --model RIFE --variant DR`\n\n`python train.py --model AMT-S --variant D`\n\n`python train.py --model AMT-S --variant DR`\n\n## Test\n\nTesting with precomputed distance maps:\n\n```shell\npython test.py --model [MODEL_NAME] --variant [VARIANT]\n```\n\nExamples:\n\n`python test.py --model RIFE --variant D`\n\n`python test.py --model RIFE --variant DR`\n\nTesting using uniform distance maps with the same inputs as the time indexes:\n\n```shell\npython test.py --model [MODEL_NAME] --variant [VARIANT] --uniform\n```\n\nExamples:\n\n`python test.py --model RIFE --variant D --uniform`\n\n`python test.py --model RIFE --variant DR --uniform`\n\n## Citation\n\nIf you find this repository useful, please consider citing:\n\n```bibtex\n@article{zhong2023clearer,\n  
title={Clearer Frames, Anytime: Resolving Velocity Ambiguity in Video Frame Interpolation},\n  author={Zhong, Zhihang and Krishnan, Gurunandan and Sun, Xiao and Qiao, Yu and Ma, Sizhuo and Wang, Jian},\n  journal={arXiv preprint arXiv:2311.08007},\n  year={2023}\n}\n```\n\n## Acknowledgements\n\nWe thank Dorian Chan, Zhirong Wu, and Stephen Lin for their insightful feedback and advice. Our thanks also go to Vu An\nTran for developing the web application, and to Wei Wang for coordinating the user study.\n\nMoreover, we appreciate the following projects for releasing their code:\n\n[[CVPR 2018] The Unreasonable Effectiveness of Deep Features as a Perceptual Metric](https://github.com/richzhang/PerceptualSimilarity)  \n[[ECCV 2020] RAFT: Recurrent All Pairs Field Transforms for Optical Flow](https://github.com/princeton-vl/RAFT)  \n[[ECCV 2022] Real-Time Intermediate Flow Estimation for Video Frame Interpolation](https://github.com/megvii-research/ECCV2022-RIFE)  \n[[CVPR 2022] IFRNet: Intermediate Feature Refine Network for Efficient Frame Interpolation](https://github.com/ltkong218/IFRNet)  \n[[CVPR 2023] AMT: All-Pairs Multi-Field Transforms for Efficient Frame Interpolation](https://github.com/MCG-NKU/AMT)  \n[[CVPR 2023] Extracting Motion and Appearance via Inter-Frame Attention for Efficient Video Frame Interpolation](https://github.com/MCG-NJU/EMA-VFI)  \n[[ICCV 2023] Segment Anything](https://github.com/facebookresearch/segment-anything)  \n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fzzh-tech%2FInterpAny-Clearer","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fzzh-tech%2FInterpAny-Clearer","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fzzh-tech%2FInterpAny-Clearer/lists"}