{"id":18809942,"url":"https://github.com/anchen1011/toflow","last_synced_at":"2025-04-05T18:09:02.627Z","repository":{"id":49901590,"uuid":"104689910","full_name":"anchen1011/toflow","owner":"anchen1011","description":"TOFlow: Video Enhancement with Task-Oriented Flow","archived":false,"fork":false,"pushed_at":"2019-11-11T17:00:23.000Z","size":119637,"stargazers_count":449,"open_issues_count":8,"forks_count":91,"subscribers_count":14,"default_branch":"master","last_synced_at":"2025-03-29T17:11:32.101Z","etag":null,"topics":["dataset","deep-learning","interpolation","motion-detection","optical-flow","super-resolution","video","video-deblocking","video-demo","video-denoising","video-processing"],"latest_commit_sha":null,"homepage":"http://toflow.csail.mit.edu","language":"MATLAB","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/anchen1011.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null}},"created_at":"2017-09-25T01:30:00.000Z","updated_at":"2025-03-23T19:20:02.000Z","dependencies_parsed_at":"2022-08-12T20:50:35.909Z","dependency_job_id":null,"html_url":"https://github.com/anchen1011/toflow","commit_stats":null,"previous_names":[],"tags_count":0,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/anchen1011%2Ftoflow","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/anchen1011%2Ftoflow/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/anchen1011%2Ftoflow/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/anchen1011%2Ftoflow/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitH
ub/owners/anchen1011","download_url":"https://codeload.github.com/anchen1011/toflow/tar.gz/refs/heads/master","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":247378149,"owners_count":20929297,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["dataset","deep-learning","interpolation","motion-detection","optical-flow","super-resolution","video","video-deblocking","video-demo","video-denoising","video-processing"],"created_at":"2024-11-07T23:18:17.094Z","updated_at":"2025-04-05T18:09:02.595Z","avatar_url":"https://github.com/anchen1011.png","language":"MATLAB","funding_links":[],"categories":[],"sub_categories":[],"readme":"# TOFlow: Video Enhancement with Task-Oriented Flow\n\nThis repository is based on our IJCV publication *TOFlow: Video Enhancement with Task-Oriented Flow* ([PDF](http://toflow.csail.mit.edu/toflow_ijcv.pdf)). It contains pre-trained models and a demo code. It also includes the description and download scripts for the Vimeo-90K dataset we collected. 
If you use this code or dataset in your work, please cite:\n\n```\n@article{xue2019video,\n  title={Video Enhancement with Task-Oriented Flow},\n  author={Xue, Tianfan and Chen, Baian and Wu, Jiajun and Wei, Donglai and Freeman, William T},\n  journal={International Journal of Computer Vision (IJCV)},\n  volume={127},\n  number={8},\n  pages={1106--1125},\n  year={2019},\n  publisher={Springer}\n}\n```\n\n## Video Demo\n\n[![IMAGE ALT TEXT](data/doc/video.png)](http://www.youtube.com/watch?v=msC5GK9aV9Q \"Video Demo\")\n\nIf you cannot access YouTube, please download the 1080p video from [here](http://toflow.csail.mit.edu/toflow.mp4).\n\n## Prerequisites\n\n#### Torch\nOur implementation is based on Torch 7 (http://torch.ch).\n\n#### CUDA [optional]\nCUDA is recommended (https://developer.nvidia.com/cuda-toolkit) for fast inference. The demo code still runs without CUDA, but much more slowly.\n\n#### Matlab [optional]\nMatlab (https://www.mathworks.com/products/matlab.html) is required for generating the video denoising/super-resolution datasets and for quantitative evaluation. It is not necessary for the demo code.\n\n#### FFmpeg [optional]\nWe use FFmpeg (http://ffmpeg.org) for generating the video deblocking dataset. It is not necessary for the demo code.\n\n## Installation\nOur current release has been tested on Ubuntu 14.04.\n\n#### Clone the repository\n```sh\ngit clone https://github.com/anchen1011/toflow.git\n```\n\n#### Install dependency\n```sh\ncd toflow/src/stnbhwd\nluarocks make\n```\nThis will install the 'stn' package for Lua. 
The list of components:\n```lua\nrequire 'stn'\nnn.AffineGridGeneratorBHWD(height, width)\n-- takes B x 2 x 3 affine transform matrices as input, \n-- outputs a height x width grid in normalized [-1,1] coordinates\n-- output layout is B,H,W,2 where the first coordinate in the 4th dimension is y, and the second is x\nnn.BilinearSamplerBHWD()\n-- takes a table {inputImages, grids} as input\n-- outputs the interpolated images according to the grids\n-- inputImages is a batch of samples in BHWD layout\n-- grids is a batch of grids (output of AffineGridGeneratorBHWD)\n-- output is also BHWD\nnn.AffineTransformMatrixGenerator(useRotation, useScale, useTranslation)\n-- takes a B x nbParams tensor as input\n-- nbParams depends on the constrained transformation\n-- The parameters for the selected transformation(s) should be supplied in the\n-- following order: rotationAngle, scaleFactor, translationX, translationY\n-- If no transformation is specified, it generates a generic affine transformation (nbParams = 6)\n-- outputs B x 2 x 3 affine transform matrices\n```\n\n#### Download pretrained models (104MB) \n```sh\ncd ../../\n./download_models.sh\n``` \n\n## Run Demo Code\n```sh\ncd src\nth demo.lua -mode interp -inpath ../data/example/low_frame_rate\nth demo.lua -mode denoise -inpath ../data/example/noisy\nth demo.lua -mode deblock -inpath ../data/example/block\nth demo.lua -mode sr -inpath ../data/example/blur\n```\n\nThere are a few options in demo.lua:\n\n**nocuda**: Set this option when CUDA is not available.\n\n**gpuId**: GPU device ID.\n\n**mode**: There are four options:\n- 'interp': temporal frame interpolation\n- 'denoise': video denoising \n- 'deblock': video deblocking\n- 'sr': video super-resolution\n\n**inpath**: The path to the input sequence.\n\n**outpath**: The path where the results are stored (default is ../demo_output).\n\n\n## Vimeo-90K Dataset\n\nWe also build a large-scale, high-quality video dataset, Vimeo-90K, designed for the following four video 
processing tasks: temporal frame interpolation, video denoising, video deblocking, and video super-resolution.\n\nVimeo-90K is built upon 5,846 selected videos downloaded from [vimeo.com](vimeo.com), which cover a large variety of scenes and actions. This video set is a subset of the [AoT dataset](https://github.com/donglaiw/AoT_Dataset), and all video links are [here](data/original_vimeo_links.txt).\n\n![This image cannot be displayed. Please open this link in another browser: https://github.com/anchen1011/toflow/raw/master/data/doc/dataset.png](data/doc/dataset.png)\n\nWe further chop these videos into 89,800 video clips and build two datasets from these clips:\n\n#### Triplet dataset for temporal frame interpolation\n\nThe triplet dataset consists of 73,171 3-frame sequences with a fixed resolution of 448 x 256, extracted from 15k selected video clips from Vimeo-90K. This dataset is designed for temporal frame interpolation. Download links are:\n\nTest set only: [zip (1.7GB)](http://data.csail.mit.edu/tofu/testset/vimeo_interp_test.zip).\n\nBoth training and test set: [zip (33GB)](http://data.csail.mit.edu/tofu/dataset/vimeo_triplet.zip).\n\n#### Septuplet dataset for video denoising, super-resolution, and deblocking\n\nThe septuplet dataset consists of 91,701 7-frame sequences with a fixed resolution of 448 x 256, extracted from 39k selected video clips from Vimeo-90K. 
This dataset is designed for video denoising, deblocking, and super-resolution.\n\nThe test set for video denoising: [zip (16GB)](http://data.csail.mit.edu/tofu/testset/vimeo_denoising_test.zip).\n\nThe test set for video deblocking: [zip (11GB)](http://data.csail.mit.edu/tofu/testset/vimeo_sep_block.zip).\n\nThe test set for video super-resolution: [zip (6GB)](http://data.csail.mit.edu/tofu/testset/vimeo_super_resolution_test.zip).\n\nThe original test set (not downsampled or degraded by noise): [zip (15GB)](http://data.csail.mit.edu/tofu/testset/vimeo_test_clean.zip).\n\nThe original training + test set (91,701 sequences, not downsampled or degraded by noise): [zip (82GB)](http://data.csail.mit.edu/tofu/dataset/vimeo_septuplet.zip).\n\n#### Generate Testing Sequences\n\nSee src/generate_testing_sample for the functions to generate noisy/low-resolution sequences.\n\nTo generate noisy sequences with Matlab under src/generate_testing_sample, run\n```\nadd_noise_to_input(data_path, output_path);\n``` \nand the results will be stored under output_path.\n\nTo generate blurred sequences with Matlab, run\n```\nblur_input(data_path, output_path);\n```\nand the results will be stored under output_path.\n\nBlocky sequences are compressed by FFmpeg. Our test set is generated with the following configuration:\n```sh\nffmpeg -i *.png -q 20 -vcodec jpeg2000 -format j2k name.mov \n```\n\n## Run Quantitative Evaluation\n\n#### Download all four Vimeo test sets (52GB) \n```sh\n./download_testset.sh\n``` \n\n#### Run inference on Vimeo test sets\n```sh\ncd src\nth demo_vimeo90k.lua -mode interp\nth demo_vimeo90k.lua -mode denoise\nth demo_vimeo90k.lua -mode deblock\nth demo_vimeo90k.lua -mode sr\n```\n\n#### Evaluation\n\nWe use three metrics to evaluate the performance of our algorithm: PSNR, SSIM, and Abs metrics. 
To run the evaluation, execute the following commands in Matlab:\n```\ncd src/evaluation\nevaluate(output_dir, target_dir);\n``` \n\nFor example, to evaluate results generated in the previous step, run\n```\ncd src/evaluation\nevaluate('../../output/interp', '../../data/vimeo_interp_test/target', 'interp')\nevaluate('../../output/denoise', '../../data/vimeo_test_clean/sequences', 'denoise')\nevaluate('../../output/deblock', '../../data/vimeo_test_clean/sequences', 'deblock')\nevaluate('../../output/sr', '../../data/vimeo_test_clean/sequences', 'sr')\n```\n\nIt is assumed that our datasets are unzipped under data/ and not renamed. It is also assumed that results are put under [output_root]/[task_name] (e.g. output/sr, output/interp, output/denoise, output/deblock), with exactly the same subfolder structure as our datasets.\n\n## References\n1. Our warping code is based on [qassemoquab/stnbhwd](https://github.com/qassemoquab/stnbhwd).\n2. Our flow utilities and transformation utilities are based on [anuragranj/spynet](https://github.com/anuragranj/spynet).\n3. There is an unofficial PyTorch implementation by [coldog2333/pytoflow](https://github.com/Coldog2333/pytoflow).\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fanchen1011%2Ftoflow","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fanchen1011%2Ftoflow","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fanchen1011%2Ftoflow/lists"}