{"id":18279041,"url":"https://github.com/anuragranj/spynet","last_synced_at":"2025-04-03T03:11:01.524Z","repository":{"id":45205243,"uuid":"68856596","full_name":"anuragranj/spynet","owner":"anuragranj","description":"Spatial Pyramid Network for Optical Flow","archived":false,"fork":false,"pushed_at":"2020-09-23T19:23:25.000Z","size":67185,"stargazers_count":240,"open_issues_count":4,"forks_count":48,"subscribers_count":15,"default_branch":"master","last_synced_at":"2025-03-24T09:19:13.689Z","etag":null,"topics":["convolutional-networks","deep-learning","optical-flow","spatial-pyramid-network","spynet"],"latest_commit_sha":null,"homepage":"","language":"Lua","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"other","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/anuragranj.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null}},"created_at":"2016-09-21T20:52:21.000Z","updated_at":"2025-03-22T12:50:32.000Z","dependencies_parsed_at":"2022-08-12T11:41:07.266Z","dependency_job_id":null,"html_url":"https://github.com/anuragranj/spynet","commit_stats":null,"previous_names":[],"tags_count":0,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/anuragranj%2Fspynet","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/anuragranj%2Fspynet/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/anuragranj%2Fspynet/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/anuragranj%2Fspynet/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/anuragranj","download_url":"https://codeload.github.com/anuragranj/spynet/tar.gz/refs/heads/master","host":{"n
ame":"GitHub","url":"https://github.com","kind":"github","repositories_count":246927835,"owners_count":20856198,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["convolutional-networks","deep-learning","optical-flow","spatial-pyramid-network","spynet"],"created_at":"2024-11-05T12:27:11.799Z","updated_at":"2025-04-03T03:11:01.494Z","avatar_url":"https://github.com/anuragranj.png","language":"Lua","funding_links":[],"categories":["8. Optical Flow \u0026 Scene Flow","7. Optical Flow \u0026 Scene Flow"],"sub_categories":["2.2 Multi View","2.3 自监督"],"readme":"# SPyNet: Spatial Pyramid Network for Optical Flow\nThis code is based on the paper [Optical Flow Estimation using a Spatial Pyramid Network](https://arxiv.org/abs/1611.00850). 
\n\n[[Unofficial PyTorch version](https://github.com/sniklaus/pytorch-spynet)]  [[Unofficial TensorFlow version](https://github.com/tukilabs/Video-Compression-Net/blob/master/utils/network.py)]\n\n* [First things first:](#setUp)  Setting up this code\n* [Easy Usage:](#easyUsage) Compute Optical Flow in 5 lines\n* [Fast Performance Usage:](#fastPerformanceUsage) Compute Optical Flow at rocket speed\n* [Training:](#training) Train your own models using the Spatial Pyramid approach on multiple GPUs\n* [End2End SPyNet:](#end2end) An easily trainable end-to-end version of SPyNet\n* [Optical Flow Utilities:](#flowUtils) A set of functions in Lua for working with optical flow\n* [References:](#references) For further reading\n\n\u003ca name=\"setUp\"\u003e\u003c/a\u003e\n## First things first\nYou need to have [Torch.](http://torch.ch/docs/getting-started.html#_)\n\nInstall other required packages\n```bash\ncd extras/spybhwd\nluarocks make\ncd ../stnbhwd\nluarocks make\n```\n\u003ca name=\"easyUsage\"\u003e\u003c/a\u003e\n## For Easy Usage, follow this\n#### Set up SPyNet\n```lua\nspynet = require('spynet')\neasyComputeFlow = spynet.easy_setup()\n```\n#### Load images and compute flow\n```lua\nim1 = image.load('samples/00001_img1.ppm' )\nim2 = image.load('samples/00001_img2.ppm' )\nflow = easyComputeFlow(im1, im2)\n```\nTo save your flow fields to a .flo file use [flowExtensions.writeFLO](#writeFLO).\n\n\u003ca name=\"fastPerformanceUsage\"\u003e\u003c/a\u003e\n## For Fast Performance, follow this (recommended)\n#### Set up SPyNet\nSet up SPyNet according to the image size and model. For optimal performance, resize your image so that its width and height are multiples of 32. You can also specify your favorite model. The currently supported models are the fine-tuned models `sintelFinal` (default), `sintelClean`, `kittiFinal`, and the base models `chairsFinal` and `chairsClean`. 
\n```lua\nspynet = require('spynet')\ncomputeFlow = spynet.setup(512, 384, 'sintelFinal')    -- for 384x512 images\n```\nNow you can call `computeFlow` anytime to estimate optical flow between image pairs.\n\n#### Computing flow\nLoad an image pair, stack the two images along the channel dimension, and normalize the result.\n```lua\nim1 = image.load('samples/00001_img1.ppm' )\nim2 = image.load('samples/00001_img2.ppm' )\nim = torch.cat(im1, im2, 1)\nim = spynet.normalize(im)\n```\nSPyNet works with batches of data on CUDA, so compute flow using\n```lua\nim = im:resize(1, im:size(1), im:size(2), im:size(3)):cuda()\nflow = computeFlow(im)\n```\nYou can also use batch mode: if your images `im` form a tensor of size `Bx6xHxW`, with batch size `B` and 6 channels per RGB image pair, you can directly use:\n```lua\nflow = computeFlow(im)\n```\n\u003ca name=\"training\"\u003e\u003c/a\u003e\n## Training\nTraining sequentially is faster than training end-to-end, since you only need to learn a small number of parameters at each level. To train level `N`, we need the trained models at levels `1` to `N-1`. You also initialize the model with the pretrained model at level `N-1`.\n\nFor example, to train level 3, we need trained models at `L1` and `L2`, and we initialize it with `modelL2_3.t7`.\n```bash\nth main.lua -fineWidth 128 -fineHeight 96 -level 3 -netType volcon \\\n-cache checkpoint -data FLYING_CHAIRS_DIR \\\n-L1 models/modelL1_3.t7 -L2 models/modelL2_3.t7 \\\n-retrain models/modelL2_3.t7\n```\n\u003ca name=\"end2end\"\u003e\u003c/a\u003e\n## End2End SPyNet\nThe end-to-end version of SPyNet is easily trainable and is available at [anuragranj/end2end-spynet](https://github.com/anuragranj/end2end-spynet).\n\n\u003ca name=\"flowUtils\"\u003e\u003c/a\u003e\n## Optical Flow Utilities\nWe provide `flowExtensions.lua` containing various functions to make your life easier with optical flow while using Torch/Lua. 
You can just copy this file into your project directory and use it off the shelf.\n```lua\nflowX = require 'flowExtensions'\n```\n#### [flow_magnitude] flowX.computeNorm(flow_x, flow_y)\nGiven `flow_x` and `flow_y` of size `MxN` each, evaluate `flow_magnitude` of size `MxN`.\n\n#### [flow_angle] flowX.computeAngle(flow_x, flow_y)\nGiven `flow_x` and `flow_y` of size `MxN` each, evaluate `flow_angle` of size `MxN` in degrees.\n\n#### [rgb] flowX.field2rgb(flow_magnitude, flow_angle, [max], [legend])\nGiven `flow_magnitude` and `flow_angle` of size `MxN` each, return an image of size `3xMxN` for visualizing optical flow. `max` (optional) specifies the maximum flow magnitude and `legend` (optional) is a boolean that prints a legend on the image.\n\n#### [rgb] flowX.xy2rgb(flow_x, flow_y, [max])\nGiven `flow_x` and `flow_y` of size `MxN` each, return an image of size `3xMxN` for visualizing optical flow. `max` (optional) specifies the maximum flow magnitude.\n\n#### [flow] flowX.loadFLO(filename)\nReads a `.flo` file. Loads the `x` and `y` components of optical flow into a 2-channel `2xMxN` optical flow field. The first channel stores the `x` component and the second channel stores the `y` component.\n\n\u003ca name=\"writeFLO\"\u003e\u003c/a\u003e\n#### flowX.writeFLO(filename,F)\nWrites a `2xMxN` flow field `F`, containing the `x` and `y` components of the flow in its first and second channels respectively, to `filename`, a `.flo` file.\n\n#### [flow] flowX.loadPFM(filename)\nReads a `.pfm` file. Loads the `x` and `y` components of optical flow into a 2-channel `2xMxN` optical flow field. The first channel stores the `x` component and the second channel stores the `y` component.\n\n#### [flow_rotated] flowX.rotate(flow, angle)\nRotates `flow` of size `2xMxN` by `angle` in radians. Uses nearest-neighbor interpolation to avoid blurring at boundaries.\n\n#### [flow_scaled] flowX.scale(flow, sc, [opt])\nScales `flow` of size `2xMxN` by a factor of `sc`. 
`opt` (optional) specifies the interpolation method: `simple` (default), `bilinear`, or `bicubic`.\n\n#### [flowBatch_scaled] flowX.scaleBatch(flowBatch, sc)\nScales `flowBatch` of size `Bx2xMxN`, a batch of `B` flow fields, by a factor of `sc`. Uses nearest-neighbor interpolation.\n\n\u003ca name=\"timing\"\u003e\u003c/a\u003e\n## Timing Benchmarks\nOur timing benchmark is set up on the Flying Chairs dataset. To test it, you need to download the dataset:\n```bash\nwget http://lmb.informatik.uni-freiburg.de/resources/datasets/FlyingChairs/FlyingChairs.zip\n```\nRun the timing benchmark\n```bash\nth timing_benchmark.lua -data YOUR_FLYING_CHAIRS_DATA_DIRECTORY\n```\n\n\u003ca name=\"references\"\u003e\u003c/a\u003e\n## References\n1. Our warping code is based on [qassemoquab/stnbhwd.](https://github.com/qassemoquab/stnbhwd)\n2. The images in `samples` are from the Flying Chairs dataset: \n   Dosovitskiy, Alexey, et al. \"Flownet: Learning optical flow with convolutional networks.\" 2015 IEEE International Conference on Computer Vision (ICCV). IEEE, 2015.\n3. Some parts of `flowExtensions.lua` are adapted from [marcoscoffier/optical-flow](https://github.com/marcoscoffier/optical-flow/blob/master/init.lua) with help from [fguney](https://github.com/fguney).\n4. The unofficial PyTorch implementation is from [sniklaus](https://github.com/sniklaus).\n   \n## License\nFree for non-commercial and scientific research purposes. For commercial use, please contact ps-license@tue.mpg.de. Check the LICENSE file for details.\n\n## When using this code, please cite\nRanjan, Anurag, and Michael J. Black. \"Optical Flow Estimation using a Spatial Pyramid Network.\" arXiv preprint arXiv:1611.00850 (2016). \n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fanuragranj%2Fspynet","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fanuragranj%2Fspynet","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fanuragranj%2Fspynet/lists"}