{"id":15028229,"url":"https://github.com/fyusion/llff","last_synced_at":"2025-05-15T21:05:28.687Z","repository":{"id":37627092,"uuid":"184649653","full_name":"Fyusion/LLFF","owner":"Fyusion","description":"Code release for Local Light Field Fusion at SIGGRAPH 2019","archived":false,"fork":false,"pushed_at":"2023-06-19T14:26:41.000Z","size":32716,"stargazers_count":1597,"open_issues_count":48,"forks_count":250,"subscribers_count":45,"default_branch":"master","last_synced_at":"2025-04-08T04:14:37.544Z","etag":null,"topics":["deep-learning","light-field","rendering","view-synthesis"],"latest_commit_sha":null,"homepage":"https://fyusion.com/LLFF","language":"C++","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"gpl-3.0","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/Fyusion.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null}},"created_at":"2019-05-02T20:48:57.000Z","updated_at":"2025-04-04T07:17:54.000Z","dependencies_parsed_at":"2022-07-27T16:02:57.626Z","dependency_job_id":"c8643e42-8563-472a-846b-d94c70a0b720","html_url":"https://github.com/Fyusion/LLFF","commit_stats":null,"previous_names":[],"tags_count":0,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/Fyusion%2FLLFF","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/Fyusion%2FLLFF/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/Fyusion%2FLLFF/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/Fyusion%2FLLFF/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/Fyusion","download_url":"https://codeload.github.com/Fyusion/LLFF/tar.gz/refs/heads/master","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":254422761,"owners_count":22068678,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["deep-learning","light-field","rendering","view-synthesis"],"created_at":"2024-09-24T20:07:51.614Z","updated_at":"2025-05-15T21:05:28.664Z","avatar_url":"https://github.com/Fyusion.png","language":"C++","readme":"\u003cimg src='imgs/output6_120.gif' align=\"right\" height=\"120px\"\u003e\n\n\u003cbr\u003e\u003cbr\u003e\u003cbr\u003e\u003cbr\u003e\n\n# Local Light Field Fusion\n### [Project](https://bmild.github.io/llff) | [Video](https://youtu.be/LY6MgDUzS3M) | [Paper](https://arxiv.org/abs/1905.00889) \n\nTensorflow implementation for novel view synthesis from sparse input images.\u003cbr\u003e\u003cbr\u003e\n[Local Light Field Fusion: Practical View Synthesis \nwith Prescriptive Sampling Guidelines](https://bmild.github.io/llff)  \n [Ben Mildenhall](https://people.eecs.berkeley.edu/~bmild/)\\*\u003csup\u003e1\u003c/sup\u003e, \n [Pratul Srinivasan](https://people.eecs.berkeley.edu/~pratul/)\\*\u003csup\u003e1\u003c/sup\u003e, \n [Rodrigo 
## Table of Contents

  * [Installation TL;DR: Setup and render a demo scene](#installation-tldr-setup-and-render-a-demo-scene)
  * [Full Installation Details](#full-installation-details)
    * [Manual installation](#manual-installation)
    * [Docker installation](#docker-installation)
  * [Using your own input images for view synthesis](#using-your-own-input-images-for-view-synthesis)
    * [Quickstart: rendering a video from a zip file of your images](#quickstart-rendering-a-video-from-a-zip-file-of-your-images)
  * [General step-by-step usage](#general-step-by-step-usage)
    * [1. Recover camera poses](#1-recover-camera-poses)
    * [2. Generate MPIs](#2-generate-mpis)
    * [3. Render novel views](#3-render-novel-views)
  * [Using your own poses without running COLMAP](#using-your-own-poses-without-running-colmap)
  * [Troubleshooting](#troubleshooting)
  * [Citation](#citation)

## Installation TL;DR: Setup and render a demo scene

First install `docker` ([instructions](https://docs.docker.com/install/linux/docker-ce/ubuntu/)) and `nvidia-docker` ([instructions](https://github.com/NVIDIA/nvidia-docker)).

Run the following in the base directory to download a pretrained checkpoint and a Docker image, and to generate MPIs and a rendered output video for an example input dataset:
```
bash download_data.sh
sudo docker pull bmild/tf_colmap
sudo docker tag bmild/tf_colmap tf_colmap
sudo nvidia-docker run --rm --volume /:/host --workdir /host$PWD tf_colmap bash demo.sh
```
A video like this should be output to `data/testscene/outputs/test_vid.mp4`:  
<br/>
<img src='imgs/fern.gif'/>

If this works, then you are ready to start processing your own images! Run
```
sudo nvidia-docker run -it --rm --volume /:/host --workdir /host$PWD tf_colmap
```
to enter a shell inside the Docker container, and [skip ahead](#using-your-own-input-images-for-view-synthesis) to the section on using your own input images for view synthesis.

## Full Installation Details

You can either install the prerequisites by hand or use our provided Dockerfile to build a Docker image.

In either case, start by downloading this repository, then running the `download_data.sh` script to download a pretrained model and an example input dataset:
```
bash download_data.sh
```
After installing dependencies, try running `bash demo.sh` from the base directory. (If using Docker, run this inside the container.) This should generate the video shown in the *Installation TL;DR* section at `data/testscene/outputs/test_vid.mp4`.

### Manual installation

- Install CUDA, Tensorflow, COLMAP, and ffmpeg (a quick sanity check of the Tensorflow setup is sketched below).
- Install the required Python packages:
```
pip install -r requirements.txt
```
- Optional: run `make` in the `cuda_renderer/` directory.
- Optional: run `make` in the `opengl_viewer/` directory. You may need to install GLFW or some other OpenGL libraries. For GLFW:
```
sudo apt-get install libglfw3-dev
```
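The repo does not ship an environment check, but before running the demo it can help to confirm that Tensorflow was installed with GPU support. A minimal sketch, assuming a Tensorflow 1.x environment (which this 2019 release appears to target); the check is not part of the codebase and the call is deprecated in Tensorflow 2.x:
```
# Hypothetical sanity check, not part of this repo: confirm that the
# Tensorflow install can see a GPU before running demo.sh.
import tensorflow as tf

print('Tensorflow version:', tf.__version__)
# tf.test.is_gpu_available() exists in Tensorflow 1.x (deprecated in 2.x);
# it returns True if at least one GPU device is registered.
print('GPU available:', tf.test.is_gpu_available())
```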
### Docker installation

To build the docker image on your own machine, which may take 15-30 mins:
```
sudo docker build -t tf_colmap:latest .
```
To download the image (~6GB) instead:
```
sudo docker pull bmild/tf_colmap
sudo docker tag bmild/tf_colmap tf_colmap
```

Afterwards, you can launch an interactive shell inside the container:
```
sudo nvidia-docker run -it --rm --volume /:/host --workdir /host$PWD tf_colmap
```
From this shell, all the code in the repo should work (except `opengl_viewer`).

To run any single command `<command...>` inside the docker container:
```
sudo nvidia-docker run --rm --volume /:/host --workdir /host$PWD tf_colmap <command...>
```


## Using your own input images for view synthesis

<img src='imgs/capture.gif'/>

Our method takes in a set of images of a static scene, promotes each image to a local layered representation (MPI), and blends local light fields rendered from these MPIs to render novel views. Please see our paper for more details.

As a rule of thumb, you should use images where the maximum disparity between views is no more than about 64 pixels (watch the closest thing to the camera and don't let it move more than ~1/8 the horizontal field of view between images). Our datasets usually consist of 20-30 images captured handheld in a rough grid pattern.

#### Quickstart: rendering a video from a zip file of your images

You can quickly render novel view frames and a .mp4 video from a zip file of your captured input images with the `zip2mpis.sh` bash script.
```
bash zip2mpis.sh <zipfile> <your_outdir> [--height HEIGHT]
```
`height` is the output height in pixels. We recommend using a height of 360 pixels for generating results quickly.

## General step-by-step usage

Begin by creating a base scene directory (e.g., `scenedir/`), and copying your images into a subdirectory called `images/` (e.g., `scenedir/images`).

#### 1. Recover camera poses

This script calls COLMAP to run structure from motion to get 6-DoF camera poses and near/far depth bounds for the scene.
```
python imgs2poses.py <your_scenedir>
```
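After this step, the scene directory should contain a `poses_bounds.npy` file; its layout is documented in [Using your own poses without running COLMAP](#using-your-own-poses-without-running-colmap) below. The repo's scripts consume it directly, but here is a minimal sketch of how you might sanity-check it with numpy before generating MPIs (the path `scenedir/` is just the example name used above):
```
# Hypothetical sanity check, not part of this repo: inspect the
# poses_bounds.npy written by imgs2poses.py (format described below).
import numpy as np

poses_bounds = np.load('scenedir/poses_bounds.npy')  # shape (N, 17)
print('number of posed images:', poses_bounds.shape[0])

for i, row in enumerate(poses_bounds):
    pose = row[:15].reshape(3, 5)   # 3x4 camera-to-world + [h, w, focal] column
    near, far = row[15], row[16]    # depth bounds for this view
    h, w, focal = pose[:, 4]
    print('view %d: %dx%d, focal %.1f, depth bounds [%.2f, %.2f]'
          % (i, w, h, focal, near, far))
```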
#### 2. Generate MPIs

This script uses our pretrained Tensorflow graph (make sure it exists in `checkpoints/papermodel`) to generate MPIs from the posed images. They will be saved in `<your_mpidir>`, a directory that will be created by the script.
```
python imgs2mpis.py <your_scenedir> <your_mpidir> \
    [--checkpoint CHECKPOINT] \
    [--factor FACTOR] [--width WIDTH] [--height HEIGHT] [--numplanes NUMPLANES] \
    [--disps] [--psvs]
```
You should set at most one of `factor`, `width`, or `height` to determine the output MPI resolution (`factor` scales the input image size down by an integer factor, e.g. 2, 4, or 8, while `height`/`width` scale the input images directly to the specified height or width). `numplanes` is 32 by default. `checkpoint` is set to the downloaded checkpoint by default.

Example usage:
```
python imgs2mpis.py scenedir scenedir/mpis --height 360
```

#### 3. Render novel views

You can either generate a list of novel view camera poses and render out a video, or you can load the saved MPIs in our interactive OpenGL viewer.

#### Generate poses for new view path
First, generate a smooth new view path by calling
```
python imgs2renderpath.py <your_scenedir> <your_posefile> \
    [--x_axis] [--y_axis] [--z_axis] [--circle] [--spiral]
```
`<your_posefile>` is the path of an output .txt file that will be created by the script and will contain camera poses for the rendered novel views. The five optional arguments specify the trajectory of the camera: the xyz-axis options are straight lines along each camera axis respectively, "circle" is a circle in the camera plane, and "spiral" is a circle combined with movement along the z-axis.

Example usage:
```
python imgs2renderpath.py scenedir scenedir/spiral_path.txt --spiral
```
See `llff/math/pose_math.py` for the code that generates these path trajectories.

#### Render video with CUDA
You can build this renderer in the `cuda_renderer/` directory by calling `make`.

It uses CUDA to render out a video. Specify the height of the output video in pixels (-1 for the same resolution as the MPIs), the factor for cropping the edges of the video (default is 1.0 for no cropping), and the compression quality (crf) for the saved MP4 file (default is 18; 0 is lossless, 12-28 is reasonable).
```
./cuda_renderer mpidir <your_posefile> <your_videofile> height crop crf
```
`<your_videofile>` is the path to the video file that will be written by FFMPEG.

Example usage:
```
./cuda_renderer scenedir/mpis scenedir/spiral_path.txt scenedir/spiral_render.mp4 -1 0.8 18
```


#### Render video with Tensorflow
Use Tensorflow to render out a video (~100x slower than the CUDA renderer). Optionally, specify how many MPIs are blended for each rendered output (default is 5) and what factor to crop the edges of the video by (default is 1.0 for no cropping).
```
python mpis2video.py <your_mpidir> <your_posefile> <your_videofile> [--use_N USE_N] [--crop_factor CROP_FACTOR]
```
Example usage:
```
python mpis2video.py scenedir/mpis scenedir/spiral_path.txt scenedir/spiral_render.mp4 --crop_factor 0.8
```


#### Interactive OpenGL viewer

Controls:
- ESC to quit
- Move mouse to translate in camera plane
- Click and drag to rotate camera
- Scroll to change focal length (zoom)
- 'L' to animate circle render path

<img src='imgs/viewer.gif'/>

*The OpenGL viewer cannot be used in the Docker container.*

You need OpenGL installed, particularly GLFW:
```
sudo apt-get install libglfw3-dev
```

You can build the viewer in the `opengl_viewer/` directory by calling `make`.

General usage (in the `opengl_viewer/` directory) is
```
./opengl_viewer mpidir
```

## Using your own poses without running COLMAP

Here we explain the `poses_bounds.npy` file format. This file stores a numpy array of size Nx17 (where N is the number of input images). You can see how it is loaded in the [three lines here](https://github.com/Fyusion/LLFF/blob/master/llff/poses/pose_utils.py#L195). Each row of length 17 gets reshaped into a 3x5 pose matrix and 2 depth values that bound the closest and farthest scene content from that point of view.

The pose matrix is a 3x4 camera-to-world affine transform concatenated with a 3x1 column `[image height, image width, focal length]` to represent the intrinsics (we assume the principal point is centered and that the focal length is the same for both x and y).

The right-handed coordinate system of the rotation (the first 3x3 block in the camera-to-world transform) is as follows: from the point of view of the camera, the three axes are `[down, right, backwards]`, which some people might consider to be `[-y, x, z]`, where the camera is looking along `-z`. (The more conventional frame `[x, y, z]` is `[right, up, backwards]`. The COLMAP frame is `[right, down, forwards]`, or `[x, -y, -z]`.)

If you have a set of 3x4 cam-to-world poses for your images plus focal lengths and close/far depth bounds, the steps to recreate `poses_bounds.npy` are as follows (a code sketch of steps 3-5 appears below):

1. Make sure your poses are in camera-to-world format, not world-to-camera.
2. Make sure your rotation matrices have the columns in the correct coordinate frame `[down, right, backwards]`.
3. Concatenate each pose with the `[height, width, focal]` intrinsics vector to get a 3x5 matrix.
4. Flatten each of those into 15 elements and concatenate the close and far depths.
5. Stack the 17-d vectors to get an Nx17 matrix and use `np.save` to store it as `poses_bounds.npy` in the scene's base directory (at the same level as the `images/` directory).

This should explain the [pose processing after COLMAP](https://github.com/Fyusion/LLFF/blob/master/llff/poses/pose_utils.py#L11).
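The repo leaves these five steps to the reader. As an illustration, here is a minimal sketch of steps 3-5, assuming you have already put your rotations into the `[down, right, backwards]` frame (steps 1-2); the function and variable names are hypothetical and not part of the codebase:
```
# Hypothetical helper, not part of this repo: pack camera-to-world poses,
# intrinsics, and depth bounds into the Nx17 poses_bounds.npy layout
# described above. Assumes rotations are already in LLFF's
# [down, right, backwards] frame.
import numpy as np

def save_poses_bounds(scenedir, c2w_mats, hwf_list, bounds):
    """c2w_mats: list of 3x4 camera-to-world matrices.
    hwf_list: list of [height, width, focal] per image.
    bounds: list of (near, far) depth bounds per image."""
    rows = []
    for c2w, hwf, (near, far) in zip(c2w_mats, hwf_list, bounds):
        m = np.concatenate([np.array(c2w), np.array(hwf).reshape(3, 1)], axis=1)  # 3x5
        rows.append(np.concatenate([m.flatten(), [near, far]]))  # 17 values per view
    np.save('%s/poses_bounds.npy' % scenedir, np.stack(rows))    # Nx17 array
```
For example, `save_poses_bounds('scenedir', c2w_mats, hwf_list, bounds)` would write `scenedir/poses_bounds.npy`, after which you can continue from step 2 (Generate MPIs) above without running COLMAP.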
## Troubleshooting

- __`PyramidCU::GenerateFeatureList: an illegal memory access was encountered`__:
Some machine configurations run into problems when running `imgs2poses.py`. Setting the environment variable `CUDA_VISIBLE_DEVICES` may resolve this. If the issue persists, try uncommenting [this line](https://github.com/Fyusion/LLFF/blob/master/llff/poses/colmap_wrapper.py#L33) to stop COLMAP from using the GPU to extract image features.
- __Black screen__:
In recent versions of macOS, OpenGL initializes a context with a black screen until the window is dragged or resized. If you run into this problem, drag the window to another position.
- __COLMAP fails__: If you see "Could not register, trying another image", you will probably have to change the COLMAP optimization parameters or capture more images of your scene. See [here](https://github.com/Fyusion/LLFF/issues/8#issuecomment-498514411).


## Citation

If you find this useful for your research, please cite the following paper.

```
@article{mildenhall2019llff,
  title={Local Light Field Fusion: Practical View Synthesis with Prescriptive Sampling Guidelines},
  author={Ben Mildenhall and Pratul P. Srinivasan and Rodrigo Ortiz-Cayon and Nima Khademi Kalantari and Ravi Ramamoorthi and Ren Ng and Abhishek Kar},
  journal={ACM Transactions on Graphics (TOG)},
  year={2019},
}
```