{"id":13778652,"url":"https://github.com/facebookresearch/NSVF","last_synced_at":"2025-05-11T12:31:20.178Z","repository":{"id":40957396,"uuid":"203693380","full_name":"facebookresearch/NSVF","owner":"facebookresearch","description":"Open source code for the paper of Neural Sparse Voxel Fields.","archived":true,"fork":false,"pushed_at":"2023-07-06T22:01:38.000Z","size":7896,"stargazers_count":796,"open_issues_count":31,"forks_count":92,"subscribers_count":58,"default_branch":"main","last_synced_at":"2024-08-03T18:13:10.190Z","etag":null,"topics":[],"latest_commit_sha":null,"homepage":"","language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/facebookresearch.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":"CONTRIBUTING.md","funding":null,"license":"LICENSE","code_of_conduct":"CODE_OF_CONDUCT.md","threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null}},"created_at":"2019-08-22T01:45:06.000Z","updated_at":"2024-07-30T11:55:09.000Z","dependencies_parsed_at":"2024-01-16T08:12:13.134Z","dependency_job_id":"aa05615f-f5b1-4e59-bc92-1c9e14c7f8ef","html_url":"https://github.com/facebookresearch/NSVF","commit_stats":null,"previous_names":[],"tags_count":0,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/facebookresearch%2FNSVF","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/facebookresearch%2FNSVF/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/facebookresearch%2FNSVF/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/facebookresearch%2FNSVF/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/facebookresearch","download_u
rl":"https://codeload.github.com/facebookresearch/NSVF/tar.gz/refs/heads/main","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":225048975,"owners_count":17412902,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":[],"created_at":"2024-08-03T18:00:55.721Z","updated_at":"2024-11-17T14:30:45.083Z","avatar_url":"https://github.com/facebookresearch.png","language":"Python","funding_links":[],"categories":["Hybrid implicit / explicit (condition implicit on local features)","Python"],"sub_categories":["For dynamic scenes"],"readme":"# Neural Sparse Voxel Fields (NSVF)\n\n### [Project Page](https://lingjie0206.github.io/papers/NSVF/) | [Video](https://www.youtube.com/watch?v=RFqPwH7QFEI) | [Paper](https://arxiv.org/abs/2007.11571) | [Data](#dataset)\n\n\u003cimg src='docs/figs/framework.png'/\u003e\n\nPhoto-realistic free-viewpoint rendering of real-world scenes using classical computer graphics techniques is a challenging problem because it requires the difficult step of capturing detailed appearance and geometry models.\nNeural rendering is an emerging field that employs deep neural networks to implicitly learn scene representations encapsulating both geometry and appearance from 2D observations with or without a coarse geometry.\nHowever, existing approaches in this field often show blurry renderings or suffer from slow rendering process. 
We propose [Neural Sparse Voxel Fields (NSVF)](https://arxiv.org/abs/2007.11571), a new neural scene representation for fast and high-quality free-viewpoint rendering.\n\nHere is the official repo for the paper:\n\n* [Neural Sparse Voxel Fields (Liu et al., 2020, \u003cspan style=\"color:red\"\u003eNeurIPS 2020 Spotlight\u003c/span\u003e)](https://arxiv.org/abs/2007.11571).\n\nWe also provide our unofficial implementation for:\n* [NeRF: Representing Scenes as Neural Radiance Fields for View Synthesis (Mildenhall et al., 2020)](https://arxiv.org/pdf/2003.08934.pdf).\n\n\n## Table of contents\n-----\n  * [Installation](#requirements-and-installation)\n  * [Dataset](#dataset)\n  * [Usage](#train-a-new-model)\n    + [Training](#train-a-new-model)\n    + [Evaluation](#evaluation)\n    + [Free-view Rendering](#free-viewpoint-rendering)\n    + [Extracting Geometry](#extract-the-geometry)\n  * [License](#license)\n  * [Citation](#citation)\n------\n\n## Requirements and Installation\n\nThis code is implemented in PyTorch using the [fairseq framework](https://github.com/pytorch/fairseq).\n\nThe code has been tested on the following system:\n\n* Python 3.7\n* PyTorch 1.4.0\n* [Nvidia apex library](https://github.com/NVIDIA/apex) (optional)\n* Nvidia GPU (Tesla V100 32GB) CUDA 10.1\n\nTraining and rendering are supported only on GPUs.\n\nTo install, first clone this repo and install all dependencies:\n\n```bash\npip install -r requirements.txt\n```\n\nThen run:\n\n```bash\npip install --editable ./\n```\n\nOr if you want to install the code locally, run:\n\n```bash\npython setup.py build_ext --inplace\n```\n\n## Dataset\n\nYou can download the pre-processed synthetic and real datasets used in our paper.\nPlease also cite the original papers if you use any of them in your work.\n\nDataset | Download Link | Notes on Dataset Split\n---|---|---\nSynthetic-NSVF | [download (.zip)](https://dl.fbaipublicfiles.com/nsvf/dataset/Synthetic_NSVF.zip) | 0_\\* (training) 1_\\* (validation) 
2_\\* (testing)\n[Synthetic-NeRF](https://github.com/bmild/nerf) | [download (.zip)](https://dl.fbaipublicfiles.com/nsvf/dataset/Synthetic_NeRF.zip) | 0_\\* (training) 1_\\* (validation) 2_\\* (testing)\n[BlendedMVS](https://github.com/YoYo000/BlendedMVS)  | [download (.zip)](https://dl.fbaipublicfiles.com/nsvf/dataset/BlendedMVS.zip) | 0_\\* (training) 1_\\* (testing)\n[Tanks\u0026Temples](https://www.tanksandtemples.org/) | [download (.zip)](https://dl.fbaipublicfiles.com/nsvf/dataset/TanksAndTemple.zip) | 0_\\* (training) 1_\\* (testing)\n\n### Prepare your own dataset\n\nTo prepare a new dataset of a single scene for training and testing, please follow the data structure:\n\n```bash\n\u003cdataset_name\u003e\n|-- bbox.txt         # bounding-box file\n|-- intrinsics.txt   # 4x4 camera intrinsics\n|-- rgb\n    |-- 0.png        # target image for each view\n    |-- 1.png\n    ...\n|-- pose\n    |-- 0.txt        # camera pose for each view (4x4 matrices)\n    |-- 1.txt\n    ...\n[optional]\n|-- test_traj.txt    # camera pose for free-view rendering demonstration (4N x 4)\n```\n\nwhere the ``bbox.txt`` file contains a line describing the initial bounding box and voxel size:\n\n```bash\nx_min y_min z_min x_max y_max z_max initial_voxel_size\n```\n\nNote that the file names of target images and those of the corresponding camera pose files are not required to be exactly the same. However, the orders of these two kinds of files (sorted by string) must match.  The datasets are split with view indices.\nFor example, \"``train (0..100)``, ``valid (100..200)`` and ``test (200..400)``\" mean the first 100 views for training, 100-199th views for validation, and 200-399th views for testing.\n\n## Train a new model\n\nGiven the dataset of a single scene (``{DATASET}``), we use the following command for training an NSVF model to synthesize novel views at ``800x800`` pixels, with a batch size of ``4`` images per GPU and ``2048`` rays per image. 
By default, the code will automatically detect all available GPUs.\n\nIn the following example, we use a pre-defined architecture ``nsvf_base`` with specific arguments:\n\n* By setting ``--no-sampling-at-reader``, the model only samples pixels in the projected image region of sparse voxels for training.\n* By default, we set the ray-marching step size to ``1/8`` (``0.125``) of the voxel size, which is typically given in the ``bbox.txt`` file.\n* Turning on ``--use-octree`` is optional. It builds a sparse voxel octree to speed up ray-voxel intersection, especially when the number of voxels is larger than ``10000``.\n* By setting ``--pruning-every-steps`` to ``2500``, the model performs self-pruning every ``2500`` steps.\n* By setting ``--half-voxel-size-at`` and ``--reduce-step-size-at`` to ``5000,25000,75000``, the voxel size and step size are halved at ``5k``, ``25k`` and ``75k`` steps, respectively.\n\nNote that, although the above parameter settings are used for most of the experiments in the paper, it is possible to tune these parameters to achieve better quality. 
Besides the above parameters, other parameters can also use default settings.\n\nBesides the architecture ``nsvf_base``, you may check other architectures or define your own architectures in the file ``fairnr/models/nsvf.py``.\n\n```bash\npython -u train.py ${DATASET} \\\n    --user-dir fairnr \\\n    --task single_object_rendering \\\n    --train-views \"0..100\" --view-resolution \"800x800\" \\\n    --max-sentences 1 --view-per-batch 4 --pixel-per-view 2048 \\\n    --no-preload \\\n    --sampling-on-mask 1.0 --no-sampling-at-reader \\\n    --valid-views \"100..200\" --valid-view-resolution \"400x400\" \\\n    --valid-view-per-batch 1 \\\n    --transparent-background \"1.0,1.0,1.0\" --background-stop-gradient \\\n    --arch nsvf_base \\\n    --initial-boundingbox ${DATASET}/bbox.txt \\\n    --use-octree \\\n    --raymarching-stepsize-ratio 0.125 \\\n    --discrete-regularization \\\n    --color-weight 128.0 --alpha-weight 1.0 \\\n    --optimizer \"adam\" --adam-betas \"(0.9, 0.999)\" \\\n    --lr 0.001 --lr-scheduler \"polynomial_decay\" --total-num-update 150000 \\\n    --criterion \"srn_loss\" --clip-norm 0.0 \\\n    --num-workers 0 \\\n    --seed 2 \\\n    --save-interval-updates 500 --max-update 150000 \\\n    --virtual-epoch-steps 5000 --save-interval 1 \\\n    --half-voxel-size-at  \"5000,25000,75000\" \\\n    --reduce-step-size-at \"5000,25000,75000\" \\\n    --pruning-every-steps 2500 \\\n    --keep-interval-updates 5 --keep-last-epochs 5 \\\n    --log-format simple --log-interval 1 \\\n    --save-dir ${SAVE} \\\n    --tensorboard-logdir ${SAVE}/tensorboard \\\n    | tee -a $SAVE/train.log\n```\n\nThe checkpoints are saved in ``{SAVE}``. 
You can launch TensorBoard to check training progress:\n\n```bash\ntensorboard --logdir=${SAVE}/tensorboard --port=10000\n```\n\nThere are more examples of training scripts to reproduce the results of our paper under [examples](./examples/train/).\n\n## Evaluation\n\nOnce the model is trained, the following command evaluates rendering quality on the test views, given ``{MODEL_PATH}``.\n\n```bash\npython validate.py ${DATASET} \\\n    --user-dir fairnr \\\n    --valid-views \"200..400\" \\\n    --valid-view-resolution \"800x800\" \\\n    --no-preload \\\n    --task single_object_rendering \\\n    --max-sentences 1 \\\n    --valid-view-per-batch 1 \\\n    --path ${MODEL_PATH} \\\n    --model-overrides '{\"chunk_size\":512,\"raymarching_tolerance\":0.01,\"tensorboard_logdir\":\"\",\"eval_lpips\":True}'\n```\n\nNote that we override ``raymarching_tolerance`` to ``0.01`` to enable early termination, which speeds up rendering.\n\n## Free Viewpoint Rendering\n\nFree-viewpoint rendering can be achieved once a model is trained and a rendering trajectory is specified. For example, the following command renders along a circle trajectory (angular speed of 3 degrees per frame, 15 frames per GPU). 
This outputs per-view rendered images and merges them into a ``.mp4`` video in ``${SAVE}/output`` as follows:\n\n\u003cimg src='docs/figs/results.gif'/\u003e\n\nBy default, the code can detect all available GPUs.\n\n```bash\npython render.py ${DATASET} \\\n    --user-dir fairnr \\\n    --task single_object_rendering \\\n    --path ${MODEL_PATH} \\\n    --model-overrides '{\"chunk_size\":512,\"raymarching_tolerance\":0.01}' \\\n    --render-beam 1 --render-angular-speed 3 --render-num-frames 15 \\\n    --render-save-fps 24 \\\n    --render-resolution \"800x800\" \\\n    --render-path-style \"circle\" \\\n    --render-path-args \"{'radius': 3, 'h': 2, 'axis': 'z', 't0': -2, 'r':-1}\" \\\n    --render-output ${SAVE}/output \\\n    --render-output-types \"color\" \"depth\" \"voxel\" \"normal\" --render-combine-output \\\n    --log-format \"simple\"\n```\n\nOur code also supports rendering for given camera poses.\nFor instance, the following command renders with the camera poses defined in the 200-399th files under the folder ``${DATASET}/pose``:\n\n```bash\npython render.py ${DATASET} \\\n    --user-dir fairnr \\\n    --task single_object_rendering \\\n    --path ${MODEL_PATH} \\\n    --model-overrides '{\"chunk_size\":512,\"raymarching_tolerance\":0.01}' \\\n    --render-save-fps 24 \\\n    --render-resolution \"800x800\" \\\n    --render-camera-poses ${DATASET}/pose \\\n    --render-views \"200..400\" \\\n    --render-output ${SAVE}/output \\\n    --render-output-types \"color\" \"depth\" \"voxel\" \"normal\" --render-combine-output \\\n    --log-format \"simple\"\n```\n\nThe code also supports rendering with camera poses defined in a ``.txt`` file. Please refer to this [example](./examples/render/render_jade.sh).\n\n## Extract the Geometry\n\nWe also support running marching cubes to extract the iso-surfaces from a trained NSVF model as triangle meshes and saving them as ``{SAVE}/{NAME}.ply``. 
\n```bash\npython extract.py \\\n    --user-dir fairnr \\\n    --path ${MODEL_PATH} \\\n    --output ${SAVE} \\\n    --name ${NAME} \\\n    --format 'mc_mesh' \\\n    --mc-threshold 0.5 \\\n    --mc-num-samples-per-halfvoxel 5\n```\nIt is also possible to export the learned sparse voxels by setting ``--format 'voxel_mesh'``.\nThe output ``.ply`` file can be opened with any 3D viewer, such as [MeshLab](https://www.meshlab.net/). \n\n\u003cimg src='docs/figs/snapshot_meshlab.png'/\u003e\n\n## License\n\nNSVF is MIT-licensed.\nThe license applies to the pre-trained models as well.\n\n## Citation\n\nPlease cite as \n```bibtex\n@article{liu2020neural,\n  title={Neural Sparse Voxel Fields},\n  author={Liu, Lingjie and Gu, Jiatao and Lin, Kyaw Zaw and Chua, Tat-Seng and Theobalt, Christian},\n  journal={NeurIPS},\n  year={2020}\n}\n```\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Ffacebookresearch%2FNSVF","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Ffacebookresearch%2FNSVF","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Ffacebookresearch%2FNSVF/lists"}