{"id":15039234,"url":"https://github.com/sunset1995/directvoxgo","last_synced_at":"2025-04-12T19:50:47.801Z","repository":{"id":37207637,"uuid":"416591486","full_name":"sunset1995/DirectVoxGO","owner":"sunset1995","description":"Direct voxel grid optimization for fast radiance field reconstruction.","archived":false,"fork":false,"pushed_at":"2023-05-15T23:01:32.000Z","size":4966,"stargazers_count":1063,"open_issues_count":47,"forks_count":110,"subscribers_count":23,"default_branch":"main","last_synced_at":"2025-04-11T00:34:08.350Z","etag":null,"topics":["cvpr2022","directvoxgo","dvgo","nerf","neural-radiance-fields"],"latest_commit_sha":null,"homepage":"https://sunset1995.github.io/dvgo","language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"other","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/sunset1995.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2021-10-13T04:43:11.000Z","updated_at":"2025-04-05T16:04:13.000Z","dependencies_parsed_at":"2024-10-01T01:41:24.944Z","dependency_job_id":null,"html_url":"https://github.com/sunset1995/DirectVoxGO","commit_stats":null,"previous_names":[],"tags_count":0,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/sunset1995%2FDirectVoxGO","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/sunset1995%2FDirectVoxGO/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/sunset1995%2FDirectVoxGO/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/sunset1995%2FDirectVoxGO/manifests","owner_u
rl":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/sunset1995","download_url":"https://codeload.github.com/sunset1995/DirectVoxGO/tar.gz/refs/heads/main","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":248625501,"owners_count":21135513,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["cvpr2022","directvoxgo","dvgo","nerf","neural-radiance-fields"],"created_at":"2024-09-24T20:42:03.333Z","updated_at":"2025-04-12T19:50:47.764Z","avatar_url":"https://github.com/sunset1995.png","language":"Python","funding_links":[],"categories":[],"sub_categories":[],"readme":"# DirectVoxGO\n\nDirect Voxel Grid Optimization (CVPR2022 Oral, [project page](https://sunset1995.github.io/dvgo/), [DVGO paper](https://arxiv.org/abs/2111.11215), [DVGO v2 paper](https://arxiv.org/abs/2206.05085)).\n\nhttps://user-images.githubusercontent.com/2712505/153380311-19d6c3a1-9130-489a-af16-ad36c78f10a9.mp4\n\nhttps://user-images.githubusercontent.com/2712505/153380197-991d1689-6418-499c-a192-d757f9a64b64.mp4\n\n### Custom casual capturing\nA [short guide](https://sunset1995.github.io/dvgo/tutor_forward_facing.html) to capture custom forward-facing scenes and rendering fly-through videos.\n\nBelow are two rgb and depth fly-through videos from custom captured scenes.\n\nhttps://user-images.githubusercontent.com/2712505/174267754-619d4f81-dd04-4c50-ba7f-434774cb890e.mp4\n\n### Features\n- Speedup NeRF by replacing the MLP with the voxel grid.\n- Simple scene representation:\n    - *Volume densities*: dense voxel grid (3D).\n    - *View-dependent colors*: dense 
feature grid (4D) + shallow MLP.\n- PyTorch CUDA extension built just-in-time for another 2--3x speedup.\n- O(N) realization for the distortion loss proposed by [mip-nerf 360](https://jonbarron.info/mipnerf360/).\n    - The loss improves our training time and quality.\n    - We have released a self-contained PyTorch package: [torch_efficient_distloss](https://github.com/sunset1995/torch_efficient_distloss).\n    - Consider a batch of 8192 rays X 256 points.\n        - GPU memory consumption: 6192MB =\u003e 96MB.\n        - Run times for 100 iters: 20 sec =\u003e 0.2 sec.\n- Supported datasets:\n    - *Bounded inward-facing*: [NeRF](https://drive.google.com/drive/folders/128yBriW1IG_3NJ5Rp7APSTZsJqdJdfc1), [NSVF](https://dl.fbaipublicfiles.com/nsvf/dataset/Synthetic_NSVF.zip), [BlendedMVS](https://dl.fbaipublicfiles.com/nsvf/dataset/BlendedMVS.zip), [T\u0026T (masked)](https://dl.fbaipublicfiles.com/nsvf/dataset/TanksAndTemple.zip), [DeepVoxels](https://drive.google.com/open?id=1ScsRlnzy9Bd_n-xw83SP-0t548v63mPH).\n    - *Unbounded inward-facing*: [T\u0026T](https://drive.google.com/file/d/11KRfN91W1AxAW6lOFs4EeYDbeoQZCi87/view?usp=sharing), [LF](https://drive.google.com/file/d/1gsjDjkbTh4GAR9fFqlIDZ__qR9NYTURQ/view?usp=sharing), [mip-NeRF360](https://jonbarron.info/mipnerf360/).\n    - *Forward-facing*: [LLFF](https://drive.google.com/drive/folders/14boI-o5hGO9srnWaaogTU5_ji7wkX2S7).\n\n\n### Installation\n```\ngit clone git@github.com:sunset1995/DirectVoxGO.git\ncd DirectVoxGO\npip install -r requirements.txt\n```\n[PyTorch](https://pytorch.org/) and [torch_scatter](https://github.com/rusty1s/pytorch_scatter) installation is machine dependent; please install the correct version for your machine.\n\n\u003cdetails\u003e\n  \u003csummary\u003e Dependencies (click to expand) \u003c/summary\u003e\n\n  - `PyTorch`, `numpy`, `torch_scatter`: main computation.\n  - `scipy`, `lpips`: SSIM and LPIPS evaluation.\n  - `tqdm`: progress bar.\n  - `mmcv`: config system.\n  - 
`opencv-python`: image processing.\n  - `imageio`, `imageio-ffmpeg`: images and videos I/O.\n  - `Ninja`: to build the newly implemented torch extension just-in-time.\n  - `einops`: torch tensor shaping with a pretty API.\n  - `torch_efficient_distloss`: O(N) realization for the distortion loss.\n\u003c/details\u003e\n\n\n## Directory structure for the datasets\n\n\u003cdetails\u003e\n  \u003csummary\u003e (click to expand) \u003c/summary\u003e\n\n    data\n    ├── nerf_synthetic     # Link: https://drive.google.com/drive/folders/128yBriW1IG_3NJ5Rp7APSTZsJqdJdfc1\n    │   └── [chair|drums|ficus|hotdog|lego|materials|mic|ship]\n    │       ├── [train|val|test]\n    │       │   └── r_*.png\n    │       └── transforms_[train|val|test].json\n    │\n    ├── Synthetic_NSVF     # Link: https://dl.fbaipublicfiles.com/nsvf/dataset/Synthetic_NSVF.zip\n    │   └── [Bike|Lifestyle|Palace|Robot|Spaceship|Steamtrain|Toad|Wineholder]\n    │       ├── intrinsics.txt\n    │       ├── rgb\n    │       │   └── [0_train|1_val|2_test]_*.png\n    │       └── pose\n    │           └── [0_train|1_val|2_test]_*.txt\n    │\n    ├── BlendedMVS         # Link: https://dl.fbaipublicfiles.com/nsvf/dataset/BlendedMVS.zip\n    │   └── [Character|Fountain|Jade|Statues]\n    │       ├── intrinsics.txt\n    │       ├── rgb\n    │       │   └── [0|1|2]_*.png\n    │       └── pose\n    │           └── [0|1|2]_*.txt\n    │\n    ├── TanksAndTemple     # Link: https://dl.fbaipublicfiles.com/nsvf/dataset/TanksAndTemple.zip\n    │   └── [Barn|Caterpillar|Family|Ignatius|Truck]\n    │       ├── intrinsics.txt\n    │       ├── rgb\n    │       │   └── [0|1|2]_*.png\n    │       └── pose\n    │           └── [0|1|2]_*.txt\n    │\n    ├── deepvoxels         # Link: https://drive.google.com/drive/folders/1ScsRlnzy9Bd_n-xw83SP-0t548v63mPH\n    │   └── [train|validation|test]\n    │       └── [armchair|cube|greek|vase]\n    │           ├── intrinsics.txt\n    │           ├── rgb/*.png\n    │           └── 
pose/*.txt\n    │\n    ├── nerf_llff_data     # Link: https://drive.google.com/drive/folders/128yBriW1IG_3NJ5Rp7APSTZsJqdJdfc1\n    │   └── [fern|flower|fortress|horns|leaves|orchids|room|trex]\n    │\n    ├── tanks_and_temples  # Link: https://drive.google.com/file/d/11KRfN91W1AxAW6lOFs4EeYDbeoQZCi87/view?usp=sharing\n    │   └── [tat_intermediate_M60|tat_intermediate_Playground|tat_intermediate_Train|tat_training_Truck]\n    │       └── [train|test]\n    │           ├── intrinsics/*txt\n    │           ├── pose/*txt\n    │           └── rgb/*jpg\n    │\n    ├── lf_data            # Link: https://drive.google.com/file/d/1gsjDjkbTh4GAR9fFqlIDZ__qR9NYTURQ/view?usp=sharing\n    │   └── [africa|basket|ship|statue|torch]\n    │       └── [train|test]\n    │           ├── intrinsics/*txt\n    │           ├── pose/*txt\n    │           └── rgb/*jpg\n    │\n    ├── 360_v2             # Link: https://jonbarron.info/mipnerf360/\n    │   └── [bicycle|bonsai|counter|garden|kitchen|room|stump]\n    │       ├── poses_bounds.npy\n    │       └── [images_2|images_4]\n    │\n    ├── nerf_llff_data     # Link: https://drive.google.com/drive/folders/14boI-o5hGO9srnWaaogTU5_ji7wkX2S7\n    │   └── [fern|flower|fortress|horns|leaves|orchids|room|trex]\n    │       ├── poses_bounds.npy\n    │       └── [images_2|images_4]\n    │\n    └── co3d               # Link: https://github.com/facebookresearch/co3d\n        └── [donut|teddybear|umbrella|...]\n            ├── frame_annotations.jgz\n            ├── set_lists.json\n            └── [129_14950_29917|189_20376_35616|...]\n                ├── images\n                │   └── frame*.jpg\n                └── masks\n                    └── frame*.png\n\u003c/details\u003e\n\n\n\n## GO\n\n- Training\n    ```bash\n    $ python run.py --config configs/nerf/lego.py --render_test\n    ```\n    Use `--i_print` and `--i_weights` to change the log interval.\n- Evaluation\n    To only evaluate the testset `PSNR`, `SSIM`, and `LPIPS` of the trained 
`lego` without re-training, run:\n    ```bash\n    $ python run.py --config configs/nerf/lego.py --render_only --render_test \\\n                                                  --eval_ssim --eval_lpips_vgg\n    ```\n    Use `--eval_lpips_alex` to evaluate LPIPS with the pre-trained AlexNet instead of VGG.\n- Render video\n    ```bash\n    $ python run.py --config configs/nerf/lego.py --render_only --render_video\n    ```\n    Use `--render_video_factor 4` for a fast preview.\n- Reproduction: all config files to reproduce our results.\n    \u003cdetails\u003e\n        \u003csummary\u003e (click to expand) \u003c/summary\u003e\n\n        $ ls configs/*\n        configs/blendedmvs:\n        Character.py  Fountain.py  Jade.py  Statues.py\n\n        configs/nerf:\n        chair.py  drums.py  ficus.py  hotdog.py  lego.py  materials.py  mic.py  ship.py\n\n        configs/nsvf:\n        Bike.py  Lifestyle.py  Palace.py  Robot.py  Spaceship.py  Steamtrain.py  Toad.py  Wineholder.py\n\n        configs/tankstemple:\n        Barn.py  Caterpillar.py  Family.py  Ignatius.py  Truck.py\n\n        configs/deepvoxels:\n        armchair.py  cube.py  greek.py  vase.py\n\n        configs/tankstemple_unbounded:\n        M60.py  Playground.py  Train.py  Truck.py\n\n        configs/lf:\n        africa.py  basket.py  ship.py  statue.py  torch.py\n\n        configs/nerf_unbounded:\n        bicycle.py  bonsai.py  counter.py  garden.py  kitchen.py  room.py  stump.py\n\n        configs/llff:\n        fern.py  flower.py  fortress.py  horns.py  leaves.py  orchids.py  room.py  trex.py\n    \u003c/details\u003e\n\n### Custom casually captured scenes\nComing soon hopefully.\n\n### Development and tuning guide\n#### Extension to new dataset\nAdjusting the data-related config fields to fit your camera coordinate system is recommended before implementing a new one.\nWe provide two visualization tools for debugging.\n1. 
Inspect the camera and the allocated BBox.\n    - Export via `--export_bbox_and_cams_only {filename}.npz`:\n      ```bash\n      python run.py --config configs/nerf/mic.py --export_bbox_and_cams_only cam_mic.npz\n      ```\n    - Visualize the result:\n      ```bash\n      python tools/vis_train.py cam_mic.npz\n      ```\n2. Inspect the learned geometry after coarse optimization.\n    - Export via `--export_coarse_only {filename}.npz` (assumes `coarse_last.tar` is available in the train log):\n      ```bash\n      python run.py --config configs/nerf/mic.py --export_coarse_only coarse_mic.npz\n      ```\n    - Visualize the result:\n      ```bash\n      python tools/vis_volume.py coarse_mic.npz 0.001 --cam cam_mic.npz\n      ```\n\n| Inspecting the cameras \u0026 BBox | Inspecting the learned coarse volume |\n|:-:|:-:|\n|![](figs/debug_cam_and_bbox.png)|![](figs/debug_coarse_volume.png)|\n\n\n\n#### Speed and quality tradeoff\nWe have reported some ablation experiments in our paper's supplementary material.\nSetting `N_iters`, `N_rand`, `num_voxels`, `rgbnet_depth`, `rgbnet_width` to larger values or setting `stepsize` to smaller values typically leads to better quality but needs more computation.\nThe `weight_distortion` affects the training speed and quality as well.\nOnly `stepsize` is tunable in the testing phase, while all the other fields should remain the same as in training.\n\n## Advanced data structure\n- **Octree** — [Plenoxels: Radiance Fields without Neural Networks](https://alexyu.net/plenoxels/).\n- **Hash** — [Instant Neural Graphics Primitives with a Multiresolution Hash Encoding](https://nvlabs.github.io/instant-ngp/).\n- **Factorized components** — [TensoRF: Tensorial Radiance Fields](https://apchenstu.github.io/TensoRF/).\n\nYou will need them for scaling to a higher grid resolution. 
But we believe our simplest dense grid could still be a good starting point if you have other challenging problems to deal with.\n\n## Acknowledgement\nThe code base originated from the awesome [nerf-pytorch](https://github.com/yenchenlin/nerf-pytorch) implementation, but it has since become very different from that code base.\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fsunset1995%2Fdirectvoxgo","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fsunset1995%2Fdirectvoxgo","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fsunset1995%2Fdirectvoxgo/lists"}