{"id":27864487,"url":"https://github.com/lakonik/ssdnerf","last_synced_at":"2025-05-04T21:06:10.233Z","repository":{"id":176507465,"uuid":"617363723","full_name":"Lakonik/SSDNeRF","owner":"Lakonik","description":"[ICCV 2023] Single-Stage Diffusion NeRF","archived":false,"fork":false,"pushed_at":"2024-04-20T16:23:47.000Z","size":7836,"stargazers_count":443,"open_issues_count":22,"forks_count":25,"subscribers_count":18,"default_branch":"main","last_synced_at":"2025-05-04T21:06:02.481Z","etag":null,"topics":["3d-reconstruction","diffusion-models","generative-model","iccv","nerf"],"latest_commit_sha":null,"homepage":"https://lakonik.github.io/ssdnerf/","language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/Lakonik.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2023-03-22T08:32:53.000Z","updated_at":"2025-03-04T05:50:46.000Z","dependencies_parsed_at":null,"dependency_job_id":"9c4c402c-8e6d-42ae-98f9-03dd26af26e0","html_url":"https://github.com/Lakonik/SSDNeRF","commit_stats":null,"previous_names":["lakonik/ssdnerf"],"tags_count":0,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/Lakonik%2FSSDNeRF","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/Lakonik%2FSSDNeRF/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/Lakonik%2FSSDNeRF/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/Lakonik%2FSSDNeRF/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/
owners/Lakonik","download_url":"https://codeload.github.com/Lakonik/SSDNeRF/tar.gz/refs/heads/main","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":252399542,"owners_count":21741672,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["3d-reconstruction","diffusion-models","generative-model","iccv","nerf"],"created_at":"2025-05-04T21:06:09.525Z","updated_at":"2025-05-04T21:06:10.223Z","avatar_url":"https://github.com/Lakonik.png","language":"Python","funding_links":[],"categories":[],"sub_categories":[],"readme":"📢 **NEWS:** We have released [MVEdit](https://github.com/Lakonik/MVEdit), an upgraded codebase based on SSDNeRF. 
MVEdit supports all SSDNeRF models and configs, and offers new features such as diffusers support and improved SSDNeRF GUI.\n\n# SSDNeRF\n\nOfficial PyTorch implementation of the ICCV 2023 paper:\n\n**Single-Stage Diffusion NeRF: A Unified Approach to 3D Generation and Reconstruction**\n\u003cbr\u003e\n[Hansheng Chen](https://lakonik.github.io/)\u003csup\u003e1,\u003c/sup\u003e\\*, [Jiatao Gu](https://jiataogu.me/)\u003csup\u003e2\u003c/sup\u003e, [Anpei Chen](https://apchenstu.github.io/)\u003csup\u003e3\u003c/sup\u003e, [Wei Tian](https://scholar.google.com/citations?user=aYKQn88AAAAJ\u0026hl=en)\u003csup\u003e1\u003c/sup\u003e, [Zhuowen Tu](https://pages.ucsd.edu/~ztu/)\u003csup\u003e4\u003c/sup\u003e, [Lingjie Liu](https://lingjie0206.github.io/)\u003csup\u003e5\u003c/sup\u003e, [Hao Su](https://cseweb.ucsd.edu/~haosu/)\u003csup\u003e4\u003c/sup\u003e\u003cbr\u003e\n\u003csup\u003e1\u003c/sup\u003eTongji University, \u003csup\u003e2\u003c/sup\u003eApple, \u003csup\u003e3\u003c/sup\u003eETH Zürich, \u003csup\u003e4\u003c/sup\u003eUCSD, \u003csup\u003e5\u003c/sup\u003eUniversity of Pennsylvania\n\u003cbr\u003e\n\\*Work done during a remote internship with UCSD.\n\n[[project page](https://lakonik.github.io/ssdnerf)] [[paper](https://arxiv.org/pdf/2304.06714.pdf)]\n\nPart of this codebase is based on [torch-ngp](https://github.com/ashawkey/torch-ngp) and [MMGeneration](https://github.com/open-mmlab/mmgeneration).\n\u003cbr\u003e\n\nhttps://github.com/Lakonik/SSDNeRF/assets/53893837/22e7ee6c-7576-44f2-b408-41089180e359\n\n## Highlights\n\n- Code to reproduce ALL the experiments in the paper and supplementary material (including single-view reconstruction on the real KITTI Cars dataset).\n\u003cbr\u003e\u003cimg src=\"ssdnerf_kitti.gif\" width=\"500\" alt=\"\"/\u003e\n- New features including support for tiled triplanes (rollout layout), FP16 diffusion sampling, and 16-bit caching.\n- A simple GUI demo (modified from 
[torch-ngp](https://github.com/ashawkey/torch-ngp)).\n\u003cbr\u003e\u003cimg src=\"ssdnerf_gui.png\" width=\"500\" alt=\"\"/\u003e\n\n## Installation\n\n### Prerequisites\n\nThe code has been tested in the following environment:\n\n- Linux (tested on Ubuntu 18.04/20.04 LTS)\n- Python 3.7\n- [CUDA Toolkit](https://developer.nvidia.com/cuda-toolkit-archive) 11\n- [PyTorch](https://pytorch.org/get-started/previous-versions/) 1.12.1\n- [MMCV](https://github.com/open-mmlab/mmcv) 1.6.0\n- [MMGeneration](https://github.com/open-mmlab/mmgeneration) 0.7.2\n\nThis codebase should also work on Windows (tested in inference mode).\n\nOther dependencies can be installed via `pip install -r requirements.txt`. \n\nAn example of commands for installing the Python packages is shown below:\n\n```bash\n# Export the PATH of CUDA toolkit\nexport PATH=/usr/local/cuda-11.3/bin:$PATH\nexport LD_LIBRARY_PATH=/usr/local/cuda-11.3/lib64:$LD_LIBRARY_PATH\n\n# Create conda environment\nconda create -y -n ssdnerf python=3.7\nconda activate ssdnerf\n\n# Install PyTorch (this script is for CUDA 11.3)\nconda install pytorch==1.12.1 torchvision==0.13.1 torchaudio==0.12.1 cudatoolkit=11.3 -c pytorch\n\n# Install MMCV and MMGeneration\npip install -U openmim\nmim install mmcv-full==1.6\ngit clone https://github.com/open-mmlab/mmgeneration \u0026\u0026 cd mmgeneration \u0026\u0026 git checkout v0.7.2\npip install -v -e .\ncd ..\n\n# Clone this repo and install other dependencies\ngit clone \u003cthis repo\u003e \u0026\u0026 cd \u003crepo folder\u003e\npip install -r requirements.txt\n```\n\n### Compile CUDA packages\n\nThere are two CUDA packages from [torch-ngp](https://github.com/ashawkey/torch-ngp) that need to be built locally.\n\n```bash\ncd lib/ops/raymarching/\npip install -e .\ncd ../shencoder/\npip install -e .\ncd ../../..\n```\n\n## Data preparation\n\nDownload `srn_cars.zip` and `srn_chairs.zip` from 
[here](https://drive.google.com/drive/folders/1PsT3uKwqHHD2bEEHkIXB99AlIjtmrEiR).\nUnzip them to `./data/shapenet`.\n\nDownload `abo_tables.zip` from [here](https://drive.google.com/file/d/1lzw3uYbpuCxWBYYqYyL4ZEFomBOUN323/view?usp=share_link). Unzip it to `./data/abo`. For convenience I have converted the ABO dataset into PixelNeRF's SRN format.\n\nIf you want to try single-view reconstruction on the real KITTI Cars dataset, please download the official [KITTI 3D object dataset](http://www.cvlibs.net/datasets/kitti/eval_object.php?obj_benchmark=3d), including [left color images](http://www.cvlibs.net/download.php?file=data_object_image_2.zip), [calibration files](http://www.cvlibs.net/download.php?file=data_object_calib.zip), [training labels](http://www.cvlibs.net/download.php?file=data_object_label_2.zip), and [instance segmentations](https://github.com/HeylenJonas/KITTI3D-Instance-Segmentation-Devkit).\n\nExtract the downloaded archives according to the following folder tree (or use symlinks).\n\n```\n./\n├── configs/\n├── data/\n│   ├── shapenet/\n│   │   ├── cars_test/\n│   │   ├── cars_train/\n│   │   ├── cars_val/\n│   │   ├── chairs_test/\n│   │   ├── chairs_train/\n│   │   └── chairs_val/\n│   ├── abo/\n│   │   ├── tables_train/\n│   │   └── tables_test/\n│   └── kitti/\n│       └── training/\n│           ├── calib/\n│           ├── image_2/\n│           ├── label_2/\n│           └── instance_2/\n├── demo/\n├── lib/\n├── tools/\n…\n```\n\nFor FID and KID evaluation, run the following commands to extract the Inception features of the real images. 
(This script will use all the available GPUs on your machine, so remember to set `CUDA_VISIBLE_DEVICES`.)\n\n```bash\nCUDA_VISIBLE_DEVICES=0 python tools/inception_stat.py configs/paper_cfgs/ssdnerf_cars_uncond.py\nCUDA_VISIBLE_DEVICES=0 python tools/inception_stat.py configs/paper_cfgs/ssdnerf_chairs_recons1v.py\nCUDA_VISIBLE_DEVICES=0 python tools/inception_stat.py configs/paper_cfgs/ssdnerf_abotables_uncond.py\n```\n\nFor KITTI Cars preprocessing, run the following command.\n\n```bash\npython tools/kitti_preproc.py\n```\n\n## About the configs\n\n### Naming convention\n    \n```\nssdnerf_cars3v_uncond\n   │      │      └── testing data: test unconditional generation\n   │      └── training data: train on Cars dataset, using 3 views per scene\n   └── training method: single-stage diffusion nerf training\n  \nstage2_cars_recons1v\n   │     │      └── testing data: test 3D reconstruction from 1 view\n   │     └── training data: train on Cars dataset, using all views per scene\n   └── training method: stage 2 of two-stage training\n```\n\n### Models in the main paper\n\n| Config                                                                     |                                           Checkpoint                                            | Iters  |     FID      | LPIPS | Comments                                                                                                                                                        |\n|:---------------------------------------------------------------------------|:-----------------------------------------------------------------------------------------------:|:------:|:------------:|:-----:|:----------------------------------------------------------------------------------------------------------------------------------------------------------------|\n| [ssdnerf_cars_uncond](configs/paper_cfgs/ssdnerf_cars_uncond.py)           | 
[gdrive](https://drive.google.com/file/d/1tZMzfauuB7mo3vc_ojNoiHS5kC4DfBF6/view?usp=drive_link) |   1M   | 11.08 ± 1.11 |   -   |                                                                                                                                                                 |\n| [ssdnerf_abotables_uncond](configs/paper_cfgs/ssdnerf_abotables_uncond.py) | [gdrive](https://drive.google.com/file/d/1AnVELtHRxBE8Hd-KssYlOcQMpNbzD68M/view?usp=drive_link) |   1M   | 14.27 ± 0.66 |   -   |                                                                                                                                                                 |\n| [ssdnerf_cars_recons1v](configs/paper_cfgs/ssdnerf_cars_recons1v.py)       | [gdrive](https://drive.google.com/file/d/1hsnUW7dZ45aPqXxtOVrOSBl1gQA_8wH-/view?usp=drive_link) |  80K   |    16.39     | 0.078 |                                                                                                                                                                 |\n| [ssdnerf_chairs_recons1v](configs/paper_cfgs/ssdnerf_chairs_recons1v.py)   | [gdrive](https://drive.google.com/file/d/1ZvU361JyuIKp6dmhPivdB-18srh5xbsI/view?usp=drive_link) |  80K   |    10.13     | 0.067 |                                                                                                                                                                 |\n| [ssdnerf_cars3v_uncond_1m](configs/paper_cfgs/ssdnerf_cars3v_uncond_1m.py) |                                                                                                 |   1M   |              |   -   | The first half of training before resetting the triplanes.                                                                                                      
|\n| [ssdnerf_cars3v_uncond_2m](configs/paper_cfgs/ssdnerf_cars3v_uncond_2m.py) | [gdrive](https://drive.google.com/file/d/1DxpiPAa-pPxjrxhK_DXgJvk2JOgd-WWv/view?usp=drive_link) |   1M   | 19.04 ± 1.10 |   -   | The second half of training after resetting the triplanes (requires training [ssdnerf_cars3v_uncond_1m](configs/paper_cfgs/ssdnerf_cars3v_uncond_1m.py) first). |\n| [ssdnerf_cars3v_recons1v](configs/paper_cfgs/ssdnerf_cars3v_recons1v.py)   |                                                                                                 |  80K   |              | 0.106 |                                                                                                                                                                 |\n| [stage1_cars_recons16v](configs/paper_cfgs/stage1_cars_recons16v.py)       |                                                                                                 |  400K  |              |       | Ablation study, NeRF reconstruction stage.                                                                                                                      |\n| [stage2_cars_uncond](configs/paper_cfgs/stage2_cars_uncond.py)             |                                                                                                 |   1M   | 16.33 ± 0.93 |   -   | Ablation study, diffusion stage (requires training [stage1_cars_recons16v](configs/paper_cfgs/stage1_cars_recons16v.py) first).                                 |\n| [stage2_cars_recons1v](configs/paper_cfgs/stage2_cars_recons1v.py)         |                                                                                                 |  80K   |    20.97     | 0.090 | Ablation study, diffusion stage (requires training [stage1_cars_recons16v](configs/paper_cfgs/stage1_cars_recons16v.py) first).                                 
|\n\nIn addition, multi-view reconstruction testing configs can be found in [configs/paper_cfgs/multiview_recons](configs/paper_cfgs/multiview_recons).\n\n### Models in the supplementary material\n\n| Config                                                                            | Iters |  FID  | LPIPS | Comments                                                                                                                                                                                                                                              |\n|:----------------------------------------------------------------------------------|:-----:|:-----:|:-----:|:------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|\n| [ssdnerf_cars_reconskitti](configs/supp_cfgs/ssdnerf_cars_reconskitti.py)         |  80K  |   -   |   -   | Same model as [ssdnerf_cars_recons1v](configs/paper_cfgs/ssdnerf_cars_recons1v.py) [[checkpoint](https://drive.google.com/file/d/1hsnUW7dZ45aPqXxtOVrOSBl1gQA_8wH-/view?usp=drive_link)] except for being tested on real images of the KITTI dataset. |\n| [ssdnerf_cars_recons1v_notanh](configs/supp_cfgs/ssdnerf_cars_recons1v_notanh.py) |  80K  | 16.34 | 0.077 | Without tanh latent code activation.                                                                                                                                                                                                                  
|\n| [ssdnerf_cars_recons1v_noreg](configs/supp_cfgs/ssdnerf_cars_recons1v_noreg.py)   |  80K  | 16.62 | 0.077 | Without L2 latent code regularization.                                                                                                                                                                                                                |\n\n### New models in this repository\n\nThe new models feature **improved implementations**, including the following changes:\n\n- Use `NormalizedTanhCode` instead of `TanhCode` activation, which helps stabilize the scale (std) of the latent codes. Scale normalization is no longer required in the DDPM MSE loss. Latent code lr is rescaled accordingly.\n- Remove L2 latent code regularization.\n- Disable U-Net dropout in `recons` models.\n- `uncond` and `recons` models are now exactly the same except for training schedules and testing configs.\n- Enable new features such as 16-bit caching and tiled triplanes.\n\n*Note: It is highly recommended to start with these new models if you want to train custom models. 
The original models in the paper are retained only for reproducibility.*\n\n| Config                                                                                               | Iters | Comments                                                                                                                                                                                  |\n|:-----------------------------------------------------------------------------------------------------|:-----:|:------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|\n| [ssdnerf_cars_uncond_16bit](configs/new_cfgs/ssdnerf_cars_uncond_16bit.py)                           |  1M   | Enable 16-bit caching. Should yield similar results to [ssdnerf_cars_uncond](configs/paper_cfgs/ssdnerf_cars_uncond.py).                                                                  |\n| [ssdnerf_cars_recons1v_16bit](configs/new_cfgs/ssdnerf_cars_recons1v_16bit.py)                       |  60K  | Enable 16-bit caching. Should yield similar results to [ssdnerf_cars_recons1v](configs/paper_cfgs/ssdnerf_cars_recons1v.py).                                                              |\n| [ssdnerf_cars_recons1v_tiled](configs/new_cfgs/ssdnerf_cars_recons1v_tiled.py)                       | 100K  | Use tiled (rollout) triplane layout. Tiled triplanes would otherwise increase the computation cost, so in this model the U-Net channels are reduced to compensate for the runtime. 
|\n| [stage1_cars_recons16v_16bit](configs/new_cfgs/stage1_cars_recons16v_16bit.py)                       | 400K  | Enable 16-bit caching. Should yield similar results to [stage1_cars_recons16v](configs/paper_cfgs/stage1_cars_recons16v.py).                                                              |\n| [stage1_cars_recons16v_16bit_filesystem](configs/new_cfgs/stage1_cars_recons16v_16bit_filesystem.py) | 400K  | Same as [stage1_cars_recons16v_16bit](configs/new_cfgs/stage1_cars_recons16v_16bit.py) but caches to the filesystem, in case your RAM is full. Not recommended due to slow I/O on hard drives. |\n\n### Unused features in this codebase\n\n- This codebase supports concat-based image conditioning, although it's not used in the above models.\n\n## Training\n\nRun the following command to train a model:\n\n```bash\npython train.py /PATH/TO/CONFIG --gpu-ids 0 1\n```\n\nNote that the total batch size is determined by the number of GPUs you specify. All our models are trained using 2 RTX 3090 (24G) GPUs.\n\nSince we adopt the density-based NeRF pruning strategy in [torch-ngp](https://github.com/ashawkey/torch-ngp), training starts slowly and speeds up later, so the initial estimate of the remaining time is usually more than twice the actual training time.\n\nModel checkpoints will be saved into `./work_dirs`. 
Scene caches will be saved into `./cache`.\n\n## Testing and evaluation\n\n```bash\npython test.py /PATH/TO/CONFIG /PATH/TO/CHECKPOINT --gpu-ids 0 1  # you can specify any number of GPUs here\n```\nSome trained models can be downloaded from [here](https://drive.google.com/drive/folders/13z4C13TsofPkBuqMqQjRp5yDck7CjCiZ?usp=sharing) for testing.\n\nTo save the sampled NeRFs and extracted meshes, uncomment (or add) these lines in the `test_cfg` dict of the config file:\n\n```python\n    save_dir=work_dir + '/save',\n    save_mesh=True,\n    mesh_resolution=256,\n    mesh_threshold=10,\n```\n\nAll results will be saved into `./work_dirs/\u003ccfg name\u003e/save`.\nYou can then open the saved `.pth` NeRF scenes using the GUI tool `demo/ssdnerf_gui.py` (see below), and the `.stl` meshes using any mesh viewer.\n\n## Visualization\n\nBy default, during training or testing, the visualizations will be saved into `./work_dirs`. \n\nA GUI tool is provided for visualizing the results (currently only supports unconditional generation or loading saved `.pth` NeRF scenes). Run the following command to start the GUI:\n\n```bash\npython demo/ssdnerf_gui.py /PATH/TO/CONFIG /PATH/TO/CHECKPOINT --fp16\n```\n\n## Citation\n\nIf you find this project useful in your research, please consider citing:\n\n```\n@inproceedings{ssdnerf,\n    title={Single-Stage Diffusion NeRF: A Unified Approach to 3D Generation and Reconstruction}, \n    author={Hansheng Chen and Jiatao Gu and Anpei Chen and Wei Tian and Zhuowen Tu and Lingjie Liu and Hao Su},\n    year={2023},\n    booktitle={ICCV}\n}\n```\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Flakonik%2Fssdnerf","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Flakonik%2Fssdnerf","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Flakonik%2Fssdnerf/lists"}