{"id":18614435,"url":"https://github.com/primecai/Pix2NeRF","last_synced_at":"2025-04-11T00:30:45.543Z","repository":{"id":37343742,"uuid":"463695961","full_name":"primecai/Pix2NeRF","owner":"primecai","description":null,"archived":false,"fork":false,"pushed_at":"2022-09-08T21:29:43.000Z","size":6492,"stargazers_count":272,"open_issues_count":13,"forks_count":33,"subscribers_count":29,"default_branch":"main","last_synced_at":"2024-11-07T03:31:31.380Z","etag":null,"topics":[],"latest_commit_sha":null,"homepage":"","language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":null,"status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/primecai.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":null,"code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null}},"created_at":"2022-02-25T22:48:04.000Z","updated_at":"2024-11-03T18:15:44.000Z","dependencies_parsed_at":"2022-07-12T12:31:45.796Z","dependency_job_id":null,"html_url":"https://github.com/primecai/Pix2NeRF","commit_stats":null,"previous_names":[],"tags_count":0,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/primecai%2FPix2NeRF","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/primecai%2FPix2NeRF/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/primecai%2FPix2NeRF/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/primecai%2FPix2NeRF/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/primecai","download_url":"https://codeload.github.com/primecai/Pix2NeRF/tar.gz/refs/heads/main","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":248322218,"owners_count":21084333,"icon_url":"https://github.com
/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":[],"created_at":"2024-11-07T03:25:56.601Z","updated_at":"2025-04-11T00:30:40.524Z","avatar_url":"https://github.com/primecai.png","language":"Python","funding_links":[],"categories":["Papers"],"sub_categories":["NeRF Related Tasks"],"readme":"# Pix2NeRF: Unsupervised Conditional π-GAN for Single Image to Neural Radiance Fields Translation ([CVPR 2022](https://cvpr2022.thecvf.com/))\n[Video](https://www.youtube.com/watch?v=RoVu3hvvzGg) | [Paper](https://arxiv.org/abs/2202.13162)\n\n![Teaser image](figures/teaser.jpg)\n\n**Pix2NeRF: Unsupervised Conditional π-GAN for Single Image to Neural Radiance Fields Translation**\u003cbr\u003e\n[Shengqu Cai](https://primecai.github.io/), [Anton Obukhov](https://www.obukhov.ai/), [Dengxin Dai](https://vas.mpi-inf.mpg.de/dengxin/), [Luc Van Gool](https://ee.ethz.ch/the-department/faculty/professors/person-detail.OTAyMzM=.TGlzdC80MTEsMTA1ODA0MjU5.html)\n\nAbstract: *We propose a pipeline to generate Neural Radiance Fields (NeRF) of an object or a scene of a specific class, conditioned on a single input image. This is a challenging task, as training NeRF requires multiple views of the same scene, coupled with corresponding poses, which are hard to obtain. Our method is based on π-GAN, a generative model for unconditional 3D-aware image synthesis, which maps random latent codes to radiance fields of a class of objects. We jointly optimize (1) the π-GAN objective to utilize its high-fidelity 3D-aware generation and (2) a carefully designed reconstruction objective. 
The latter includes an encoder coupled with π-GAN generator to form an auto-encoder. Unlike previous few-shot NeRF approaches, our pipeline is unsupervised, capable of being trained with independent images without 3D, multi-view, or pose supervision. Applications of our pipeline include 3d avatar generation, object-centric novel view synthesis with a single input image, and 3d-aware super-resolution, to name a few.*\n\n\n\n\n## Instructions\n### Environment\nWe use pytorch 1.7.0 with CUDA 10.1. To build the environment, run:\n\n```\npip install -r requirements.txt\n```\n\n### Download and pre-process datasets\nFor CelebA, download from https://mmlab.ie.cuhk.edu.hk/projects/CelebA.html and extract the `img_align_celeba` split.\n\nFor Carla, download from https://github.com/autonomousvision/graf.\n\nFor ShapeNet-SRN, download from https://github.com/sxyu/pixel-nerf and remove the additional layer, so that there are 3 folders `chairs_train`, `chairs_val` and `chairs_test` within `srn_chairs`. 
Instances should be directly within these three folders.\n\nCopy `img_csv/CelebA_pos.csv` to /PATH_TO/img_align_celeba/.\n\nCopy `srn_chairs_train.csv`, `srn_chairs_train_filted.csv`, `srn_chairs_val.csv`, `srn_chairs_val_filted.csv`, `srn_chairs_test.csv` and `srn_chairs_test_filted.csv` under `/PATH_TO/srn_chairs`.\n\n### Training\nCELEBA\n\n`\nCUDA_VISIBLE_DEVICES=0,1,2,3 python3 train_con.py --curriculum=celeba --output_dir='/PATH_TO_OUTPUT/' --dataset_dir='/PATH_TO/img_align_celeba' --encoder_type='CCS' --recon_lambda=5 --ssim_lambda=1 --vgg_lambda=1 --pos_lambda_gen=15 --lambda_e_latent=1 --lambda_e_pos=1 --cond_lambda=1 --load_encoder=1\n`\n\nCARLA\n\n`\nCUDA_VISIBLE_DEVICES=0,1,2,3 python3 train_con.py --curriculum=carla --output_dir='/PATH_TO_OUTPUT/' --dataset_dir='/PATH_TO/carla/*.png' --encoder_type='CCS' --recon_lambda=5 --ssim_lambda=1 --vgg_lambda=1 --pos_lambda_gen=15 --lambda_e_latent=1 --lambda_e_pos=1 --cond_lambda=1 --load_encoder=1\n`\n\nSHAPENET CHAIRS\n\n`\nCUDA_VISIBLE_DEVICES=0,1,2,3 python3 train_con.py --curriculum=srnchairs --output_dir='/PATH_TO_OUTPUT/' --dataset_dir='/PATH_TO/srn_chairs' --encoder_type='CCS' --recon_lambda=5 --ssim_lambda=1 --vgg_lambda=1 --pos_lambda_gen=15 --lambda_e_latent=1 --lambda_e_pos=1 --cond_lambda=1 --load_encoder=1\n`\n\nNote that the training script has been refactored and has not been fully validated yet. It may not reproduce exactly the results from the paper. 
Please let the authors know if results are not at reasonable levels!\n### Visualizing\nRender novel views of the given image:\n\n`python render_video_from_img.py --path=/PATH_TO/checkpoint_train.pth \n                                   --output_dir=/PATH_TO_WRITE_TO/ \n                                   --img_path=/PATH_TO_IMAGE/ \n                                   --curriculum=\"celeba\" or \"carla\" or \"srnchairs\"`\n\n\nRender videos and create gifs for the three datasets:\n\nCELEBA\n\n`\npython render_video_from_dataset.py --path PRETRAINED_MODEL_PATH \n                                    --output_dir OUTPUT_DIRECTORY \n                                    --curriculum \"celeba\" \n                                    --dataset_path \"/PATH/TO/img_align_celeba/\" \n                                    --trajectory \"front\"\n`\n\nCARLA\n\n`python render_video_from_dataset.py --path PRETRAINED_MODEL_PATH \n                                     --output_dir OUTPUT_DIRECTORY \n                                     --curriculum \"carla\" \n                                     --dataset_path \"/PATH/TO/carla/*.png\" \n                                     --trajectory \"orbit\"`\n\nSRNCHAIRS\n\n`python render_video_from_dataset.py --path PRETRAINED_MODEL_PATH \n                                     --output_dir OUTPUT_DIRECTORY \n                                     --curriculum \"srnchairs\" \n                                     --dataset_path \"/PATH/TO/srn_chairs/\" \n                                     --trajectory \"orbit\"`\n\n### Linear interpolation\nRender images and a video interpolating between 2 images.\n\n`python linear_interpolation --path=/PATH_TO/checkpoint_train.pth --output_dir=/PATH_TO_WRITE_TO/`\n\n### Hybrid Optimization\nSince our model is feed-forward and uses relatively compact latent codes, it will most likely not perform well on your own face or on other very familiar faces: the details are very hard to capture fully in a single pass. 
Therefore, we provide a script performing hybrid optimization: predict a latent code using our model, then perform latent optimization as introduced in pi-GAN. The command to use is:\n\n`\npython --path PRETRAINED_MODEL_PATH --output_dir OUTPUT_DIRECTORY --curriculum [\"celeba\" or \"carla\" or \"srnchairs\"] --img_path /PATH_TO_IMAGE_TO_OPTIMIZE/   \n`\nNote that, compared with vanilla pi-GAN inversion, this requires significantly fewer iterations.\n\n## Pretrained model\nWe provide pretrained model checkpoint files for the three datasets. Download from https://www.dropbox.com/s/lcko0wl8rs4k5qq/pretrained_models.zip?dl=0 and unzip to use.\n\n## Citation\n\n```\n@inproceedings{cai2022pix2nerf,\n  title={Pix2NeRF: Unsupervised Conditional π-GAN for Single Image to Neural Radiance Fields Translation},\n  author={Cai, Shengqu and Obukhov, Anton and Dai, Dengxin and Van Gool, Luc},\n  booktitle={Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition},\n  pages={3981--3990},\n  year={2022}\n}\n```\n\n## Acknowledgements\nThe code repo is built upon https://github.com/marcoamonteiro/pi-GAN. We thank the authors for releasing the code and providing support throughout the development of this project.\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fprimecai%2FPix2NeRF","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fprimecai%2FPix2NeRF","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fprimecai%2FPix2NeRF/lists"}