{"id":18614444,"url":"https://github.com/facebookresearch/KeypointNeRF","last_synced_at":"2025-04-11T00:30:46.099Z","repository":{"id":56708937,"uuid":"513913187","full_name":"facebookresearch/KeypointNeRF","owner":"facebookresearch","description":"KeypointNeRF Generalizing Image-based Volumetric Avatars using Relative Spatial Encoding of Keypoints","archived":true,"fork":false,"pushed_at":"2023-05-02T09:25:07.000Z","size":13510,"stargazers_count":374,"open_issues_count":6,"forks_count":28,"subscribers_count":14,"default_branch":"main","last_synced_at":"2024-12-17T01:37:37.427Z","etag":null,"topics":[],"latest_commit_sha":null,"homepage":"","language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"other","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/facebookresearch.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":"CONTRIBUTING.md","funding":null,"license":"LICENSE","code_of_conduct":"CODE_OF_CONDUCT.md","threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null}},"created_at":"2022-07-14T13:30:41.000Z","updated_at":"2024-12-11T16:06:04.000Z","dependencies_parsed_at":"2024-02-18T00:40:45.027Z","dependency_job_id":null,"html_url":"https://github.com/facebookresearch/KeypointNeRF","commit_stats":null,"previous_names":[],"tags_count":0,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/facebookresearch%2FKeypointNeRF","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/facebookresearch%2FKeypointNeRF/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/facebookresearch%2FKeypointNeRF/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/facebookresearch%2FKeypointNeRF/manifests","owner_url":"https:/
/repos.ecosyste.ms/api/v1/hosts/GitHub/owners/facebookresearch","download_url":"https://codeload.github.com/facebookresearch/KeypointNeRF/tar.gz/refs/heads/main","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":248322220,"owners_count":21084333,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":[],"created_at":"2024-11-07T03:25:56.785Z","updated_at":"2025-04-11T00:30:41.083Z","avatar_url":"https://github.com/facebookresearch.png","language":"Python","funding_links":[],"categories":["Papers"],"sub_categories":["NeRF Related Tasks"],"readme":"\u003cp align=\"center\"\u003e\n  \u003cp align=\"center\"\u003e\n    \u003ch1 align=\"center\"\u003eKeypointNeRF: Generalizing Image-based Volumetric Avatars using Relative Spatial Encoding of Keypoints\u003c/h1\u003e\n  \u003c/p\u003e\n  \u003cp align=\"center\" style=\"font-size:16px\"\u003e\n    \u003ca target=\"_blank\" href=\"https://markomih.github.io/\"\u003e\u003cstrong\u003eMarko Mihajlovic\u003c/strong\u003e\u003c/a\u003e\n    ·\n    \u003ca target=\"_blank\" href=\"https://www.aayushbansal.xyz/\"\u003e\u003cstrong\u003eAayush Bansal\u003c/strong\u003e\u003c/a\u003e\n    ·\n    \u003ca target=\"_blank\" href=\"https://zollhoefer.com/\"\u003e\u003cstrong\u003eMichael Zollhoefer\u003c/strong\u003e\u003c/a\u003e\n    .\n    \u003ca target=\"_blank\" href=\"https://inf.ethz.ch/people/person-detail.MjYyNzgw.TGlzdC8zMDQsLTg3NDc3NjI0MQ==.html\"\u003e\u003cstrong\u003eSiyu Tang\u003c/strong\u003e\u003c/a\u003e\n    ·\n    \u003ca target=\"_blank\" 
href=\"https://shunsukesaito.github.io/\"\u003e\u003cstrong\u003eShunsuke Saito\u003c/strong\u003e\u003c/a\u003e\n  \u003c/p\u003e\n  \u003ch2 align=\"center\"\u003eECCV 2022\u003c/h2\u003e\n  \u003cp\u003e\n  KeypointNeRF leverages human keypoints to instantly generate volumetric radiance representation from 2-3 input images without retraining or fine-tuning.\n  It can represent human faces and full bodies. \n  \u003c/p\u003e\n  \u003cdiv align=\"center\"\u003e\u003c/div\u003e \u003cimg src=\"./assets/keynerf.gif\" alt=\"Logo\" width=\"100%\"\u003e\n \n  \u003cdiv align=\"center\"\u003e\u003c/div\u003e \u003cimg src=\"./assets/teaser.gif\" alt=\"Logo\" width=\"100%\"\u003e\n\n  \u003cp align=\"center\"\u003e\n    \u003ca href=\"https://pytorch.org/get-started/locally/\"\u003e\u003cimg alt=\"PyTorch\" src=\"https://img.shields.io/badge/PyTorch-ee4c2c?logo=pytorch\u0026logoColor=white\"\u003e\u003c/a\u003e\n    \u003ca href=\"https://pytorchlightning.ai/\"\u003e\u003cimg alt=\"Lightning\" src=\"https://img.shields.io/badge/-Lightning-792ee5?logo=pytorchlightning\u0026logoColor=white\"\u003e\u003c/a\u003e\n    \u003cbr\u003e\n    \u003ca href='https://arxiv.org/abs/2205.04992'\u003e\n      \u003cimg src='https://img.shields.io/badge/Paper-PDF-green?style=for-the-badge\u0026logo=arXiv\u0026logoColor=green' alt='Paper PDF'\u003e\n    \u003c/a\u003e\n    \u003ca href='https://markomih.github.io/KeypointNeRF/' style='padding-left: 0.5rem;'\u003e\n      \u003cimg src='https://img.shields.io/badge/KeypointNeRF-Page-orange?style=for-the-badge\u0026logo=Google%20chrome\u0026logoColor=orange' alt='Project Page'\u003e\n    \u003ca href=\"https://youtu.be/RMs1S5k9vrk\"\u003e\u003cimg alt=\"youtube views\" title=\"Subscribe to my YouTube channel\" src=\"https://img.shields.io/youtube/views/RMs1S5k9vrk?logo=youtube\u0026labelColor=ce4630\u0026style=for-the-badge\"/\u003e\u003c/a\u003e\n  \u003c/p\u003e\n  \u003cp align=\"center\"\u003e\u003ca 
href='https://paperswithcode.com/sota/generalizable-novel-view-synthesis-on-zju?p=keypointnerf-generalizing-image-based'\u003e\n\t\u003cimg src='https://img.shields.io/endpoint.svg?url=https://paperswithcode.com/badge/keypointnerf-generalizing-image-based/generalizable-novel-view-synthesis-on-zju' alt='Generalizable Novel View Synthesis'\u003e\u003c/a\u003e\n    \u003ca href='https://paperswithcode.com/sota/3d-human-reconstruction-on-cape?p=keypointnerf-generalizing-image-based'\u003e\n\t\u003cimg src='https://img.shields.io/endpoint.svg?url=https://paperswithcode.com/badge/keypointnerf-generalizing-image-based/3d-human-reconstruction-on-cape' alt='Generalizable Novel View Synthesis'\u003e\u003c/a\u003e\u003c/p\u003e\n\n\u003c/p\u003e\n\n## News :new:\n- [2022/10/01] Combine [ICON](https://github.com/YuliangXiu/ICON) with our relative spatial keypoint encoding for fast and convenient monocular reconstruction, without requiring the expensive SMPL feature. \nMore details are [here](#Reconstruction-from-a-Single-Image). 
\n\n## Installation \nPlease install the Python dependencies specified in `environment.yml`:\n```bash\nconda env create -f environment.yml\nconda activate KeypointNeRF\n```\n\n## Data preparation\nPlease see [DATA_PREP.md](DATA_PREP.md) to set up the ZJU-MoCap dataset.\n\nAfter this step, the data directory has the following structure:\n```bash\n./data/zju_mocap\n├── CoreView_313\n├── CoreView_315\n├── CoreView_377\n├── CoreView_386\n├── CoreView_387\n├── CoreView_390\n├── CoreView_392\n├── CoreView_393\n├── CoreView_394\n└── CoreView_396\n```\n\n## Train your own model on the ZJU dataset\nExecute the `train.py` script to train the model on the ZJU dataset.\n```shell script\npython train.py --config ./configs/zju.json --data_root ./data/zju_mocap\n```\nAfter training, the model checkpoint will be stored under `./EXPERIMENTS/zju/ckpts/last.ckpt`, which is equivalent to the one provided [here](https://drive.google.com/file/d/1rsMb3DFFXaFw0iK7yoUmoDEaCW_XqfaN/view?usp=sharing).\n\n## Evaluation\nTo render and evaluate images, execute:\n```shell script\npython train.py --config ./configs/zju.json --data_root ./data/zju_mocap --run_val\npython eval_zju.py --src_dir ./EXPERIMENTS/zju/images_v3\n```\n\nTo visualize the dynamic results, execute:\n```shell\npython render_dynamic.py --config ./configs/zju.json --data_root ./data/zju_mocap --model_ckpt ./EXPERIMENTS/zju/ckpts/last.ckpt\n```\n\n\u003cdiv align=\"center\"\u003e\u003c/div\u003e \u003cimg src=\"./assets/zju_result_sub393.gif\" alt=\"Logo\" width=\"100%\"\u003e\n\u003cp align=\"center\" style=\"font-size:12px\"\u003e (The first three views of an unseen subject are the input to KeypointNeRF; the last image is a rendered novel view) \u003c/p\u003e\n\n\nWe compare KeypointNeRF with recent state-of-the-art methods. 
The evaluation metrics are PSNR and SSIM.\n| Models  | PSNR \u0026#8593;  | SSIM \u0026#8593;  |\n|---|---|---|\n| pixelNeRF \u003cfont size=\"1\"\u003e(Yu et al., CVPR'21)\u003c/font\u003e  |   23.17     | 86.93   |\n| PVA \u003cfont size=\"1\"\u003e(Raj et al., CVPR'21)\u003c/font\u003e  |   23.15     | 86.63   |\n| NHP \u003cfont size=\"1\"\u003e(Kwon et al., NeurIPS'21)\u003c/font\u003e  |   24.75     | 90.58   |\n| KeypointNeRF* \u003cfont size=\"1\"\u003e(Mihajlovic et al., ECCV'22)\u003c/font\u003e  |   **25.86** | **91.07**   |\n\u003cp align=\"left\" style=\"font-size:10px\"\u003e (*Note that the results of KeypointNeRF are slightly higher than the numbers reported in the original paper due to training views not being shuffled during training.) \u003c/p\u003e\n\n## Reconstruction from a Single Image\nOur relative spatial encoding can be used to reconstruct humans from a single image. \nAs an example, we leverage ICON and replace its expensive SDF feature with our relative spatial encoding. 
\n\n\u003cdiv align=\"center\"\u003e\u003cimg src=\"./assets/icon_vs_kpts.png\" alt=\"Logo\" width=\"100%\"\u003e\nWhile it achieves comparable quality to ICON, it's much \u003cstrong\u003efaster\u003c/strong\u003e and more \u003cstrong\u003econvenient\u003c/strong\u003e to use \u003cspan align=\"left\" style=\"font-size:8px\"\u003e(*displayed image taken from pinterest.com)\u003c/span\u003e.\n\u003c/div\u003e \n\n### 3D Human Reconstruction on CAPE\n| Models  | Chamfer \u0026#8595; (cm)  | P2S \u0026#8595; (cm) |\n|---|---|---|\n| PIFu \u003cfont size=\"1\"\u003e(Saito et al., ICCV'19)\u003c/font\u003e  |   3.573     | 1.483   |\n| ICON \u003cfont size=\"1\"\u003e(Xiu et al., CVPR'22)\u003c/font\u003e          |   1.424     | 1.351   |\n| KeypointICON \u003cfont size=\"1\"\u003e(Mihajlovic et al., ECCV'22; Xiu et al., CVPR'22)\u003c/font\u003e  |   1.539\t | 1.358   |\n\nCheck the benchmark [here](https://paperswithcode.com/sota/3d-human-reconstruction-on-cape) and more details [here](https://github.com/YuliangXiu/ICON/blob/master/docs/evaluation.md).\n\n## Publication\nIf you find our code or paper useful, please consider citing:\n```bibtex\n@inproceedings{Mihajlovic:ECCV2022,\n  title = {{KeypointNeRF}: Generalizing Image-based Volumetric Avatars using Relative Spatial Encoding of Keypoints},\n  author = {Mihajlovic, Marko and Bansal, Aayush and Zollhoefer, Michael and Tang, Siyu and Saito, Shunsuke},\n  booktitle={European Conference on Computer Vision},\n  year={2022},\n}\n```\n\n## License\n[CC-BY-NC 4.0](https://creativecommons.org/licenses/by-nc/4.0/legalcode). \nSee the [LICENSE](LICENSE) file. \n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Ffacebookresearch%2FKeypointNeRF","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Ffacebookresearch%2FKeypointNeRF","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Ffacebookresearch%2FKeypointNeRF/lists"}