Ecosyste.ms: Awesome
An open API service indexing awesome lists of open source software.
KeypointNeRF: Generalizing Image-based Volumetric Avatars using Relative Spatial Encoding of Keypoints
https://github.com/facebookresearch/KeypointNeRF
- Host: GitHub
- URL: https://github.com/facebookresearch/KeypointNeRF
- Owner: facebookresearch
- License: other
- Archived: true
- Created: 2022-07-14T13:30:41.000Z (over 2 years ago)
- Default Branch: main
- Last Pushed: 2023-05-02T09:25:07.000Z (over 1 year ago)
- Last Synced: 2024-03-04T16:48:12.052Z (8 months ago)
- Language: Python
- Homepage:
- Size: 12.9 MB
- Stars: 368
- Watchers: 14
- Forks: 27
- Open Issues: 6
Metadata Files:
- Readme: README.md
- Contributing: CONTRIBUTING.md
- License: LICENSE
- Code of conduct: CODE_OF_CONDUCT.md
Awesome Lists containing this project
- awesome-NeRF - [KeypointNeRF: Generalizing Image-based Volumetric Avatars using Relative Spatial Encoding of Keypoints](https://www.research-collection.ethz.ch/bitstream/handle/20.500.11850/563568/1322.pdf?sequence=1) | [Project Page](https://markomih.github.io/KeypointNeRF/) (Papers / NeRF Related Tasks)
README
KeypointNeRF: Generalizing Image-based Volumetric Avatars using Relative Spatial Encoding of Keypoints
Marko Mihajlovic · Aayush Bansal · Michael Zollhoefer · Siyu Tang · Shunsuke Saito
ECCV 2022
KeypointNeRF leverages human keypoints to instantly generate a volumetric radiance representation from 2-3 input images, without retraining or fine-tuning.
It can represent human faces and full bodies.
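Conceptually, the relative spatial encoding describes each 3D query point by its depth relative to the detected keypoints in each input view, rather than by absolute coordinates. A minimal sketch of this idea, assuming per-view camera extrinsics and a NeRF-style Fourier embedding (function names and the embedding size are illustrative, not the repository's API):

```python
import torch

def fourier_encode(x, num_freqs=8):
    # NeRF-style positional encoding applied to scalar relative depths.
    freqs = 2.0 ** torch.arange(num_freqs, dtype=x.dtype)   # (F,)
    angles = x[..., None] * freqs                            # (..., F)
    return torch.cat([torch.sin(angles), torch.cos(angles)], dim=-1)

def relative_keypoint_encoding(query, keypoints, extrinsic):
    """Encode query points by their depth relative to 3D keypoints.

    query:     (N, 3) query points in world space
    keypoints: (K, 3) detected 3D keypoints (e.g. body or face joints)
    extrinsic: (4, 4) world-to-camera transform of one input view
    """
    def cam_depth(pts):
        # Transform into the camera frame; the z coordinate is the depth.
        homo = torch.cat([pts, torch.ones(len(pts), 1)], dim=-1)  # (M, 4)
        return (homo @ extrinsic.T)[:, 2]                          # (M,)

    dq = cam_depth(query)       # (N,)
    dk = cam_depth(keypoints)   # (K,)
    rel = dq[:, None] - dk[None, :]                       # (N, K) relative depths
    return fourier_encode(rel).reshape(len(query), -1)    # (N, K * 2F)
```

Because the encoding is relative to keypoints detected per subject, it transfers across identities without fine-tuning, which is what enables the feed-forward behavior described above.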
## News :new:
- [2022/10/01] Combine [ICON](https://github.com/YuliangXiu/ICON) with our relative spatial keypoint encoding for fast and convenient monocular reconstruction, without requiring the expensive SMPL feature.
More details are [here](#Reconstruction-from-a-Single-Image).

## Installation
Please install the Python dependencies specified in `environment.yml`:
```bash
conda env create -f environment.yml
conda activate KeypointNeRF
```

## Data preparation
Please see [DATA_PREP.md](DATA_PREP.md) to set up the ZJU-MoCap dataset. After this step, the data directory follows this structure (a quick layout check is sketched after the tree):
```bash
./data/zju_mocap
├── CoreView_313
├── CoreView_315
├── CoreView_377
├── CoreView_386
├── CoreView_387
├── CoreView_390
├── CoreView_392
├── CoreView_393
├── CoreView_394
└── CoreView_396
```
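As a quick sanity check that the dataset landed in the expected layout (this helper is not part of the repository):

```python
from pathlib import Path

# Subject IDs expected under ./data/zju_mocap after data preparation.
SUBJECTS = [313, 315, 377, 386, 387, 390, 392, 393, 394, 396]

root = Path("./data/zju_mocap")
missing = [s for s in SUBJECTS if not (root / f"CoreView_{s}").is_dir()]
if missing:
    raise SystemExit(f"Missing subject directories: {missing}")
print("ZJU-MoCap layout looks good.")
```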
## Train your own model on the ZJU dataset
Execute the `train.py` script to train the model on the ZJU dataset.
```bash
python train.py --config ./configs/zju.json --data_root ./data/zju_mocap
```
After training, the model checkpoint is stored under `./EXPERIMENTS/zju/ckpts/last.ckpt`; it is equivalent to the one provided [here](https://drive.google.com/file/d/1rsMb3DFFXaFw0iK7yoUmoDEaCW_XqfaN/view?usp=sharing).
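The `.ckpt` path suggests a standard PyTorch Lightning checkpoint, so it can be inspected with plain `torch.load`; a generic sketch, assuming the usual Lightning layout (the exact keys inside the file are an assumption):

```python
import torch

# Load the checkpoint on CPU and inspect its contents.
ckpt = torch.load("./EXPERIMENTS/zju/ckpts/last.ckpt", map_location="cpu")
print(list(ckpt.keys()))           # typically includes 'state_dict', 'epoch', ...
state_dict = ckpt["state_dict"]    # model weights (assumed Lightning layout)
n_params = sum(t.numel() for t in state_dict.values())
print(f"{len(state_dict)} tensors, {n_params / 1e6:.1f}M parameters")
```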
## Evaluation
To render and evaluate images, execute:
```bash
python train.py --config ./configs/zju.json --data_root ./data/zju_mocap --run_val
python eval_zju.py --src_dir ./EXPERIMENTS/zju/images_v3
```
To visualize the dynamic results, execute:
```bash
python render_dynamic.py --config ./configs/zju.json --data_root ./data/zju_mocap --model_ckpt ./EXPERIMENTS/zju/ckpts/last.ckpt
```
(The first three views of an unseen subject are the input to KeypointNeRF; the last image is a rendered novel view.)
We compare KeypointNeRF with recent state-of-the-art methods; the evaluation metrics are PSNR and SSIM (a minimal metric computation sketch follows the table).
| Models | PSNR ↑ | SSIM (×100) ↑ |
|---|---|---|
| pixelNeRF (Yu et al., CVPR'21) | 23.17 | 86.93 |
| PVA (Raj et al., CVPR'21) | 23.15 | 86.63 |
| NHP (Kwon et al., NeurIPS'21) | 24.75 | 90.58 |
| KeypointNeRF* (Mihajlovic et al., ECCV'22) | **25.86** | **91.07** |

(*Results of KeypointNeRF are slightly higher than the numbers reported in the original paper because training views were not shuffled during training.)
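For reference, both metrics can be computed with scikit-image; the table appears to report SSIM on a 0-100 scale. A minimal sketch (the evaluation script's exact protocol, e.g. masking, may differ):

```python
import numpy as np
from skimage.metrics import peak_signal_noise_ratio, structural_similarity

def compare(pred, gt):
    """pred, gt: (H, W, 3) uint8 RGB render and ground truth."""
    psnr = peak_signal_noise_ratio(gt, pred, data_range=255)
    ssim = structural_similarity(gt, pred, channel_axis=-1, data_range=255)
    return psnr, 100.0 * ssim  # SSIM scaled to the table's 0-100 convention

gt = np.random.randint(0, 256, (512, 512, 3), dtype=np.uint8)    # stand-in images
pred = np.random.randint(0, 256, (512, 512, 3), dtype=np.uint8)
psnr, ssim = compare(pred, gt)
print(f"PSNR {psnr:.2f} dB, SSIM {ssim:.2f}")
```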
## Reconstruction from a Single Image
Our relative spatial encoding can be used to reconstruct humans from a single image.
As an example, we leverage ICON and replace its expensive SDF feature with our relative spatial encoding.
While it achieves comparable quality to ICON, it is much faster and more convenient to use (*displayed image taken from pinterest.com).

### 3D Human Reconstruction on CAPE
| Models | Chamfer ↓ (cm) | P2S ↓ (cm) |
|---|---|---|
| PIFu (Saito et al., ICCV'19) | 3.573 | 1.483 |
| ICON (Xiu et al., CVPR'22) | 1.424 | 1.351 |
| KeypointICON (Mihajlovic et al., ECCV'22; Xiu et al., CVPR'22) | 1.539 | 1.358 |

Check the benchmark [here](https://paperswithcode.com/sota/3d-human-reconstruction-on-cape) and more details [here](https://github.com/YuliangXiu/ICON/blob/master/docs/evaluation.md).
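For intuition, symmetric Chamfer and point-to-surface (P2S) distances between point clouds sampled from the predicted and ground-truth surfaces can be approximated with a KD-tree; a simplified sketch using SciPy (the CAPE benchmark's exact sampling and units follow its own protocol):

```python
import numpy as np
from scipy.spatial import cKDTree

def chamfer_and_p2s(pred_pts, gt_pts):
    """pred_pts, gt_pts: (N, 3) points sampled from the two surfaces (in cm)."""
    d_pred_to_gt = cKDTree(gt_pts).query(pred_pts)[0]   # nearest-neighbor distances
    d_gt_to_pred = cKDTree(pred_pts).query(gt_pts)[0]
    chamfer = 0.5 * (d_pred_to_gt.mean() + d_gt_to_pred.mean())  # bidirectional
    p2s = d_pred_to_gt.mean()   # prediction-to-ground-truth direction only
    return chamfer, p2s

pred = np.random.rand(10_000, 3) * 100   # stand-in point clouds
gt = np.random.rand(10_000, 3) * 100
print(chamfer_and_p2s(pred, gt))
```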
## Publication
If you find our code or paper useful, please consider citing:
```bibtex
@inproceedings{Mihajlovic:ECCV2022,
title = {{KeypointNeRF}: Generalizing Image-based Volumetric Avatars using Relative Spatial Encoding of Keypoints},
author = {Mihajlovic, Marko and Bansal, Aayush and Zollhoefer, Michael and Tang, Siyu and Saito, Shunsuke},
booktitle = {European Conference on Computer Vision},
year = {2022},
}
```

## License
[CC-BY-NC 4.0](https://creativecommons.org/licenses/by-nc/4.0/legalcode).
See the [LICENSE](LICENSE) file.