Ecosyste.ms: Awesome
An open API service indexing awesome lists of open source software.
KeypointNeRF: Generalizing Image-based Volumetric Avatars using Relative Spatial Encoding of Keypoints
https://github.com/facebookresearch/KeypointNeRF
- Host: GitHub
- URL: https://github.com/facebookresearch/KeypointNeRF
- Owner: facebookresearch
- License: other
- Archived: true
- Created: 2022-07-14T13:30:41.000Z (over 2 years ago)
- Default Branch: main
- Last Pushed: 2023-05-02T09:25:07.000Z (over 1 year ago)
- Last Synced: 2024-03-04T16:48:12.052Z (8 months ago)
- Language: Python
- Homepage:
- Size: 12.9 MB
- Stars: 368
- Watchers: 14
- Forks: 27
- Open Issues: 6
Metadata Files:
- Readme: README.md
- Contributing: CONTRIBUTING.md
- License: LICENSE
- Code of conduct: CODE_OF_CONDUCT.md
Awesome Lists containing this project
- awesome-NeRF - [KeypointNeRF: Generalizing Image-based Volumetric Avatars using Relative Spatial Encoding of Keypoints](https://www.research-collection.ethz.ch/bitstream/handle/20.500.11850/563568/1322.pdf?sequence=1) | [Project Page](https://markomih.github.io/KeypointNeRF/) (Papers / NeRF Related Tasks)
README
KeypointNeRF: Generalizing Image-based Volumetric Avatars using Relative Spatial Encoding of Keypoints
Marko Mihajlovic · Aayush Bansal · Michael Zollhoefer · Siyu Tang · Shunsuke Saito
ECCV 2022
KeypointNeRF leverages human keypoints to instantly generate a volumetric radiance representation from 2-3 input images, without retraining or fine-tuning.
It can represent human faces and full bodies.
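Conceptually, the relative spatial encoding describes each 3D query point by its depth relative to the detected keypoints in each input view, rather than by absolute coordinates. A minimal sketch of this idea, assuming per-view camera extrinsics and a NeRF-style Fourier embedding (function names and the embedding size are illustrative, not the repository's API):

```python
import torch

def fourier_encode(x, num_freqs=8):
    # NeRF-style positional encoding applied to scalar relative depths.
    freqs = 2.0 ** torch.arange(num_freqs, dtype=x.dtype)   # (F,)
    angles = x[..., None] * freqs                            # (..., F)
    return torch.cat([torch.sin(angles), torch.cos(angles)], dim=-1)

def relative_keypoint_encoding(query, keypoints, extrinsic):
    """Encode query points by their depth relative to 3D keypoints.

    query:     (N, 3) query points in world space
    keypoints: (K, 3) detected 3D keypoints (e.g. body or face joints)
    extrinsic: (4, 4) world-to-camera transform of one input view
    """
    def cam_depth(pts):
        # Transform into the camera frame; the z coordinate is the depth.
        homo = torch.cat([pts, torch.ones(len(pts), 1)], dim=-1)  # (M, 4)
        return (homo @ extrinsic.T)[:, 2]                          # (M,)

    dq = cam_depth(query)       # (N,)
    dk = cam_depth(keypoints)   # (K,)
    rel = dq[:, None] - dk[None, :]                       # (N, K) relative depths
    return fourier_encode(rel).reshape(len(query), -1)    # (N, K * 2F)
```

Because the encoding is relative to keypoints detected per subject, it transfers across identities without fine-tuning, which is what enables the feed-forward behavior described above.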
## News :new:
- [2022/10/01] Combine [ICON](https://github.com/YuliangXiu/ICON) with our relative spatial keypoint encoding for fast and convenient monocular reconstruction, without requiring the expensive SMPL feature.
More details are [here](#Reconstruction-from-a-Single-Image).

## Installation
Please install the Python dependencies specified in `environment.yml`:
```bash
conda env create -f environment.yml
conda activate KeypointNeRF
```

## Data preparation
Please see [DATA_PREP.md](DATA_PREP.md) to set up the ZJU-MoCap dataset. After this step, the data directory follows this structure (a quick layout check is sketched after the tree):
```bash
./data/zju_mocap
├── CoreView_313
├── CoreView_315
├── CoreView_377
├── CoreView_386
├── CoreView_387
├── CoreView_390
├── CoreView_392
├── CoreView_393
├── CoreView_394
└── CoreView_396
```
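As a quick sanity check that the dataset landed in the expected layout (this helper is not part of the repository):

```python
from pathlib import Path

# Subject IDs expected under ./data/zju_mocap after data preparation.
SUBJECTS = [313, 315, 377, 386, 387, 390, 392, 393, 394, 396]

root = Path("./data/zju_mocap")
missing = [s for s in SUBJECTS if not (root / f"CoreView_{s}").is_dir()]
if missing:
    raise SystemExit(f"Missing subject directories: {missing}")
print("ZJU-MoCap layout looks good.")
```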
## Train your own model on the ZJU dataset
Execute the `train.py` script to train the model on the ZJU dataset.
```bash
python train.py --config ./configs/zju.json --data_root ./data/zju_mocap
```
After training, the model checkpoint is stored under `./EXPERIMENTS/zju/ckpts/last.ckpt`; it is equivalent to the one provided [here](https://drive.google.com/file/d/1rsMb3DFFXaFw0iK7yoUmoDEaCW_XqfaN/view?usp=sharing).
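The `.ckpt` path suggests a standard PyTorch Lightning checkpoint, so it can be inspected with plain `torch.load`; a generic sketch, assuming the usual Lightning layout (the exact keys inside the file are an assumption):

```python
import torch

# Load the checkpoint on CPU and inspect its contents.
ckpt = torch.load("./EXPERIMENTS/zju/ckpts/last.ckpt", map_location="cpu")
print(list(ckpt.keys()))           # typically includes 'state_dict', 'epoch', ...
state_dict = ckpt["state_dict"]    # model weights (assumed Lightning layout)
n_params = sum(t.numel() for t in state_dict.values())
print(f"{len(state_dict)} tensors, {n_params / 1e6:.1f}M parameters")
```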
## Evaluation
To render and evaluate images, execute:
```bash
python train.py --config ./configs/zju.json --data_root ./data/zju_mocap --run_val
python eval_zju.py --src_dir ./EXPERIMENTS/zju/images_v3
```
To visualize the dynamic results, execute:
```bash
python render_dynamic.py --config ./configs/zju.json --data_root ./data/zju_mocap --model_ckpt ./EXPERIMENTS/zju/ckpts/last.ckpt
```
(The first three views of an unseen subject are the input to KeypointNeRF; the last image is a rendered novel view.)
We compare KeypointNeRF with recent state-of-the-art methods; the evaluation metrics are PSNR and SSIM (a minimal metric computation sketch follows the table).
| Models | PSNR ↑ | SSIM (×100) ↑ |
|---|---|---|
| pixelNeRF (Yu et al., CVPR'21) | 23.17 | 86.93 |
| PVA (Raj et al., CVPR'21) | 23.15 | 86.63 |
| NHP (Kwon et al., NeurIPS'21) | 24.75 | 90.58 |
| KeypointNeRF* (Mihajlovic et al., ECCV'22) | **25.86** | **91.07** |

(*Results of KeypointNeRF are slightly higher than the numbers reported in the original paper because training views were not shuffled during training.)
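For reference, both metrics can be computed with scikit-image; the table appears to report SSIM on a 0-100 scale. A minimal sketch (the evaluation script's exact protocol, e.g. masking, may differ):

```python
import numpy as np
from skimage.metrics import peak_signal_noise_ratio, structural_similarity

def compare(pred, gt):
    """pred, gt: (H, W, 3) uint8 RGB render and ground truth."""
    psnr = peak_signal_noise_ratio(gt, pred, data_range=255)
    ssim = structural_similarity(gt, pred, channel_axis=-1, data_range=255)
    return psnr, 100.0 * ssim  # SSIM scaled to the table's 0-100 convention

gt = np.random.randint(0, 256, (512, 512, 3), dtype=np.uint8)    # stand-in images
pred = np.random.randint(0, 256, (512, 512, 3), dtype=np.uint8)
psnr, ssim = compare(pred, gt)
print(f"PSNR {psnr:.2f} dB, SSIM {ssim:.2f}")
```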
## Reconstruction from a Single Image
Our relative spatial encoding can be used to reconstruct humans from a single image.
As an example, we leverage ICON and replace its expensive SDF feature with our relative spatial encoding.
While it achieves comparable quality to ICON, it is much faster and more convenient to use (*displayed image taken from pinterest.com).

### 3D Human Reconstruction on CAPE
| Models | Chamfer ↓ (cm) | P2S ↓ (cm) |
|---|---|---|
| PIFu (Saito et al., ICCV'19) | 3.573 | 1.483 |
| ICON (Xiu et al., CVPR'22) | 1.424 | 1.351 |
| KeypointICON (Mihajlovic et al., ECCV'22; Xiu et al., CVPR'22) | 1.539 | 1.358 |

Check the benchmark [here](https://paperswithcode.com/sota/3d-human-reconstruction-on-cape) and more details [here](https://github.com/YuliangXiu/ICON/blob/master/docs/evaluation.md).
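For intuition, symmetric Chamfer and point-to-surface (P2S) distances between point clouds sampled from the predicted and ground-truth surfaces can be approximated with a KD-tree; a simplified sketch using SciPy (the CAPE benchmark's exact sampling and units follow its own protocol):

```python
import numpy as np
from scipy.spatial import cKDTree

def chamfer_and_p2s(pred_pts, gt_pts):
    """pred_pts, gt_pts: (N, 3) points sampled from the two surfaces (in cm)."""
    d_pred_to_gt = cKDTree(gt_pts).query(pred_pts)[0]   # nearest-neighbor distances
    d_gt_to_pred = cKDTree(pred_pts).query(gt_pts)[0]
    chamfer = 0.5 * (d_pred_to_gt.mean() + d_gt_to_pred.mean())  # bidirectional
    p2s = d_pred_to_gt.mean()   # prediction-to-ground-truth direction only
    return chamfer, p2s

pred = np.random.rand(10_000, 3) * 100   # stand-in point clouds
gt = np.random.rand(10_000, 3) * 100
print(chamfer_and_p2s(pred, gt))
```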
## Publication
If you find our code or paper useful, please consider citing:
```bibtex
@inproceedings{Mihajlovic:ECCV2022,
title = {{KeypointNeRF}: Generalizing Image-based Volumetric Avatars using Relative Spatial Encoding of Keypoints},
author = {Mihajlovic, Marko and Bansal, Aayush and Zollhoefer, Michael and Tang, Siyu and Saito, Shunsuke},
booktitle = {European Conference on Computer Vision},
year = {2022},
}
```

## License
[CC-BY-NC 4.0](https://creativecommons.org/licenses/by-nc/4.0/legalcode).
See the [LICENSE](LICENSE) file.