## Next3D: Generative Neural Texture Rasterization for 3D-Aware Head Avatars

**Next3D: Generative Neural Texture Rasterization for 3D-Aware Head Avatars**
[Jingxiang Sun](https://mrtornado24.github.io/), [Xuan Wang](https://xuanwangvc.github.io/), [Lizhen Wang](https://lizhenwangt.github.io/), [Xiaoyu Li](https://xiaoyu258.github.io/), [Yong Zhang](https://yzhang2016.github.io/yongnorriszhang.github.io/), [Hongwen Zhang](https://hongwenzhang.github.io/), [Yebin Liu](http://www.liuyebin.com/)

[**Project**](https://mrtornado24.github.io/Next3D/) | [**Paper**](https://arxiv.org/abs/2211.11208) | [**Twitter**](https://twitter.com/JingxiangSun42/status/1630489816226988032?s=20)
Abstract: *3D-aware generative adversarial networks (GANs) synthesize high-fidelity and multi-view-consistent facial images using only collections of single-view 2D imagery. Towards fine-grained control over facial attributes, recent efforts incorporate 3D Morphable Face Model (3DMM) to describe deformation in generative radiance fields either explicitly or implicitly. Explicit methods provide fine-grained expression control but cannot handle topological changes caused by hair and accessories, while implicit ones can model varied topologies but have limited generalization caused by the unconstrained deformation fields. We propose a novel 3D GAN framework for unsupervised learning of generative, high-quality and 3D-consistent facial avatars from unstructured 2D images. To achieve both deformation accuracy and topological flexibility, we propose a 3D representation called Generative Texture-Rasterized Tri-planes. The proposed representation learns Generative Neural Textures on top of parametric mesh templates and then projects them into three orthogonal-viewed feature planes through rasterization, forming a tri-plane feature representation for volume rendering. In this way, we combine both fine-grained expression control of mesh-guided explicit deformation and the flexibility of implicit volumetric representation. We further propose specific modules for modeling mouth interior which is not taken into account by 3DMM. Our method demonstrates state-of-the-art 3D-aware synthesis quality and animation ability through extensive experiments. Furthermore, serving as 3D prior, our animatable 3D representation boosts multiple applications including one-shot facial avatars and 3D-aware stylization.*
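To make the texture-rasterized tri-plane idea above more concrete, here is a minimal PyTorch-style sketch of the tri-plane sampling step that follows rasterization. All names, tensor shapes and the plane ordering are illustrative assumptions for exposition, not the repository's actual API:

```.python
import torch
import torch.nn.functional as F

def sample_triplane(planes: torch.Tensor, points: torch.Tensor) -> torch.Tensor:
    """Bilinearly sample per-point features from three orthogonal feature planes.

    planes: (3, C, H, W) features rasterized onto the XY, XZ and YZ planes
            (assumed ordering).
    points: (N, 3) query points in [-1, 1]^3 along camera rays.
    Returns: (N, 3 * C) concatenated features, to be decoded into density and
             color for volume rendering.
    """
    # Project each 3D point onto the three axis-aligned planes.
    coords = torch.stack([
        points[:, [0, 1]],   # XY plane
        points[:, [0, 2]],   # XZ plane
        points[:, [1, 2]],   # YZ plane
    ])                                                    # (3, N, 2)
    grid = coords.unsqueeze(2)                            # (3, N, 1, 2)
    feats = F.grid_sample(planes, grid, mode="bilinear",
                          align_corners=False)            # (3, C, N, 1)
    return feats.squeeze(-1).permute(2, 0, 1).reshape(points.shape[0], -1)

# Tiny usage example with random planes and points.
planes = torch.randn(3, 32, 256, 256)
points = torch.rand(1024, 3) * 2 - 1
print(sample_triplane(planes, points).shape)  # torch.Size([1024, 96])
```

In the paper's representation the plane content itself comes from rasterizing Generative Neural Textures on the deformed mesh template, so expression control enters through what lands on the planes rather than through this sampling step.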
## News
[Oct 2024] We released the [code](https://github.com/XChenZ/invertAvatar) for our new SIGGRAPH 2024 paper, ["InvertAvatar: Incremental GAN Inversion for Generalized Head Avatars"](https://xchenz.github.io/invertavatar_page/)! InvertAvatar is built on an improved Next3D backbone and enables building a high-quality personal 3D head avatar in one second.

## Requirements
* 1–8 high-end NVIDIA GPUs. We have done all testing and development using V100, RTX3090, and A100 GPUs.
* 64-bit Python 3.9 and PyTorch 1.12.0 (or later). See https://pytorch.org for PyTorch install instructions.
* CUDA toolkit 11.3 or later.
* Python libraries: see [environment.yml](./environment.yml) for exact library dependencies. You can use the following commands with Miniconda3 to create and activate your Python environment:
- `cd Next3D`
- `conda env create -f environment.yml`
- `conda activate next3d`
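Optionally, once the environment is active, a quick check like the following (not part of the repository) can confirm that PyTorch and CUDA match the requirements above:

```.python
import torch

# Quick sanity check for the requirements above (PyTorch >= 1.12 with CUDA).
print("PyTorch:", torch.__version__)
print("CUDA available:", torch.cuda.is_available())
if torch.cuda.is_available():
    print("CUDA (build):", torch.version.cuda)
    print("GPUs:", torch.cuda.device_count(), torch.cuda.get_device_name(0))
```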
## Getting started

Download our pretrained models from [this link](https://drive.google.com/drive/folders/1rbR5ZJ6LQYUSd5J5BkoVYNon_-Lb7KsZ?usp=share_link) and put them under `pretrained_models`. For training Next3D on top of EG3D, please also download the pretrained checkpoint `ffhqrebalanced512-64.pkl` of [EG3D](https://github.com/NVlabs/eg3d/blob/main/docs/models.md).
## Generating media
```.bash
# Generate videos for the shown cases using the pre-trained model
python gen_videos_next3d.py --outdir=out --trunc=0.7 --seeds=10720,12374,13393,17099 --grid=2x2 \
--network=pretrained_models/next3d_ffhq_512.pkl --obj_path=data/demo/demo.obj \
--lms_path=data/demo/demo_kpt2d.txt --lms_cond=True
```

```.bash
# Generate images and shapes (as .mrc files) for the shown cases using the pre-trained model
python gen_samples.py --outdir=out --trunc=0.7 --shapes=true --seeds=166 \
--network=pretrained_models/next3d_ffhq_512.pkl --obj_path=data/demo/demo.obj \
--lms_path=data/demo/demo_kpt2d.txt --lms_cond=True
```

We visualize our `.mrc` shape files with [UCSF ChimeraX](https://www.cgl.ucsf.edu/chimerax/). Please refer to [EG3D](https://github.com/NVlabs/eg3d) for more detailed instructions.
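If you prefer to inspect the exported shapes programmatically instead of in ChimeraX, a sketch like the following can convert an `.mrc` volume into a mesh. It is not part of the repository; `mrcfile`, `scikit-image` and `trimesh` must be installed separately, and the input filename and iso-level are assumptions that may need tuning:

```.python
import mrcfile
import trimesh
from skimage import measure

# Load the density volume exported by gen_samples.py (filename is illustrative).
with mrcfile.open("out/seed0166.mrc") as mrc:
    volume = mrc.data.copy()

# Extract an iso-surface with marching cubes; the level typically needs tuning.
verts, faces, normals, _ = measure.marching_cubes(volume, level=10)
trimesh.Trimesh(vertices=verts, faces=faces,
                vertex_normals=normals).export("out/seed0166.ply")
```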
## Reenacting generative avatars
### Installation
Ensure the [Deep3DFaceRecon_pytorch](https://github.com/sicxu/Deep3DFaceRecon_pytorch/tree/6ba3d22f84bf508f0dde002da8fff277196fef21) submodule is properly initialized:
```.bash
git submodule update --init --recursive
```
Download the pretrained models for FLAME estimation following [DECA](https://github.com/yfeng95/DECA) and put them into `dataset_preprocessing/ffhq/deca/data`; download the pretrained models for gaze estimation through [this link](https://drive.google.com/drive/folders/1Jgej9q5W2IYXRa-CWCldyTVXeHk-Oi-I?usp=share_link) and put them into `dataset_preprocessing/ffhq/faceverse/data`.

### Preparing datasets
The video reenactment input contains three parts: camera poses (`dataset.json`), FLAME meshes (`.obj`) and 2D landmark files (`.txt`). For a quick start, you can download the processed talking video of President Obama [here](https://drive.google.com/file/d/1ph77uSlLz-xIVlBxwXP3Et7lTR0zHXQR/view?usp=sharing) and place the downloaded folder as `data/obama`. You can also preprocess your custom datasets by running the following commands:
```.bash
cd dataset_preprocessing/ffhq
python preprocess_in_the_wild.py --indir=INPUT_IMAGE_FOLDER
```

You will obtain FLAME meshes and 2D landmark files for all frames, together with a `dataset.json`. Please put all these driving files into the same folder for reenactment later.
### Reenacting samples
```.bash
python reenact_avatar_next3d.py --drive_root=data/obama \
--network=pretrained_models/next3d_ffhq_512.pkl \
--grid=2x1 --seeds=166 --outdir=out --fname=reenact.mp4 \
--trunc=0.7 --lms_cond=1
```

## Training
Download and process the [Flickr-Faces-HQ dataset](https://github.com/NVlabs/ffhq-dataset) using the following commands:
```.bash
cd dataset_preprocessing/ffhq
python runme.py
```
You can perform FLAME and landmark estimation by referring to [preprocess_in_the_wild.py](./dataset_preprocessing/ffhq/preprocess_in_the_wild.py). We will also integrate all the preprocessing steps into a single script soon.
The dataset should be organized as below:
```
/path/to/dataset
├── meshes512x512
├── lms512x512
└── images512x512
    ├── 00000
    │   ├── img00000000.png
    │   └── ...
    ├── ...
    └── dataset.json
```
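Before launching training, a small check like the one below can help verify this layout. It assumes an EG3D-style `dataset.json` with a top-level `labels` list of `[relative_image_path, camera_parameters]` pairs; that format is an assumption about the preprocessing output, not documented behavior:

```.python
import json
from pathlib import Path

root = Path("/path/to/dataset")  # adjust to your dataset root
for sub in ["meshes512x512", "lms512x512", "images512x512"]:
    assert (root / sub).is_dir(), f"missing folder: {sub}"

labels = json.loads((root / "images512x512" / "dataset.json").read_text())["labels"]
print(f"{len(labels)} labeled images")
for rel_path, cam in labels[:3]:
    assert (root / "images512x512" / rel_path).exists(), f"missing image: {rel_path}"
    print(rel_path, f"{len(cam)} camera parameters")
```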
You can train new networks using `train_next3d.py`. For example:
```.bash
# Train on FFHQ on top of EG3D with a raw neural rendering resolution of 64, using 8 GPUs.
python train_next3d.py --outdir=~/training-runs --cfg=ffhq --data=data/ffhq/images512x512 \
--rdata data/ffhq/meshes512x512 --gpus=8 --batch=32 --gamma=4 --topology_path=data/demo/head_template.obj \
--gen_pose_cond=True --gen_exp_cond=True --disc_c_noise=1 --load_lms=True --model_version=next3d \
--resume pretrained_models/ffhqrebalanced512-64.pkl
```
Note that the rendering-conditioned discriminator is currently not supported because obtaining renderings is still time-consuming. We are working on accelerating this process and will keep updating the training code.
## One-shot portrait reenactment and stylization

Code will come soon.
## Citation
```
@inproceedings{sun2023next3d,
author = {Sun, Jingxiang and Wang, Xuan and Wang, Lizhen and Li, Xiaoyu and Zhang, Yong and Zhang, Hongwen and Liu, Yebin},
title = {Next3D: Generative Neural Texture Rasterization for 3D-Aware Head Avatars},
booktitle = {CVPR},
year = {2023}
}
```

```
@inproceedings{10.1145/3641519.3657478,
author = {Zhao, Xiaochen and Sun, Jingxiang and Wang, Lizhen and Suo, Jinli and Liu, Yebin},
title = {InvertAvatar: Incremental GAN Inversion for Generalized Head Avatars},
year = {2024},
isbn = {9798400705250},
publisher = {Association for Computing Machinery},
address = {New York, NY, USA},
url = {https://doi.org/10.1145/3641519.3657478},
doi = {10.1145/3641519.3657478},
booktitle = {ACM SIGGRAPH 2024 Conference Papers},
articleno = {59},
numpages = {10},
keywords = {3D head avatar, GAN inversion, few-shot reconstruction, one-shot reconstruction, recurrent neural network},
location = {Denver, CO, USA},
series = {SIGGRAPH '24}
}
```
## Acknowledgements
Part of the code is borrowed from [EG3D](https://github.com/NVlabs/eg3d) and [DECA](https://github.com/yfeng95/DECA).