https://github.com/hzhao1997/HF-Avatar



# High-Fidelity Human Avatars from a Single RGB Camera
### [Project Page](http://cic.tju.edu.cn/faculty/likun/projects/HF-Avatar/) | [Paper](http://cic.tju.edu.cn/faculty/likun/projects/HF-Avatar/assets/main.pdf) | [Supp](http://cic.tju.edu.cn/faculty/likun/projects/HF-Avatar/assets/supp.pdf)

# News
* The previous version had a problem with pose initialization that caused poor texture quality. The code has been updated, and this problem should now be fixed.

# Installation

```
conda create -n Avatar python==3.6.8
conda activate Avatar
conda install pytorch==1.7.0 torchvision==0.8.0 cudatoolkit=11.0 -c pytorch
# or, for CUDA 10.2:
# conda install pytorch==1.6.0 torchvision==0.7.0 cudatoolkit=10.2 -c pytorch

pip install -r requirements.txt

# Build PyTorch3D v0.4.0 from source (the archive unpacks to pytorch3d-0.4.0)
wget https://github.com/facebookresearch/pytorch3d/archive/refs/tags/v0.4.0.zip
unzip v0.4.0.zip
cd pytorch3d-0.4.0
pip install -e .
cd ..

cd thirdparty/neural_renderer_pytorch
python setup.py install

```
Please make sure your gcc version is greater than 7.5!
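
The gcc requirement can be checked before building the extensions. The following is a minimal sketch, not part of this repository: the helper names `parse_version`, `is_newer_than` and `installed_gcc_version` are hypothetical, and it assumes `gcc` is the compiler that the extension builds will pick up.

```python
import re
import shutil
import subprocess


def parse_version(text):
    """Extract the first version number in text as a tuple of ints."""
    match = re.search(r"\d+(?:\.\d+)*", text)
    if match is None:
        raise ValueError(f"no version number found in {text!r}")
    return tuple(int(part) for part in match.group(0).split("."))


def is_newer_than(version, minimum=(7, 5)):
    """Compare major.minor only, so 7.5.0 does not count as newer than 7.5."""
    return version[:2] > minimum


def installed_gcc_version():
    """Return the version of the gcc found on PATH, or None if gcc is missing."""
    if shutil.which("gcc") is None:
        return None
    result = subprocess.run(
        ["gcc", "-dumpfullversion", "-dumpversion"],
        capture_output=True, text=True, check=True,
    )
    return parse_version(result.stdout)
```

Calling `is_newer_than(installed_gcc_version())` on a machine with gcc installed reports whether the requirement is met.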

Download the asset files from [here](https://drive.google.com/file/d/1QV41Q9uBE_Xp2r34MvSABjBmq6c26c0U/view?usp=share_link), unzip them, and move them to the `assets` folder.

Download the pre-trained model from [here](https://drive.google.com/file/d/1ykPUjFqgTjgGmMpOHD17RFMYxU0ZX4E8/view?usp=share_link), unzip it, and move the files to the `checkpoints` folder.

In addition, we adopt the pose initialization of [octopus](https://github.com/thmoa/octopus). Because octopus uses a different deep learning framework from ours, you need to create a separate conda environment for it.
```
conda create -n octopus python==2.7
conda activate octopus
conda install tensorflow-gpu=1.13.1 keras=2.2.4 cudatoolkit=10.0
```
The octopus environment requires [dirt](https://github.com/pmh47/dirt). We recommend installing dirt as follows:
```
git clone https://github.com/pmh47/dirt.git
cd dirt
mkdir build ; cd build
cmake ../csrc
make
cd ..
pip install -e .
```
You also need to adjust the [arch](https://github.com/pmh47/dirt/blob/95f58504c1ccf70b0d0502de81821842bc19ffd2/csrc/CMakeLists.txt#L41) parameter to match your graphics card, or compilation may fail.

# Data Preparation
All frames are set to a uniform size of 1024x1024. We pad each input frame to a square and resize it as follows:
```
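import cv2
import numpy as np

def fill_frame(frame):
    """Zero-pad a frame to a square, then resize it to 1024x1024."""
    h, w = frame.shape[0], frame.shape[1]
    if h > w:
        # Portrait input: pad the left and right sides.
        _pad = np.zeros([h, int((h - w) / 2), 3])
        frame = np.concatenate([_pad, frame, _pad], axis=1)
    elif h < w:
        # Landscape input: pad the top and bottom.
        _pad = np.zeros([int((w - h) / 2), w, 3])
        frame = np.concatenate([_pad, frame, _pad], axis=0)
        # Swap the height and width axes (assumed to apply to landscape input only).
        frame = np.transpose(frame, [1, 0, 2])
    frame = cv2.resize(frame, (1024, 1024))
    return frame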
```
The person should not occupy too small a portion of the image. If you cannot guarantee this during recording, crop the image before filling the frame. We crop the frame as follows:
```
def crop_frame(frame, bounding_box):
    # bounding_box['y_min'], bounding_box['y_max'], bounding_box['x_min'] and
    # bounding_box['x_max'] are the top, bottom, left and right positions of
    # the person over the whole video.
    h, w = frame.shape[0], frame.shape[1]
    m_pixel = 30  # margin in pixels added on every side

    # Use local variables so the caller's bounding_box does not grow on every call.
    y_min = max(bounding_box['y_min'] - m_pixel, 0)
    y_max = min(bounding_box['y_max'] + m_pixel, h)
    x_min = max(bounding_box['x_min'] - m_pixel, 0)
    x_max = min(bounding_box['x_max'] + m_pixel, w)
    frame = frame[y_min:y_max, x_min:x_max]

    return frame
```
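
Since `bounding_box` describes the person's extent over the whole video, one way to obtain it is to merge per-frame boxes. The sketch below is an illustration only: `union_bounding_box` is a hypothetical helper, and it assumes you already have one box per frame (for example from a detector or from the masks) using the same four keys as `crop_frame`.

```python
def union_bounding_box(per_frame_boxes):
    """Merge per-frame boxes into one box that covers the person in every frame."""
    boxes = list(per_frame_boxes)
    if not boxes:
        raise ValueError("need at least one bounding box")
    return {
        "y_min": min(box["y_min"] for box in boxes),
        "y_max": max(box["y_max"] for box in boxes),
        "x_min": min(box["x_min"] for box in boxes),
        "x_max": max(box["x_max"] for box in boxes),
    }


# Two frames in which the person moves to the right and slightly down.
boxes = [
    {"y_min": 100, "y_max": 900, "x_min": 200, "x_max": 500},
    {"y_min": 120, "y_max": 950, "x_min": 300, "x_max": 640},
]
video_box = union_bounding_box(boxes)
```

Using a single box for all frames keeps the crop window fixed, so the person does not appear to jump between frames.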

Then you need to run [Openpose](https://github.com/CMU-Perceptual-Computing-Lab/openpose), [PifuHD](https://github.com/facebookresearch/pifuhd) and [MODNet](https://github.com/ZHKKKe/MODNet) to generate the 2D joints, normal maps and masks used to train our model.
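
To inspect the 2D joints before training, the OpenPose JSON output can be read with the standard library. This is a minimal sketch, assuming the files were written with OpenPose's `--write_json` option and use the `people` / `pose_keypoints_2d` keys of recent OpenPose releases; `load_pose_keypoints` is a hypothetical helper, and the loaders in this repository may read the files differently.

```python
import json


def load_pose_keypoints(json_text, person_index=0):
    """Return a list of (x, y, confidence) tuples for one detected person."""
    data = json.loads(json_text)
    people = data.get("people", [])
    if person_index >= len(people):
        return []  # nobody detected in this frame
    flat = people[person_index]["pose_keypoints_2d"]
    # OpenPose stores keypoints as a flat list: x0, y0, c0, x1, y1, c1, ...
    return [tuple(flat[i:i + 3]) for i in range(0, len(flat), 3)]


# A shortened example with two keypoints instead of a full skeleton.
sample = '{"people": [{"pose_keypoints_2d": [512.0, 200.0, 0.9, 520.0, 300.0, 0.8]}]}'
keypoints = load_pose_keypoints(sample)
```

Frames where the returned list is empty, or where most confidences are near zero, are worth checking by eye before they are used for pose initialization.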
The generated data should be organized as follows:
```
--data_dir
----frames_mat
------subject_name
----2d_joints
------subject_name
--------json
----mask_mat
------subject_name
----normal
------subject_name
```
We provide sample data at this [link](https://drive.google.com/file/d/1CY2ABZKFdLYFV64E_KFXW87rNhkYDRVT/view?usp=sharing).
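
Before running the pipeline, it can help to confirm that the folders match the layout above. The following sketch only checks that the directories exist; `missing_dirs` and `EXPECTED_SUBDIRS` are hypothetical names, and the check says nothing about the files inside each folder.

```python
from pathlib import Path

# Top-level folders from the layout above; each holds one folder per subject.
EXPECTED_SUBDIRS = ("frames_mat", "2d_joints", "mask_mat", "normal")


def missing_dirs(data_dir, subject_name):
    """Return the expected directories that do not exist under data_dir."""
    root = Path(data_dir)
    expected = [root / sub / subject_name for sub in EXPECTED_SUBDIRS]
    # The 2D joints live one level deeper, in a json folder.
    expected.append(root / "2d_joints" / subject_name / "json")
    return [path for path in expected if not path.is_dir()]
```

An empty result from `missing_dirs(data_dir, subject_name)` means every expected folder is present.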

# Usage
First, run the pose initialization:
```
cd thirdparty/octopus
python _infer_single.py --root_dir $data_dir --name $subject_name
```
Then, generate the initial geometry by running:
```
python dynamic_offsets_runner.py --root_dir $data_dir --name $subject_name --device_id $device_id
```
Finally, generate the texture map by running:
```
python texture_generation.py --root_dir $data_dir --name $subject_name --device_id $device_id
```
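
The three stages can also be chained from one script. This is a sketch under several assumptions that the text above does not confirm: the script is started from the repository root, the two conda environments are named `octopus` and `Avatar` as in the installation section, and `conda run` is available in your conda version. `build_pipeline` and `run_pipeline` are hypothetical helpers.

```python
import subprocess


def build_pipeline(data_dir, subject_name, device_id=0,
                   octopus_env="octopus", avatar_env="Avatar"):
    """Return (working_dir, argv) pairs for the three stages, in order."""
    common = ["--root_dir", str(data_dir), "--name", subject_name]
    device = ["--device_id", str(device_id)]

    def in_env(env, script):
        # Run the script with the Python interpreter of the given conda env.
        return ["conda", "run", "-n", env, "python", script]

    return [
        # Pose initialization runs inside the octopus folder and environment.
        ("thirdparty/octopus", in_env(octopus_env, "_infer_single.py") + common),
        (".", in_env(avatar_env, "dynamic_offsets_runner.py") + common + device),
        (".", in_env(avatar_env, "texture_generation.py") + common + device),
    ]


def run_pipeline(steps):
    """Run each stage in turn and stop at the first failure."""
    for working_dir, argv in steps:
        subprocess.run(argv, cwd=working_dir, check=True)
```

For example, `run_pipeline(build_pipeline("/abs/path/to/data", "subject_name"))` runs all three stages. Pass an absolute `data_dir`, because the first stage runs from a different working directory than the other two.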

# Citation
If you find our work useful in your research, please consider citing:
```
@inproceedings{zhao2022avatar,
author = {Hao Zhao and Jinsong Zhang and Yu-Kun Lai and Zerong Zheng and Yingdi Xie and Yebin Liu and Kun Li},
title = {High-Fidelity Human Avatars from a Single RGB Camera},
booktitle = {CVPR},
year={2022},
}
```

# Acknowledgement
We borrow some code from [NeuralTexture](https://github.com/SSRSGJYD/NeuralTexture) and [LWG](https://github.com/svip-lab/impersonator). Thanks for their great contributions.