https://github.com/hzhao1997/HF-Avatar

Host: GitHub
URL: https://github.com/hzhao1997/HF-Avatar
Owner: hzhao1997
License: other
Created: 2022-03-08T01:29:00.000Z (almost 3 years ago)
Default Branch: main
Last Pushed: 2023-01-10T08:01:34.000Z (almost 2 years ago)
Last Synced: 2024-08-04T22:15:23.082Z (5 months ago)
Language: Python
Size: 2.81 MB
Stars: 123
Watchers: 16
Forks: 13
Open Issues: 17
Metadata Files:
- Readme: README.md
- License: LICENSE

# High-Fidelity Human Avatars from a Single RGB Camera

### [Project Page](http://cic.tju.edu.cn/faculty/likun/projects/HF-Avatar/)  | [Paper](http://cic.tju.edu.cn/faculty/likun/projects/HF-Avatar/assets/main.pdf) | [Supp](http://cic.tju.edu.cn/faculty/likun/projects/HF-Avatar/assets/supp.pdf)

# News 

* A pose initialization problem in the previous version caused poor texture quality. The code has been updated, and this problem should now be fixed.

# Installation

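```
conda create -n Avatar python==3.6.8
conda activate Avatar

# Install PyTorch built against CUDA 11.0 ...
conda install pytorch==1.7.0 torchvision==0.8.0 cudatoolkit=11.0 -c pytorch
# ... or, alternatively, against CUDA 10.2:
# conda install pytorch==1.6.0 torchvision==0.7.0 cudatoolkit=10.2 -c pytorch

pip install -r requirements.txt

# Build PyTorch3D v0.4.0 from source (the archive unpacks to pytorch3d-0.4.0)
wget https://github.com/facebookresearch/pytorch3d/archive/refs/tags/v0.4.0.zip
unzip v0.4.0.zip
cd pytorch3d-0.4.0
pip install -e .
cd ..

# Build the bundled neural renderer
cd thirdparty/neural_renderer_pytorch
python setup.py install
```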

Please make sure your gcc version is newer than 7.5.
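
The compiler requirement can be checked before building the extensions. The following is a minimal sketch, not part of the repository: the helper names are hypothetical, and it reads "newer than 7.5" as a strict comparison on the major and minor version numbers, which is an assumption.

```python
import re
import subprocess

MIN_GCC = (7, 5)  # threshold taken from the note above


def parse_version(text):
    """Extract the first version number (e.g. '9.4.0') as a tuple of ints."""
    match = re.search(r"\d+(?:\.\d+)*", text)
    if match is None:
        raise ValueError("no version number found in %r" % text)
    return tuple(int(part) for part in match.group(0).split("."))


def gcc_is_new_enough(version, minimum=MIN_GCC):
    """Return True if `version` is strictly newer than `minimum`.

    Only as many components as `minimum` has are compared, so 7.5.0
    is treated as equal to 7.5 (an assumption about the requirement).
    """
    return version[:len(minimum)] > minimum


def installed_gcc_version():
    """Ask the gcc on PATH for its version (raises if gcc is missing)."""
    result = subprocess.run(
        ["gcc", "-dumpfullversion", "-dumpversion"],
        stdout=subprocess.PIPE,
        universal_newlines=True,  # Python 3.6-compatible text mode
        check=True,
    )
    return parse_version(result.stdout)
```

Calling `gcc_is_new_enough(installed_gcc_version())` inside the `Avatar` environment then reports whether the gcc found on `PATH` meets the requirement.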

Download the asset files from [here](https://drive.google.com/file/d/1QV41Q9uBE_Xp2r34MvSABjBmq6c26c0U/view?usp=share_link), unzip them, and move them to the `assets` folder.

Download the pre-trained model from [here](https://drive.google.com/file/d/1ykPUjFqgTjgGmMpOHD17RFMYxU0ZX4E8/view?usp=share_link), unzip it, and move the contents to the `checkpoints` folder.

In addition, we adopt the pose initialization of [octopus](https://github.com/thmoa/octopus). Because octopus uses a different deep learning framework from ours, you need to create a separate conda environment for it.

```
conda create -n octopus python==2.7
conda activate octopus
conda install tensorflow-gpu=1.13.1 keras=2.2.4 cudatoolkit=10.0
```

The octopus environment requires [dirt](https://github.com/pmh47/dirt). We recommend installing it as follows:

```
cd dirt
mkdir build ; cd build
cmake ../csrc
make
cd ..
pip install -e .
```

You also need to adjust the [arch](https://github.com/pmh47/dirt/blob/95f58504c1ccf70b0d0502de81821842bc19ffd2/csrc/CMakeLists.txt#L41) parameter to match your graphics card; otherwise compilation may fail.
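
To pick a value, it helps to know the compute capability of your GPU. The sketch below assumes the parameter takes the usual nvcc `sm_XY` form (check the linked line to confirm). The helper names are hypothetical, and `detect_capability` only works in an environment where PyTorch is installed, such as `Avatar`.

```python
def nvcc_arch_name(major, minor):
    """Format a CUDA compute capability as an nvcc-style name, e.g. sm_75."""
    if major < 1 or not 0 <= minor <= 9:
        raise ValueError("unexpected compute capability %r.%r" % (major, minor))
    return "sm_%d%d" % (major, minor)


def detect_capability(device_index=0):
    """Query the compute capability of a GPU through PyTorch.

    The import is local so that the formatting helper above can be
    used in environments without PyTorch.
    """
    import torch
    return torch.cuda.get_device_capability(device_index)
```

For example, a GPU with compute capability 7.5 gives `nvcc_arch_name(7, 5) == "sm_75"`.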

# Data Preparation

All frames are resized to 1024x1024. We pad each input frame to a square with zeros before resizing:

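```
import cv2
import numpy as np


def fill_frame(frame):
    """Zero-pad a frame to a (near-)square shape, then resize it to 1024x1024."""
    h, w = frame.shape[0], frame.shape[1]
    if h > w:
        # Portrait: pad equally on the left and right (keeping the input dtype).
        _pad = np.zeros([h, int((h - w) / 2), 3], dtype=frame.dtype)
        frame = np.concatenate([_pad, frame, _pad], axis=1)
    elif h < w:
        # Landscape: pad equally on the top and bottom.
        _pad = np.zeros([int((w - h) / 2), w, 3], dtype=frame.dtype)
        frame = np.concatenate([_pad, frame, _pad], axis=0)
        # Swap the row and column axes of the padded landscape frame.
        frame = np.transpose(frame, [1, 0, 2])
    frame = cv2.resize(frame, (1024, 1024))
    return frame
```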
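
Note that `int((h - w) / 2)` is applied to both sides, so when the height and width differ by an odd number of pixels the padded frame is one pixel short of square, and the final resize absorbs the difference. If exact squareness matters, the padding can be split unevenly. The helper below is a hypothetical sketch of that arithmetic only and is not part of the repository.

```python
def square_padding(h, w):
    """Return (top, bottom, left, right) padding that makes an h x w frame square.

    When the difference is odd, the extra pixel goes to the bottom or right.
    """
    diff = abs(h - w)
    before, after = diff // 2, diff - diff // 2
    if h > w:
        # Portrait: widen the frame.
        return 0, 0, before, after
    # Landscape (or already square, in which case everything is 0).
    return before, after, 0, 0
```

The result can be passed to NumPy as `np.pad(frame, ((top, bottom), (left, right), (0, 0)))`, which pads with zeros by default.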

The person should not occupy too small a portion of the image. If you cannot guarantee this during recording, crop the image before filling the frame. We crop the frame by:

```
def crop_frame(frame, bounding_box):
    """Crop a frame to the person's bounding box plus a fixed margin.

    bounding_box['y_min'], bounding_box['y_max'], bounding_box['x_min'] and
    bounding_box['x_max'] are the top, bottom, left and right positions of
    the person over the whole video.
    """
    h, w = frame.shape[0], frame.shape[1]
    m_pixel = 30  # margin in pixels added on every side
    # Use local variables so the caller's bounding_box is not modified
    # (and does not keep growing) when it is reused across frames.
    y_min = max(bounding_box['y_min'] - m_pixel, 0)
    y_max = min(bounding_box['y_max'] + m_pixel, h)
    x_min = max(bounding_box['x_min'] - m_pixel, 0)
    x_max = min(bounding_box['x_max'] + m_pixel, w)
    frame = frame[y_min:y_max, x_min:x_max]
    return frame
```
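
The bounding box passed to `crop_frame` covers the person over the whole video. One way to obtain it, assuming you already have a box per frame (for example from a person detector or from the masks), is to take the union of the per-frame boxes. The function name and input format below are assumptions chosen to match the dictionary keys used above.

```python
def union_bounding_box(boxes):
    """Merge per-frame boxes into one box that covers the person in every frame.

    `boxes` is an iterable of dicts with 'y_min', 'y_max', 'x_min', 'x_max'.
    """
    boxes = list(boxes)
    if not boxes:
        raise ValueError("at least one bounding box is required")
    return {
        'y_min': min(b['y_min'] for b in boxes),
        'y_max': max(b['y_max'] for b in boxes),
        'x_min': min(b['x_min'] for b in boxes),
        'x_max': max(b['x_max'] for b in boxes),
    }
```

Using a single box for all frames keeps the crop window fixed, so the person does not appear to jump between frames.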

Then run [OpenPose](https://github.com/CMU-Perceptual-Computing-Lab/openpose), [PIFuHD](https://github.com/facebookresearch/pifuhd) and [MODNet](https://github.com/ZHKKKe/MODNet) to generate the 2D joints, normal maps and masks, respectively, which are needed to train our model.
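
To sanity-check the 2D joints before training, you can inspect the per-frame JSON files that OpenPose writes. The sketch below assumes OpenPose's standard JSON output, where each person has a flat `pose_keypoints_2d` list of `x, y, confidence` triples. The function name is hypothetical, and this is not necessarily how the training code reads the files.

```python
import json


def load_pose_keypoints(json_path):
    """Return [(x, y, confidence), ...] for the first person in an OpenPose file.

    Returns an empty list when no person was detected in the frame.
    """
    with open(json_path) as f:
        data = json.load(f)
    people = data.get("people", [])
    if not people:
        return []
    flat = people[0]["pose_keypoints_2d"]
    # Regroup the flat list into one triple per joint.
    return [tuple(flat[i:i + 3]) for i in range(0, len(flat), 3)]
```

Frames that return an empty list or mostly zero confidences are worth checking, since pose initialization depends on these detections.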

Organize the generated data as follows:

```
--data_dir
----frames_mat
------subject_name
----2d_joints
------subject_name
--------json
----mask_mat
------subject_name
----normal
------subject_name
```
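
Before running the pipeline, it can be useful to confirm that a subject's folders match this layout. The following is a small sketch using only the folder names shown above; the helper names are hypothetical.

```python
from pathlib import Path


def expected_dirs(data_dir, subject_name):
    """List the directories the layout above implies for one subject."""
    root = Path(data_dir)
    return [
        root / "frames_mat" / subject_name,
        root / "2d_joints" / subject_name / "json",
        root / "mask_mat" / subject_name,
        root / "normal" / subject_name,
    ]


def missing_dirs(data_dir, subject_name):
    """Return the expected directories that do not exist yet."""
    return [p for p in expected_dirs(data_dir, subject_name) if not p.is_dir()]
```

An empty result from `missing_dirs(data_dir, subject_name)` means all four folders are in place. It does not check the files inside them.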

We provide sample data at this [link](https://drive.google.com/file/d/1CY2ABZKFdLYFV64E_KFXW87rNhkYDRVT/view?usp=sharing).

# Usage

First, run the pose initialization (in the `octopus` environment):

```
cd thirdparty/octopus
python _infer_single.py --root_dir $data_dir --name $subject_name
```

Then, generate the initial geometry:

```
python dynamic_offsets_runner.py --root_dir $data_dir --name $subject_name --device_id $device_id
```

Finally, generate the texture map:

```
python texture_generation.py --root_dir $data_dir --name $subject_name --device_id $device_id
```
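
The three steps can also be chained from one script. The sketch below is a hypothetical driver, not part of the repository. It assumes it is started from the repository root, that `dynamic_offsets_runner.py` and `texture_generation.py` live there, that the conda environments are named `octopus` and `Avatar` as in the installation section, and that `conda run -n <env>` is available to switch between them.

```python
import subprocess


def build_commands(data_dir, subject_name, device_id):
    """Return (conda_env, working_dir, argv) for each pipeline step, in order."""
    common = ["--root_dir", str(data_dir), "--name", subject_name]
    device = ["--device_id", str(device_id)]
    return [
        # Step 1: pose initialization with octopus.
        ("octopus", "thirdparty/octopus", ["python", "_infer_single.py"] + common),
        # Step 2: initial geometry.
        ("Avatar", ".", ["python", "dynamic_offsets_runner.py"] + common + device),
        # Step 3: texture map.
        ("Avatar", ".", ["python", "texture_generation.py"] + common + device),
    ]


def run_pipeline(data_dir, subject_name, device_id=0):
    """Run every step in its environment, stopping at the first failure."""
    for env, cwd, argv in build_commands(data_dir, subject_name, device_id):
        subprocess.run(["conda", "run", "-n", env] + argv, cwd=cwd, check=True)
```

Because the first step runs inside `thirdparty/octopus`, pass an absolute path as `data_dir` so that all three steps resolve it to the same location.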

# Citation

If you find our work useful in your research, please consider citing:

```
@inproceedings{zhao2022avatar,
  author = {Hao Zhao and Jinsong Zhang and Yu-Kun Lai and Zerong Zheng and Yingdi Xie and Yebin Liu and Kun Li},
  title = {High-Fidelity Human Avatars from a Single RGB Camera},
  booktitle = {CVPR},
  year = {2022},
}
```

# Acknowledgement

We borrow some code from [NeuralTexture](https://github.com/SSRSGJYD/NeuralTexture) and [LWG](https://github.com/svip-lab/impersonator). Thanks for their great contributions.