# TransMoMo: Invariance-Driven Unsupervised Video Motion Retargeting
![Python](https://img.shields.io/badge/Python->=3.6-Blue?logo=python) ![Pytorch](https://img.shields.io/badge/PyTorch->=1.1.0-Orange?logo=pytorch)
### [Project Page](https://yzhq97.github.io/transmomo/) | [YouTube](https://youtu.be/akbRtnRMkMk) | [Paper](https://arxiv.org/pdf/2003.14401.pdf)
This is the official PyTorch implementation of the CVPR 2020 paper "TransMoMo: Invariance-Driven Unsupervised Video Motion Retargeting".
## Environment
```
conda install pytorch torchvision cudatoolkit=
conda install pyyaml scikit-image scikit-learn opencv
pip install -r requirements.txt
```
## Data
### Mixamo
[Mixamo](https://www.mixamo.com/) is a synthesized 3D character animation dataset.
1. Download the Mixamo data [here](https://drive.google.com/file/d/1lMa-4Bspn2_XV4wqo_s9Bfa35-19UAkB/view?usp=sharing).
2. Extract under `data/mixamo`.

For directions on downloading 3D Mixamo data, please refer to [this link](https://github.com/ChrisWu1997/2D-Motion-Retargeting/blob/master/dataset/Guide%20For%20Downloading%20Mixamo%20Data.md).
### SoloDance
SoloDance is a collection of dancing videos from YouTube. We use [DensePose](https://github.com/facebookresearch/DensePose) to extract skeleton sequences from these videos for training.
1. Download the extracted skeleton sequences [here](https://drive.google.com/file/d/1366FaH0W2VYVW26ZbQJUp1x5GgMyMXuo/view?usp=sharing).
2. Extract under `data/solo_dance`.

The original videos can be downloaded [here](https://drive.google.com/drive/folders/1hBj2uVJGABZz2aiqVYJpJ4SqBhYT-kYz?usp=sharing).
### Preprocessing
Run `sh scripts/preprocess.sh` to preprocess the two datasets above.
## Pretrained model
Download the pretrained models [here](https://drive.google.com/drive/folders/1xZ2Pw7ObrDUIH89ipH1diyFZJxeXNDd8?usp=sharing).
## Inference
1. For *Skeleton Extraction*, please consider using a pose estimation library such as [Detectron2](https://github.com/facebookresearch/detectron2). We require the input skeleton sequences to be in the format of a NumPy `.npy` file:
   - The file should contain an array with shape `15 x 2 x length`.
   - The first dimension (15) corresponds to the 15 body joints defined [here](https://github.com/yzhq97/transmomo.pytorch/blob/master/docs/keypoint_format.md).
   - The second dimension (2) corresponds to x and y coordinates.
   - The third dimension (length) is the temporal dimension.
2. For *Motion Retargeting Network*, we provide a sample command for inference:
   ```shell script
   # Replace the checkpoint, source, and target paths with your actual paths.
   python infer_pair.py \
     --config configs/transmomo.yaml \
     --checkpoint transmomo_mixamo_36_800_24/checkpoints/autoencoder_00200000.pt \
     --source a.npy \
     --target b.npy \
     --source_width 1280 --source_height 720 \
     --target_height 1920 --target_width 1080
   ```
3. For *Skeleton-to-Video Rendering*, please refer to [Everybody Dance Now](https://carolineec.github.io/everybody_dance_now/).
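
The input format required for *Skeleton Extraction* can be sketched in NumPy as follows. This is a minimal sketch and not part of this repository: it assumes a pose estimator has already produced per-frame keypoints as an array of shape `(length, 15, 2)`, with joints ordered as in the keypoint format document linked above. The function name `to_transmomo_layout`, the `float32` dtype, and the file name `a.npy` are hypothetical choices for illustration.

```python
import os
import tempfile

import numpy as np

NUM_JOINTS = 15  # body joints per frame, as required by the input format
NUM_COORDS = 2   # x and y coordinates


def to_transmomo_layout(frames):
    """Reorder per-frame keypoints (length, 15, 2) into (15, 2, length).

    Hypothetical helper: the (length, 15, 2) input layout is an assumption
    about what a pose estimator returns, not something this repo defines.
    """
    frames = np.asarray(frames, dtype=np.float32)
    if frames.ndim != 3 or frames.shape[1:] != (NUM_JOINTS, NUM_COORDS):
        raise ValueError(f"expected shape (length, 15, 2), got {frames.shape}")
    # Move the time axis to the end: out[joint, coord, t] == frames[t, joint, coord]
    return np.transpose(frames, (1, 2, 0))


if __name__ == "__main__":
    # Placeholder data standing in for real pose estimates (64 frames).
    dummy_frames = np.random.rand(64, NUM_JOINTS, NUM_COORDS)
    skeleton = to_transmomo_layout(dummy_frames)

    # Save and reload to confirm the file holds a 15 x 2 x length array.
    path = os.path.join(tempfile.mkdtemp(), "a.npy")
    np.save(path, skeleton)
    print(np.load(path).shape)
```

If the pose estimator uses a different joint set or ordering, the joints need to be remapped to the 15-joint format before this reshaping step; the sketch above only changes the axis order.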
## Training
To train the *Motion Retargeting Network*, run
```shell script
python train.py --config configs/transmomo.yaml
```
To train on the SoloDance dataset, run
```shell script
python train.py --config configs/transmomo_solo_dance.yaml
```
## Testing
To test motion retargeting MSE, first generate the retargeted motions with
```shell script
# Use the config that was used for training, and replace the checkpoint
# and output directory with your actual paths.
python test.py \
  --config configs/transmomo.yaml \
  --checkpoint transmomo_mixamo_36_800_24/checkpoints/autoencoder_00200000.pt \
  --out_dir transmomo_mixamo_36_800_24_results
```
Then compute the MSE with
```shell script
# Point --in_dir at the output directory from the previous step.
python scripts/compute_mse.py \
  --in_dir transmomo_mixamo_36_800_24_results
```
## Project Structure
```
transmomo.pytorch
├── configs                 - configuration files
├── data                    - place for storing data
├── docs                    - documentation
├── lib
│   ├── data.py             - datasets and data loaders
│   ├── networks            - encoders, decoders, discriminators, etc.
│   ├── trainer.py          - training pipeline
│   ├── loss.py             - loss functions
│   ├── operation.py        - operations, e.g. rotation, projection, etc.
│   └── util                - utility functions
├── out                     - place for storing output
├── infer_pair.py           - perform motion retargeting
├── render_interpolate.py   - perform motion and body interpolation
├── scripts                 - scripts for data processing and experiments
├── test.py                 - test MSE
└── train.py                - main entry point for training
```
## TODOs
- [x] Detailed documentation
- [ ] Add example files
- [x] Release in-the-wild dancing video dataset (unannotated)
- [ ] Tool for visualizing Mixamo test error
- [ ] Tool for converting keypoint formats
## Citation
Z. Yang*, W. Zhu*, W. Wu*, C. Qian, Q. Zhou, B. Zhou, C. C. Loy. "TransMoMo: Invariance-Driven Unsupervised Video Motion Retargeting." IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2020. (* indicates equal contribution.)
BibTeX:
```bibtex
@inproceedings{transmomo2020,
title={TransMoMo: Invariance-Driven Unsupervised Video Motion Retargeting},
author={Yang, Zhuoqian and Zhu, Wentao and Wu, Wayne and Qian, Chen and Zhou, Qiang and Zhou, Bolei and Loy, Chen Change},
booktitle={The IEEE Conference on Computer Vision and Pattern Recognition (CVPR)},
year={2020}
}
```
## Acknowledgement
This repository is partly based on Rundi Wu's [Learning Character-Agnostic Motion for Motion Retargeting in 2D](https://motionretargeting2d.github.io/) and Xun Huang's [MUNIT: Multimodal UNsupervised Image-to-image Translation](https://github.com/NVlabs/MUNIT). The skeleton-to-video rendering part is based on [Everybody Dance Now](https://carolineec.github.io/everybody_dance_now/). We sincerely thank them for their inspiration and contribution to the community.