https://github.com/compvis/behavior-driven-video-synthesis
- Host: GitHub
- URL: https://github.com/compvis/behavior-driven-video-synthesis
- Owner: CompVis
- License: apache-2.0
- Created: 2021-03-04T07:16:03.000Z (about 5 years ago)
- Default Branch: main
- Last Pushed: 2022-12-15T15:52:28.000Z (over 3 years ago)
- Last Synced: 2025-03-21T12:07:03.158Z (about 1 year ago)
- Language: Python
- Size: 67.6 MB
- Stars: 27
- Watchers: 8
- Forks: 8
- Open Issues: 1
Metadata Files:
- Readme: README.md
- License: LICENSE
README
# Behavior-Driven Synthesis of Human Dynamics
Official PyTorch implementation of Behavior-Driven Synthesis of Human Dynamics.
## [Arxiv](https://arxiv.org/abs/2103.04677) | [Project Page](https://compvis.github.io/behavior-driven-video-synthesis/) | [BibTeX](#bibtex)
[Andreas Blattmann](https://www.linkedin.com/in/andreas-blattmann-479038186/?originalSubdomain=de)\*,
[Timo Milbich](https://timomilbich.github.io/)\*,
[Michael Dorkenwald](https://mdork.github.io/)\*,
[Björn Ommer](https://hci.iwr.uni-heidelberg.de/Staff/bommer),
[CVPR 2021](http://cvpr2021.thecvf.com/)
\* equal contribution

**TL;DR:** Our approach for human behavior transfer: given a source sequence of human dynamics, our model infers a behavior encoding which is independent of posture. We can re-enact the behavior by combining it with an unrelated target posture and thus control the synthesis process. The resulting sequence is combined with an appearance to synthesize a video sequence.

## Requirements
After cloning the repository, a suitable [conda](https://conda.io/) environment named `behavior_driven_synthesis` can be created
and activated with:
```
$ cd behavior-driven-video-synthesis
$ conda env create -f environment.yaml
$ conda activate behavior_driven_synthesis
```
## Data preparation
### Human3.6m
The [Human3.6m-Dataset](http://vision.imar.ro/human3.6m/description.php) is the main dataset for evaluating the capabilities of our model. Prior to downloading the data, you'll have to [create an account](https://vision.imar.ro/human3.6m/main_login.php). As soon as this is done, download the `.tar.gz`-archives containing the videos and 3d pose annotations for each subject. You don't need to download all 3d pose annotations, only the ones named `D3_Positions_mono_universal`.
The root directory of the downloaded data will hereafter be referred to as ``. In this directory, create a folder `archives`, save all the downloaded archives therein, and execute the extraction and processing scripts from the root of this directory:
```shell script
$ python -m data.extract_tars --datadir
# the script creates a subdirectory 'extracted' for the extracted archives
$ python -m data.process --datadir
```
After that, the `archives`- and `extracted`-directories can be deleted. The data-containing directory `` should then have the following structure:
```
$
├── processed
│   ├── all
│   │   ├── S1 #
│   │   │   ├── Directions-1 # -
│   │   │   │   ├── ImageSequence
│   │   │   │   │   ├── 54138969 #
│   │   │   │   │   │   ├── img_000001.jpg
│   │   │   │   │   │   ├── img_000002.jpg
│   │   │   │   │   │   ├── img_000003.jpg
│   │   │   │   │   │   ├── ...
│   │   │   │   │   ├── 55011271
│   │   │   │   │   ├── ...
│   │   │   ├── Directions-2
│   │   │   ├── ...
│   │   ├── S2
│   │   │   ├── ...
│   │   ├── ...
│   ├── ...
```
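As a quick sanity check after processing, you can count the extracted frames per camera directory. The snippet below is a minimal sketch: the `processed/all/<subject>/<action>/ImageSequence/<camera>` layout is taken from the tree above, while the root path is a placeholder you need to replace.
```python
# Sanity-check sketch; the directory layout is taken from the tree above,
# the root path below is a hypothetical placeholder.
from pathlib import Path

datadir = Path("/path/to/human36m")  # replace with your data root
for cam_dir in sorted((datadir / "processed" / "all").glob("S*/*/ImageSequence/*")):
    n_frames = len(list(cam_dir.glob("img_*.jpg")))
    status = "OK" if n_frames > 0 else "WARNING: empty"
    print(f"{cam_dir.relative_to(datadir)}: {n_frames} frames ({status})")
```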
After that, download and extract the [preprocessed annotation file](https://heibox.uni-heidelberg.de/f/733220993b1449dc99db/) to ``.
### DeepFashion and Market
Download the archives `deepfashion.tar.gz` and `market.tar.gz` from [here](https://heibox.uni-heidelberg.de/d/71842715a8/?p=%2Fvunet&mode=list) and unpack the datasets into two distinct directories (they will later be referred to as `` and ``).
## Training
### Behavior net
To train our final behavior model from scratch, you have to adjust the sub-field `data: datapath` in the accompanying file `config/behavior_net.yaml` such that it contains the path to ``; otherwise, the data will not be found. Apart from this, you can change the name of your run by adapting the field `general: project_name`. Lastly, all logs, configs and checkpoints will be stored in the path specified in `general: base_dir`, which is by default the root of the cloned repository. We recommend using the same `base_dir` for all behavior models.
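If you prefer to script these adjustments, the following is a minimal sketch using PyYAML; it assumes the YAML nesting matches the `section: field` notation used here, and all paths and names are placeholders.
```python
# Hedged sketch: assumes config/behavior_net.yaml nests its fields as
# data.datapath, general.project_name and general.base_dir.
import yaml

cfg_path = "config/behavior_net.yaml"
with open(cfg_path) as f:
    cfg = yaml.safe_load(f)

cfg["data"]["datapath"] = "/path/to/human36m"        # placeholder for the Human3.6m root
cfg["general"]["project_name"] = "my_behavior_run"   # optional: rename the run
cfg["general"]["base_dir"] = "."                     # default: root of the cloned repo

with open(cfg_path, "w") as f:
    yaml.safe_dump(cfg, f)
```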
After adjusting the configuration file, you can start a training run via
```shell script
$ python main.py --config config/behavior_net.yaml --gpu
```
This will train our presented cVAE model in a first stage, prior to optimizing the parameters of the proposed normalizing flow model.
If you intend to use a pretrained cVAE model and train additional normalizing flow models, you can simply set the field `general: project_name` to the `project_name` of the pretrained cVAE and enable flow training via
```shell script
$ python main.py --config config/behavior_net.yaml --gpu --flow
```
To resume a cVAE model from the latest checkpoint, again specify the `project_name` of the run to restart and use
```shell script
$ python main.py --config config/behavior_net.yaml --gpu --restart
```
### Shape-and-posture net
Depending on the dataset you want to use for training, some fields of the configuration file `config/shape_and_pose_net.yaml` have to be adjusted according to the following table:
| Field Name | Human3.6m | DeepFashion | Market1501 |
| ------------- | ------------- |------------- | ------------- |
| `data: dataset` | `Human3.6m` | `DeepFashion` | `Market` |
| `data: datapath` | `` | `` | `` |
| `data: inplane_normalize` | `False` | `True` | `True` |
| `data: spatial_size` | `256` | `256` | `128` |
| `data: bottleneck_factor` | `2` | `2` | `1` |
| `data: box` | `2` | `2` | `1` |
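A whole column of this table can be applied with the same kind of PyYAML sketch as above; for example, the Market1501 settings (field nesting again assumed from the `data: field` notation, the dataset path is a placeholder):
```python
import yaml

# Values copied from the Market1501 column of the table above.
market_settings = {
    "dataset": "Market",
    "datapath": "/path/to/market",  # placeholder for the unpacked Market dataset
    "inplane_normalize": True,
    "spatial_size": 128,
    "bottleneck_factor": 1,
    "box": 1,
}

cfg_path = "config/shape_and_pose_net.yaml"
with open(cfg_path) as f:
    cfg = yaml.safe_load(f)
cfg["data"].update(market_settings)
with open(cfg_path, "w") as f:
    yaml.safe_dump(cfg, f)
```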
After that, training can be started via
```shell script
$ python main.py --config config/shape_and_pose_net.yaml --gpu
```
Similar to the behavior model, a training run can be resumed by changing the value of `general: project_name` to the name of this run and then using
```shell script
$ python main.py --config config/shape_and_pose_net.yaml --gpu --restart
```
## Pretrained models and evaluation
The weights of all our pretrained final models can be downloaded from [this link](https://heibox.uni-heidelberg.de/d/7f34bca58c094d5595de/). Save the checkpoints together with the respective hyperparameters (contained in the files `config.yaml`) for each model in its own directory. Evaluation can then be started via the command
```shell
$ python main.py --pretrained_model --gpu --config
```
where `` is `config/behavior_net.yaml` for the pretrained behavior model and `config/shape_and_pose_net.yaml` for one of the pretrained shape-and-posture models.
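Before launching evaluation, it can help to verify the per-model directory layout described above. A minimal sketch follows; only `config.yaml` is named in this README, so the checkpoint file pattern below is an assumption.
```python
# Hedged layout check: config.yaml is documented above, the checkpoint
# extensions (*.pt / *.ckpt) are assumptions.
from pathlib import Path

model_dir = Path("/path/to/pretrained/behavior_model")  # placeholder
assert (model_dir / "config.yaml").exists(), "missing hyperparameter file config.yaml"
ckpts = sorted(model_dir.glob("*.pt")) + sorted(model_dir.glob("*.ckpt"))
print("found checkpoints:", [p.name for p in ckpts] or "none")
```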
To evaluate a model which was trained from scratch, simply set the field `project_name` in the respective `` to the name of the model to be evaluated (as when resuming training) and start evaluation via
```shell
$ python main.py --gpu --config --mode infer
```
where `` is again `config/behavior_net.yaml` for a behavior model and `config/shape_and_pose_net.yaml` for a shape-and-posture model.
## BibTeX
```
@misc{blattmann2021behaviordriven,
  title={Behavior-Driven Synthesis of Human Dynamics},
  author={Andreas Blattmann and Timo Milbich and Michael Dorkenwald and Björn Ommer},
  year={2021},
  eprint={2103.04677},
  archivePrefix={arXiv},
  primaryClass={cs.CV}
}
```