{"id":18317343,"url":"https://github.com/compvis/behavior-driven-video-synthesis","last_synced_at":"2025-04-05T21:32:21.764Z","repository":{"id":42199370,"uuid":"344384079","full_name":"CompVis/behavior-driven-video-synthesis","owner":"CompVis","description":null,"archived":false,"fork":false,"pushed_at":"2022-12-15T15:52:28.000Z","size":70927,"stargazers_count":27,"open_issues_count":1,"forks_count":8,"subscribers_count":8,"default_branch":"main","last_synced_at":"2025-03-21T12:07:03.158Z","etag":null,"topics":[],"latest_commit_sha":null,"homepage":null,"language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"apache-2.0","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/CompVis.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null}},"created_at":"2021-03-04T07:16:03.000Z","updated_at":"2024-07-11T00:48:15.000Z","dependencies_parsed_at":"2023-01-29T03:30:58.719Z","dependency_job_id":null,"html_url":"https://github.com/CompVis/behavior-driven-video-synthesis","commit_stats":null,"previous_names":[],"tags_count":0,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/CompVis%2Fbehavior-driven-video-synthesis","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/CompVis%2Fbehavior-driven-video-synthesis/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/CompVis%2Fbehavior-driven-video-synthesis/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/CompVis%2Fbehavior-driven-video-synthesis/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/CompVis","download_url":"https://codeload.github.com/CompVis/behavior-driven-video-synthesis/tar.gz/refs/heads/main","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":247406080,"owners_count":20933803,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":[],"created_at":"2024-11-05T18:05:50.382Z","updated_at":"2025-04-05T21:32:16.749Z","avatar_url":"https://github.com/CompVis.png","language":"Python","readme":"# Behavior-Driven Synthesis of Human Dynamics\nOfficial PyTorch implementation of Behavior-Driven Synthesis of Human Dynamics.\n## [Arxiv](https://arxiv.org/abs/2103.04677) | [Project Page](https://compvis.github.io/behavior-driven-video-synthesis/) | [BibTeX](#bibtex)\n\n[Andreas Blattmann](https://www.linkedin.com/in/andreas-blattmann-479038186/?originalSubdomain=de)\\*,\n[Timo Milbich](https://timomilbich.github.io/)\\*,\n[Michael Dorkenwald](https://mdork.github.io/)\\*,\n[Björn Ommer](https://hci.iwr.uni-heidelberg.de/Staff/bommer),\n[CVPR 2021](http://cvpr2021.thecvf.com/)\u003cbr/\u003e\n\\* equal contribution\n\n\n![teaser_vid](assets/transfer_example_reduced.gif)\n\n**TL;DR:** Our approach for human behavior transfer: Given a source sequence of human dynamics our model infers a behavior encoding which is independent of posture. We can re-enact the behavior by combining it with an unrelated target posture and thus control the synthesis process. The resulting sequence is combined with an appearance to synthesize a video sequence\n\n![teaser](assets/first-page.png \"Method pipeline\")\n\n\n## Requirements\nAfter cloning the repository, a suitable [conda](https://conda.io/) environment named `behavior_driven_synthesis` can be created\nand activated with:\n\n```\n$ cd behavior-driven-video-synthesis\n$ conda env create -f environment.yaml\n$ conda activate behavior_driven_synthesis\n```\n\n## Data preparation\n\n### Human3.6m\n\nThe [Human3.6m-Dataset](http://vision.imar.ro/human3.6m/description.php) is the main dataset for evaluating the capbilities of our model. Prior to downloading the data, you'll have to [create an account](https://vision.imar.ro/human3.6m/main_login.php). As soon as this is done, download the `.tar.gz`-archives containing the videos and 3d pose annotations for each subject. You don't need to download all 3d pose annotations but only the ones named `D3_Positions_mono_universal`.\n \nThe root directory of the downloaded data will hereafter be refered to as `\u003cDATADIR_H36M\u003e`. In this directory, create a folder `archives`, save all the downloaded archives therein and execute the extraction and processing scripts from the root of this directory \n```shell script\n$ python -m data.extract_tars --datadir \u003cDATADIR_H36M\u003e\n$ ... the script creates a subdirectory 'extracted', for the extracted archives...\n$ python -m data.process --datadir \u003cDATADIR_H36M\u003e\n``` \n\n \nAfter that, the `archives`- and `extracted`-directories can be deleted. The data-containing directory `\u003cDATADIR_H36M\u003e` should then have the following structure:\n```\n$\u003cDATADIR_H36M\u003e\n├── processed\n    ├── all    \n        ├── S1 # \u003csubject_id\u003e\n            ├── Directions-1 # \u003caction_id\u003e-\u003csubaction_id\u003e\n                ├── ImageSequence\n                    ├── 54138969 # \u003ccamera_id\u003e\n                        ├── img_000001.jpg\n                        ├── img_000002.jpg\n                        ├── img_000003.jpg\n                        ├── ... \n                    ├── 55011271 \n                        ├── ... \n            ├── Directions-2\n                ├── ... \n        ├── S2\n            ├── ...\n        ├── ... \n├── ...\n```\n\nThe after that, download and extract the [preprocessed annotion file](https://heibox.uni-heidelberg.de/f/733220993b1449dc99db/) to `\u003cDATADIR_H36M/processed/all\u003e`.   \n\n\n### DeepFashion and Market\n\nDownload the archives `deepfashion.tar.gz` and `market.tar.gz` from [here](https://heibox.uni-heidelberg.de/d/71842715a8/?p=%2Fvunet\u0026mode=list) and unpack the datasets in two distinct directories (they will later be refered to as `\u003cDATADIR_DEEPFASHION\u003e` and `\u003cDATADIR_MARKET\u003e`).\n\n## Training\n\n### Behavior net\n\nTo train our final behavior model from scratch, you have to adjust the sub-field `data: datapath` in the accompanying file `config/behavior_net.yaml` such that it contains the path to `\u003cDATADIR_H36M\u003e`. Otherwise, the data will not be found. Apart from this, you can change the name of your run by adapting the field `general: project_name`. Lastly, all the logs, configs and checkpoints will be stored in the path specified in `general: base_dir`, which is by default the root of the cloned repository. We recommend to use the same `base_dir` for all behavior models.  \n\n\nAfter adjusting the configuration file, you can start a training run via \n```shell script\n$ python main.py --config config/behavior_net.yaml --gpu \u003cgpu_id\u003e\n```\n\nThis will train our presented cVAE model in a first stage, prior to optimizing the parameters of the proposed normalizing flow model. \n\nIf intending to use a pretrained cVAE model and train additional normalizing flow models, you can simply set the field `general: project_name` to the `project_name` of the pretrained cVAE and enable flow training via\n```shell script\n$ python main.py --config config/behavior_net.yaml --gpu \u003cgpu_id\u003e --flow\n```\n\nTo resume a cVAE model from the latest checkpoin, again specify the `project_name` of the run to restart and use\n```shell script\n$ python main.py --config config/behavior_net.yaml --gpu \u003cgpu_id\u003e --restart\n```\n\n\n### Shape-and-posture net\n\nDepending on the dataset you want use for training, some fields of the configuration file `config/shape_and_pose_net.yaml` have to be adjusted according to the following table:\n\n| Field Name  | Human3.6m | DeepFashion |  Market1501 \n| ------------- | ------------- |-------------  | -------------  |\n| `data: dataset` | `Human3.6m` | `DeepFashion` | `Market` |\n| `data: datapath` | `\u003cDATADIR_H36M\u003e` | `\u003cDATADIR_DEEPFASHION\u003e` | `\u003cDATADIR_MARKET\u003e` |\n| `data: inplane_normalize` | `False` | `True` | `True` |\n| `data: spatial_size` | `256` | `256` | `128` |\n| `data: bottleneck_factor` | `2` | `2` | `1` |\n| `data: box` | `2` | `2` | `1` |\n\nAfter that, training can be started via \n```shell script\n$ python main.py --config config/shape_and_pose_net.yaml --gpu \u003cgpu_id\u003e\n```\n\nSimilar to the behavior model, a training run can be resumed by changing the value of `general: project_name` to the name of this run and then using \n ```shell script\n$ python main.py --config config/shape_and_pose_net.yaml --gpu \u003cgpu_id\u003e --restart\n```\n\n## Pretrained models and evaluation\n\nThe weights of all our pretrained final models can be downloaded from [this link](https://heibox.uni-heidelberg.de/d/7f34bca58c094d5595de/). Save the checkpoints together with the respective hyperparameters (which are contained in the files `config.yaml`) for each unique model in a unique directory. Evaluation can then be started via the command\n```shell\n$ python main.py --pretrained_model \u003cpath_to_model_directory\u003e --gpu \u003cgpu_id\u003e --config \u003cconfig_file\u003e\n```\nwhere `\u003cconfig_file\u003e` is `config/behavior_net.yaml` for the pretrained behavior model and `config/shape_and_pose_net.yaml` for on of the pretrained shape-and-posture models.\n\nTo evaluate a model which was trained from scratch, simply set the field `project_name` in the respective `\u003cconfig_file\u003e` to be the name of the model to be evaluated (similar to the procedure for resuming training) and start evaluation via \n```shell\n$ python main.py --gpu \u003cgpu_id\u003e --config \u003cconfig_file\u003e --mode infer\n```\nwhere `\u003cconfig_file\u003e` is again `config/behavior_net.yaml` for a behavior model and `config/shape_and_pose_net.yaml` for a shape-and-posture model.\n\n\n## BibTeX\n\n```\n@misc{blattmann2021behaviordriven,\n      title={Behavior-Driven Synthesis of Human Dynamics}, \n      author={Andreas Blattmann and Timo Milbich and Michael Dorkenwald and Björn Ommer},\n      year={2021},\n      eprint={2103.04677},\n      archivePrefix={arXiv},\n      primaryClass={cs.CV}\n}\n```\n","funding_links":[],"categories":[],"sub_categories":[],"project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fcompvis%2Fbehavior-driven-video-synthesis","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fcompvis%2Fbehavior-driven-video-synthesis","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fcompvis%2Fbehavior-driven-video-synthesis/lists"}