https://github.com/viiika/Diffusion-Conductor
Taming Diffusion Models for Music-driven Conducting Motion Generation
- Host: GitHub
- URL: https://github.com/viiika/Diffusion-Conductor
- Owner: viiika
- Created: 2023-04-18T09:10:16.000Z (over 1 year ago)
- Default Branch: main
- Last Pushed: 2024-05-09T04:26:13.000Z (8 months ago)
- Last Synced: 2024-08-03T04:06:11.992Z (5 months ago)
- Language: Python
- Homepage:
- Size: 27.4 MB
- Stars: 23
- Watchers: 3
- Forks: 1
- Open Issues: 0
Metadata Files:
- Readme: README.md
# Taming Diffusion Models for Music-driven Conducting Motion Generation
Accepted by AAAI 2023 Summer Symposium, with **Best Paper Award**.
## Overview
![](./assets/images/visualization.png)
- Generated conducting motion according to the given music -- Tchaikovsky Piano Concerto No.1:
https://github.com/viiika/Diffusion-Conductor/assets/40078051/d993df28-29a0-4520-a429-19fb2cc0a546
### Features
- Objective: We present **Diffusion-Conductor**, a novel DDIM-based approach for music-driven conducting motion generation.
- Contributions:
  - First work to use a diffusion model for music-driven conducting motion generation.
  - Changes the supervision signal from `ε` to `x0` to achieve better performance, which may inspire later research in the motion generation field.
- Benchmark Performance: Outperforms state-of-the-art methods on all four metrics: MSE, FGD, BC, and Diversity.
## News
- 18/07/2023: Our paper won the Best Paper Award at the AAAI 2023 Inaugural Summer Symposium!
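
The change of supervision signal from `ε` to `x0` listed under Features can be illustrated with a minimal NumPy sketch. This is not the repository's training code: the linear noise schedule, the step count, and every function name below are assumptions chosen for illustration. It only shows that the same noised sample `x_t` can be paired with either the noise `ε` or the clean motion `x0` as the regression target.

```python
import numpy as np


def make_alpha_bar(num_steps=1000, beta_start=1e-4, beta_end=2e-2):
    """Cumulative product of (1 - beta_t) for an assumed linear beta schedule."""
    betas = np.linspace(beta_start, beta_end, num_steps)
    return np.cumprod(1.0 - betas)


def q_sample(x0, t, alpha_bar, rng):
    """Forward diffusion: x_t = sqrt(a_bar_t) * x0 + sqrt(1 - a_bar_t) * eps."""
    eps = rng.standard_normal(x0.shape)
    x_t = np.sqrt(alpha_bar[t]) * x0 + np.sqrt(1.0 - alpha_bar[t]) * eps
    return x_t, eps


def training_target(x0, eps, predict="x0"):
    """Pick the regression target: the clean sample (x0) or the noise (eps)."""
    return x0 if predict == "x0" else eps


def x0_from_eps(x_t, eps, t, alpha_bar):
    """Invert the forward process: recover x0 from x_t and the noise eps."""
    return (x_t - np.sqrt(1.0 - alpha_bar[t]) * eps) / np.sqrt(alpha_bar[t])


rng = np.random.default_rng(0)
alpha_bar = make_alpha_bar()
x0 = rng.standard_normal((1800, 13, 2))  # stand-in for one motion clip
x_t, eps = q_sample(x0, 500, alpha_bar, rng)

target_x0 = training_target(x0, eps, predict="x0")    # x0 supervision
target_eps = training_target(x0, eps, predict="eps")  # eps supervision
```

The two targets carry the same information, since `x0_from_eps` converts one into the other. With `x0` supervision the network output is itself a motion sequence, so the loss is measured directly in pose space.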
## Getting Started
### Installation
Please refer to [install.md](/Diffusion_Stage/install.md) for detailed installation.
### Training
#### Prepare the ConductorMotion100 dataset:
- The training set: https://pan.baidu.com/s/1Pmtr7V7-9ChJqQp04NOyZg?pwd=3209
- The validation set: https://pan.baidu.com/s/1B5JrZnFCFvI9ABkuJeWoFQ?pwd=3209
- The test set: https://pan.baidu.com/s/18ecHYk9b4YM5YTcBNn37qQ?pwd=3209

You can also access the dataset via [**Google Drive**](https://drive.google.com/drive/folders/1I2eFM-vEbqVXtD4sUPmGFSeNZeu_5JMu?usp=sharing).
There are 3 splits of *ConductorMotion100*: train, val, and test, each corresponding to one `.rar` file. After extracting them into the `` folder, the file structure will be:
```
tree
├───train
│   ├───0
│   │       mel.npy
│   │       motion.npy
│   ...
│   └───5268
│           mel.npy
│           motion.npy
├───val
│   ├───0
│   │       mel.npy
│   │       motion.npy
│   ...
│   └───290
│           mel.npy
│           motion.npy
└───test
    ├───0
    │       mel.npy
    │       motion.npy
    ...
    └───293
            mel.npy
            motion.npy
```
Each `mel.npy` and `motion.npy` pair corresponds to 60 seconds of Mel spectrogram and motion data, sampled at 90 Hz and 30 Hz respectively. The Mel spectrogram has 128 frequency bins, so `mel.shape = (5400, 128)`. The motion data contains 13 2D keypoints, so `motion.shape = (1800, 13, 2)`.
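
As a quick sanity check of the layout and shapes described above, the following sketch loads one sample folder and groups Mel frames by motion frame (90 Hz / 30 Hz = 3 Mel frames per motion frame). The helper names `load_sample` and `mel_per_motion_frame` are hypothetical and not part of the repository, and the demo runs on synthetic zero arrays in a temporary folder rather than on the real dataset.

```python
import tempfile
from pathlib import Path

import numpy as np

MEL_HZ, MOTION_HZ, CLIP_SECONDS = 90, 30, 60  # values stated in this README


def load_sample(sample_dir):
    """Load one sample folder (e.g. train/0) and check the documented shapes."""
    sample_dir = Path(sample_dir)
    mel = np.load(sample_dir / "mel.npy")
    motion = np.load(sample_dir / "motion.npy")
    assert mel.shape == (MEL_HZ * CLIP_SECONDS, 128), mel.shape
    assert motion.shape == (MOTION_HZ * CLIP_SECONDS, 13, 2), motion.shape
    return mel, motion


def mel_per_motion_frame(mel):
    """Group consecutive Mel frames so that axis 0 matches the motion frames."""
    ratio = MEL_HZ // MOTION_HZ  # 3 Mel frames for every motion frame
    return mel.reshape(-1, ratio, mel.shape[-1])


# Demo on synthetic data laid out like one sample of the train split.
with tempfile.TemporaryDirectory() as root:
    sample_dir = Path(root) / "train" / "0"
    sample_dir.mkdir(parents=True)
    np.save(sample_dir / "mel.npy", np.zeros((5400, 128), dtype=np.float32))
    np.save(sample_dir / "motion.npy", np.zeros((1800, 13, 2), dtype=np.float32))

    mel, motion = load_sample(sample_dir)
    grouped = mel_per_motion_frame(mel)
```

To use it on the real data, point `load_sample` at a sample folder inside your extracted dataset directory instead of the temporary one.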
#### Train the music encoder and motion encoder in Contrastive_Stage with the following command:
```shell
cd Contrastive_Stage
```
```shell
python M2SNet_train.py --dataset_dir
```
#### Train the diffusion model in Diffusion_Stage with the following command:
```shell
cd Diffusion_Stage
```
```shell
PYTHONPATH="$(dirname $0)/..":$PYTHONPATH \
python3 -u tools/train.py \
--name checkpoint_folder_name \
--batch_size 32 \
--times 25 \
--num_epochs 400 \
--dataset_name ConductorMotion100 \
--data_parallel \
--gpu_id 1 2
```
### Inference and Visualization
```shell
cd Diffusion_Stage
```
```shell
PYTHONPATH="$(dirname $0)/..":$PYTHONPATH \
python -u tools/visualization.py \
--motion_length 6 \
--gpu_id 5 \
--result_path "conduct_example.mp4"
```
### Download the pretrained model
For evaluation and inference, you may download the contrastive stage pretrained model and the diffusion stage pretrained model from [GoogleDrive](https://drive.google.com/drive/folders/1l2jvAudk6w5UuAKH3ZMM20qLChmkegb2?usp=drive_link).
## Acknowledgement
We would like to thank the great projects [VirtualConductor](https://github.com/ChenDelong1999/VirtualConductor) and [MotionDiffuse](https://github.com/mingyuan-zhang/MotionDiffuse).
## Papers
1. Zhuoran Zhao and Jinbin Bai* and Delong Chen and Debang Wang and Yubo Pan. [Taming Diffusion Models for Music-driven Conducting Motion Generation](https://arxiv.org/abs/2306.10065)
```bibtex
@inproceedings{zhao2023taming,
title={Taming diffusion models for music-driven conducting motion generation},
author={Zhao, Zhuoran and Bai, Jinbin and Chen, Delong and Wang, Debang and Pan, Yubo},
booktitle={Proceedings of the AAAI Symposium Series},
volume={1},
number={1},
pages={40--44},
year={2023}
}
```