https://github.com/ericguo5513/tm2t
Official implementation of "TM2T: Stochastic and Tokenized Modeling for the Reciprocal Generation of 3D Human Motions and Texts (ECCV2022)"
- Host: GitHub
- URL: https://github.com/ericguo5513/tm2t
- Owner: EricGuo5513
- License: MIT
- Created: 2022-07-04T18:10:28.000Z (over 3 years ago)
- Default Branch: main
- Last Pushed: 2024-08-18T00:23:39.000Z (about 1 year ago)
- Last Synced: 2025-02-12T19:15:09.727Z (8 months ago)
- Topics: motion-generation, motion-generator, motion-to-text, pytorch-implementation, text-to-motion
- Language: Python
- Homepage: https://ericguo5513.github.io/TM2T
- Size: 15.2 MB
- Stars: 114
- Watchers: 6
- Forks: 14
- Open Issues: 4
Metadata Files:
- Readme: README.md
- License: LICENSE
# TM2T: Stochastic and Tokenized Modeling for the Reciprocal Generation of 3D Human Motions and Texts (ECCV 2022)
## [[Project Page]](https://ericguo5513.github.io/TM2T) [[Paper]](https://arxiv.org/abs/2207.01696)
## Python Virtual Environment
Anaconda is recommended for creating this virtual environment.
```sh
conda env create -f environment.yaml
conda activate tm2t
```
If you cannot successfully create the environment, here is a list of required libraries:
```
Python = 3.7.9 # Other versions may also work but are not tested.
PyTorch = 1.6.0 (conda install pytorch==1.6.0 torchvision==0.7.0 -c pytorch) # Other versions may also work but are not tested.
scipy
numpy
tensorflow # For use of tensorboard only
spacy
tqdm
ffmpeg = 4.3.1 # Other versions may also work but are not tested.
matplotlib = 3.3.1
nlg-eval (https://github.com/Maluuba/nlg-eval) # For evaluation of motion-to-text only
bert_score (https://github.com/Tiiiger/bert_score) # For evaluation of motion-to-text only
```
Finally, if you want to generate 3D motions from your own raw texts, you also need to install the language model for spacy.
```sh
python -m spacy download en_core_web_sm
```
## Download Data & Pre-trained Models
**If you just want to play with our pre-trained models, you don't need to download the datasets.**
### Datasets
We use two 3D human motion-language datasets: HumanML3D and KIT-ML. For both datasets, you can find the details as well as the download links [[here]](https://github.com/EricGuo5513/HumanML3D).
Note that you don't need to clone that repository, since all related code is already included in this project.
Download and unzip the dataset files, create a dataset folder, and place the related data files in it:
```sh
mkdir ./dataset/
```
Taking HumanML3D as an example, the file directory should look like this:
```
./dataset/
./dataset/HumanML3D/
./dataset/HumanML3D/new_joint_vecs/
./dataset/HumanML3D/texts/
./dataset/HumanML3D/Mean.npy
./dataset/HumanML3D/Std.npy
./dataset/HumanML3D/test.txt
./dataset/HumanML3D/train.txt
./dataset/HumanML3D/train_val.txt
./dataset/HumanML3D/val.txt
./dataset/HumanML3D/all.txt
```
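A hypothetical sanity check (not part of the repo) that the layout is in place: `Mean.npy` and `Std.npy` hold the per-dimension statistics used to Z-normalize the motion features in `new_joint_vecs/`.
```python
# Hypothetical sanity check (not part of the repo): verify the HumanML3D
# layout and load the normalization statistics.
import os
import numpy as np

root = "./dataset/HumanML3D"
for name in ("new_joint_vecs", "texts", "Mean.npy", "Std.npy", "train.txt"):
    assert os.path.exists(os.path.join(root, name)), f"missing {name}"

mean = np.load(os.path.join(root, "Mean.npy"))  # per-dimension feature mean
std = np.load(os.path.join(root, "Std.npy"))    # per-dimension feature std
print(mean.shape, std.shape)
```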
### Pre-trained Models
Create a checkpoint folder to place the pre-trained models:
```sh
mkdir ./checkpoints
```
#### Download models for HumanML3D from [[here]](https://drive.google.com/file/d/1OXy2FBhXrswT6zE4SBSPpVfQhxmI8Zzy/view?usp=sharing). Unzip and place them under the checkpoint directory, which should look like this:
```
./checkpoints/t2m/
./checkpoints/t2m/Comp_v6_KLD005/ # A dummy folder containing information for evaluation dataloading
./checkpoints/t2m/VQVAEV3_CB1024_CMT_H1024_NRES3/ # Motion discretizer
./checkpoints/t2m/M2T_EL4_DL4_NH8_PS/ # Motion (token)-to-Text translation model
./checkpoints/t2m/T2M_Seq2Seq_NML1_Ear_SME0_N/ # Text-to-Motion (token) generation model
./checkpoints/t2m/text_mot_match/ # Motion & Text feature extractors for evaluation
```
#### Download models for KIT-ML from [[here]](https://drive.google.com/file/d/1ied_KWvqXXsP2Gls-SvzjXIZtHHZ5zpi/view?usp=sharing). Unzip and place them under the checkpoint directory.
## Training Models
All intermediate meta files/animations/models will be saved to the checkpoint directory, under the folder specified by the `--name` argument.
### Training motion discretizer
#### HumanML3D
```sh
python train_vq_tokenizer_v3.py --gpu_id 0 --name VQVAEV3_CB1024_CMT_H1024_NRES3 --dataset_name t2m --n_resblk 3
```
#### KIT-ML
```sh
python train_vq_tokenizer_v3.py --gpu_id 0 --name VQVAEV3_CB1024_CMT_H1024_NRES3 --dataset_name kit --n_resblk 3
```
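For orientation, the discretizer is a VQ-VAE: an encoder maps a motion into latent vectors, and each vector is snapped to its nearest entry in a learned codebook, so the entry indices form a sequence of discrete motion tokens. Below is a rough sketch of just the quantization step; shapes are illustrative, the run name suggests a 1024-entry codebook (CB1024), and `--n_resblk 3` sets the residual blocks in the actual model:
```python
# Rough sketch of the VQ-VAE quantization step only (shapes illustrative):
# each encoder output is replaced by its nearest codebook entry, and the
# entry's index becomes the discrete motion token used by the later stages.
import torch

codebook = torch.randn(1024, 512)        # CB1024: 1024 learned code vectors
z_e = torch.randn(4, 49, 512)            # encoder outputs: (batch, time, dim)

dists = torch.cdist(z_e.reshape(-1, 512), codebook)   # L2 distance to codes
tokens = dists.argmin(dim=1).reshape(4, 49)           # discrete motion tokens
z_q = codebook[tokens]                                # quantized latents
print(tokens.shape, z_q.shape)           # (4, 49) and (4, 49, 512)
```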
### Tokenizing all motion data for the following training stages
#### HumanML3D
```sh
python tokenize_script.py --gpu_id 0 --name VQVAEV3_CB1024_CMT_H1024_NRES3 --dataset_name t2m
```
#### KIT-ML
```sh
python tokenize_script.py --gpu_id 0 --name VQVAEV3_CB1024_CMT_H1024_NRES3 --dataset_name kit
```
### Training motion2text model:
#### HumanML3D
```sh
python train_m2t_transformer.py --gpu_id 0 --name M2T_EL4_DL4_NH8_PS --n_enc_layers 4 --n_dec_layers 4 --proj_share_weight --dataset_name t2m
```
#### KIT-ML
```sh
python train_m2t_transformer.py --gpu_id 0 --name M2T_EL3_DL3_NH8_PS --n_enc_layers 3 --n_dec_layers 3 --proj_share_weight --dataset_name kit
```
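The run names encode the architecture flags: `EL4`/`DL4` map to `--n_enc_layers 4`/`--n_dec_layers 4`, `NH8` suggests 8 attention heads, and `PS` corresponds to `--proj_share_weight`. A hedged sketch of the corresponding transformer skeleton (the repo's model wraps this with token embeddings, positional encodings, and an output projection; `d_model` here is illustrative):
```python
# Skeleton matching the M2T_EL4_DL4_NH8_PS naming (hedged sketch only).
import torch.nn as nn

transformer = nn.Transformer(
    d_model=512,           # illustrative latent width
    nhead=8,               # NH8: attention heads
    num_encoder_layers=4,  # EL4: --n_enc_layers 4
    num_decoder_layers=4,  # DL4: --n_dec_layers 4
)
```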
### Training text2motion model:
#### HumanML3D
```sh
python train_t2m_joint_seq2seq.py --gpu_id 0 --name T2M_Seq2Seq_NML1_Ear_SME0_N --start_m2t_ep 0 --dataset_name t2m
```
#### KIT-ML
```sh
python train_t2m_joint_seq2seq.py --gpu_id 0 --name T2M_Seq2Seq_NML1_Ear_SME0_N --start_m2t_ep 0 --dataset_name kit
```
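Here text2motion is framed as translation into the discrete motion-token vocabulary produced by the discretizer; the `--start_m2t_ep 0` flag appears to control when joint training with the motion2text model begins. A rough GRU encoder-decoder sketch of the idea (not the repo's exact model; vocabulary sizes and special tokens are illustrative):
```python
# Rough text -> motion-token seq2seq sketch (not the repo's exact model):
# encode the description, then decode discrete motion tokens step by step.
import torch
import torch.nn as nn

text_vocab, motion_vocab, hidden = 5000, 1024 + 2, 512  # +2: start/end tokens (illustrative)
text_emb = nn.Embedding(text_vocab, hidden)
tok_emb = nn.Embedding(motion_vocab, hidden)
encoder = nn.GRU(hidden, hidden, batch_first=True)
decoder = nn.GRU(hidden, hidden, batch_first=True)
out_proj = nn.Linear(hidden, motion_vocab)

text_ids = torch.randint(0, text_vocab, (1, 12))   # a tokenized description
_, h = encoder(text_emb(text_ids))                 # sentence context
tok = torch.zeros(1, 1, dtype=torch.long)          # <start> motion token
step, h = decoder(tok_emb(tok), h)                 # one autoregressive step
logits = out_proj(step)                            # scores over motion tokens
```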
### Motion & text feature extractors:
We use the same extractors provided by https://github.com/EricGuo5513/text-to-motion.
## Generating and Animating 3D Motions (HumanML3D)
### Translating motions into language (using test set)
With Beam Search:
```sh
python evaluate_m2t_transformer.py --name M2T_EL4_DL4_NH8_PS --gpu_id 2 --num_results 20 --n_enc_layers 4 --n_dec_layers 4 --proj_share_weight --ext beam_search
```
With Sampling:
```sh
python evaluate_m2t_transformer.py --name M2T_EL4_DL4_NH8_PS --gpu_id 2 --num_results 20 --n_enc_layers 4 --n_dec_layers 4 --proj_share_weight --sample --top_k 3 --ext top_3
```
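Here `--sample --top_k 3` replaces beam search with sampling restricted to the 3 highest-scoring words at each decoding step. A rough illustration of one such step (not the repo's exact code):
```python
# Rough illustration of top-k sampling for one decoding step (what
# --sample --top_k 3 does conceptually; not the repo's exact code).
import torch
import torch.nn.functional as F

logits = torch.randn(10000)              # scores over the word vocabulary
vals, idx = logits.topk(3)               # keep the 3 highest-scoring words
probs = F.softmax(vals, dim=-1)          # renormalize over those 3
next_word = idx[torch.multinomial(probs, 1)]
```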
### Generating motions from texts (using test set)
```sh
python evaluate_t2m_seq2seq.py --name T2M_Seq2Seq_NML1_Ear_SME0_N --num_results 10 --repeat_times 3 --sample --ext sample
```
where *--repeat_times* specifies how many sampling rounds are carried out for each description. This script will produce 3x10 animations under the directory *./eval_results/t2m/T2M_Seq2Seq_NML1_Ear_SME0_N/sample/*.
### Sampling results from customized descriptions
```sh
python gen_script_t2m_seq2seq.py --name T2M_Seq2Seq_NML1_Ear_SME0_N --repeat_times 3 --sample --ext customized --text_file ./input.txt
```
This will generate 3 animated motions for each description given in the text file *./input.txt*. If you have problems installing ffmpeg, you may not be able to render the 3D results as mp4; try gif instead.
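A hypothetical way to create such a file, assuming one plain-text description per line (check the repo for the exact expected format of `--text_file`):
```python
# Hypothetical: write one plain-text description per line to ./input.txt
# (verify the exact --text_file format against the repo).
descriptions = [
    "a person walks forward and picks something up.",
    "a person jumps in place.",
]
with open("input.txt", "w") as f:
    f.write("\n".join(descriptions) + "\n")
```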
## Quantitative Evaluations
### Evaluating Motion2Text
```sh
python final_evaluation_m2t.py
```
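This relies on the nlg-eval and bert_score packages listed above. Independently of the script, a rough sketch of how these libraries score generated captions against references (toy inputs; APIs as documented upstream):
```python
# Rough sketch of the metrics behind motion2text evaluation: n-gram metrics
# via nlg-eval and semantic similarity via BERTScore. Toy inputs only.
from nlgeval import NLGEval
from bert_score import score

hyps = ["a person walks forward"]          # generated captions
refs = ["a person walks straight ahead"]   # ground-truth captions

evaluator = NLGEval(no_skipthoughts=True, no_glove=True)
metrics = evaluator.compute_metrics(ref_list=[refs], hyp_list=hyps)
print(metrics)                             # BLEU, METEOR, ROUGE_L, CIDEr, ...

P, R, F1 = score(hyps, refs, lang="en")
print(F1.mean().item())                    # BERTScore F1
```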
### Evaluating Text2Motion
```sh
python final_evaluation_t2m.py
```
This will evaluate model performance on the HumanML3D dataset by default. You can also run on the KIT-ML dataset by uncommenting certain lines in the corresponding *final_evaluation_m2t.py*/*final_evaluation_t2m.py* script. The statistical results will be saved to *./m2t(t2m)_evaluation.log*.
### Misc
Contact Chuan Guo at cguo2@ualberta.ca for any questions or comments.