Ecosyste.ms: Awesome
An open API service indexing awesome lists of open source software.
https://github.com/line/ChronAccRet
- Host: GitHub
- URL: https://github.com/line/ChronAccRet
- Owner: line
- License: apache-2.0
- Created: 2024-08-05T01:18:03.000Z (5 months ago)
- Default Branch: master
- Last Pushed: 2024-08-05T14:29:22.000Z (5 months ago)
- Last Synced: 2024-08-06T07:32:12.867Z (5 months ago)
- Language: Python
- Size: 153 KB
- Stars: 0
- Watchers: 5
- Forks: 1
- Open Issues: 0
Metadata Files:
- Readme: README.md
- License: LICENSE
- Code of conduct: CODE_OF_CONDUCT.md
Awesome Lists containing this project
README
# Official Implementation of Chronologically Accurate Retrieval for Temporal Grounding of Motion-Language Models (ECCV 2024)
Official implementation of the paper: Chronologically Accurate Retrieval for Temporal Grounding of Motion-Language Models.
## Framework
![Framework](./assets/figure.jpg)
## Requirements
- Python 3.10
- PyTorch 2.0.0
### Recommended: Using Poetry
```bash
poetry install
```

### Using Conda
```bash
conda create -n car python=3.11
conda activate car
conda install pytorch==2.0.0 torchvision==0.15.1 torchaudio==2.0.1 pytorch-cuda=11.8 -c pytorch -c nvidia
pip install -r requirements.txt
```
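Either way, a quick Python check can confirm the environment is usable (a minimal sketch; the exact versions reported depend on which installation path you followed):

```python
# Minimal sanity check after installing with Poetry or Conda.
import torch

print(torch.__version__)           # should report the 2.0.x build installed above
print(torch.cuda.is_available())   # True if a CUDA-enabled GPU is visible
```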
## Data Preparation

Please follow the descriptions in the [TMR](https://github.com/Mathux/TMR) repository to prepare the datasets.
Then, place them as follows.
Note that events have to be generated as described in the paper. Please prepare a JSON file for each motion, with an "events" field listing the decomposed events. You can download the event descriptions from [here](https://drive.google.com/file/d/14FUVBotloBLDbtpvlShQTyX5A4TogLL4/view?usp=drive_link).
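For illustration, a per-motion event file might look like the following (a hypothetical sketch: the file name `000001.json` and the phrases are made up; only the "events" key is taken from the description above):

```python
import json
from pathlib import Path

# Hypothetical example of one per-motion event file; only the "events" key is
# assumed from the description above, everything else is illustrative.
example = {"events": ["a person raises both arms", "the person lowers the arms back down"]}

event_dir = Path("data/humanml3d/event_texts")
event_dir.mkdir(parents=True, exist_ok=True)
(event_dir / "000001.json").write_text(json.dumps(example, indent=2))

# A loader would read the decomposed events back like this:
events = json.loads((event_dir / "000001.json").read_text())["events"]
print(events)
```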
Please view "datasets/datasets.py" for how the files are handled.

```
ChronAccRet
│   README.md
│   requirements.txt
│   ...
│
└───config
└───datasets
└───...
└───stats
│   └───humanml3d
│       └───guoh3dfeats        # place the statistics here
│               mean.pt
│               std.pt
└───data
│   └───humanml3d
│       │   annotations.json
│       └───event_texts        # texts for events (the json files described above)
│       └───motions
│       └───sent_embeddings    # embeddings of sentences
│       └───splits
└───motions
    └───guoh3dfeats            # processed pose data
    └───pose_data
```
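To double-check the layout before training, a short path check such as the following can help (a sketch based on the tree above; adjust the list if your setup differs):

```python
from pathlib import Path

# Paths taken from the directory tree above; purely a convenience check.
expected = [
    "stats/humanml3d/guoh3dfeats/mean.pt",
    "stats/humanml3d/guoh3dfeats/std.pt",
    "data/humanml3d/annotations.json",
    "data/humanml3d/event_texts",
    "data/humanml3d/sent_embeddings",
    "data/humanml3d/splits",
    "motions/guoh3dfeats",
]

for p in expected:
    status = "ok" if Path(p).exists() else "MISSING"
    print(f"{status:8s} {p}")
```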
## Training
Modify the config file in the configs folder. The default is set to train the model with DistilBERT.
Change the line before the main() function accordingly, then train:

```bash
python train.py
```

We prepared four variations of our method based on DistilBERT: the original TMR in "event > event" (train_bert_orig.yaml) and "orig > event" (train_bert_orig_o2e.yaml), and our version in "event > event" (train_bert_neg.yaml) and "orig > event" (train_bert_neg_o2e.yaml). The pretrained weights can be downloaded from [here](https://huggingface.co/line-corporation/ChronAccRet/). The trained models should be placed in the `output` directory. Please use the config files as a guideline to train other variants. To tune the language model, set the `train_text_encoder` option to True.
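As an illustration of the last point, one way to derive a fine-tuning config from an existing one (a sketch that assumes the configs are plain YAML with a top-level `train_text_encoder` key and that PyYAML is installed; check the actual files under the configs folder):

```python
import yaml

# Load one of the provided configs (assumed here to be plain YAML).
with open("configs/train_bert_neg.yaml") as f:
    cfg = yaml.safe_load(f)

# Enable fine-tuning of the language model, as described above.
cfg["train_text_encoder"] = True

# Write it out under a new (hypothetical) name and point train.py at it.
with open("configs/train_bert_neg_tuned.yaml", "w") as f:
    yaml.safe_dump(cfg, f)
```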
## Evaluation
Change the line before the main() function accordingly, then evaluate using conventional metrics:
```bash
python evaluate.py
```

For CAR accuracy:
```bash
python retrieval_shuffle.py
```

For motion-to-text retrieval with the shuffled texts:
```bash
python evaluate_negative.py
```
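To illustrate what the shuffled texts test (the actual negatives are generated by the repository's scripts, so this is only a conceptual sketch of chronologically incorrect descriptions built from an "events" list):

```python
import itertools
import random

# A toy events list; the real ones come from the per-motion event files.
events = ["a person walks forward", "turns around", "sits down on a chair"]

correct = ", then ".join(events)

# Every other ordering of the same events serves as a chronologically incorrect text.
negatives = [", then ".join(p) for p in itertools.permutations(events) if list(p) != events]

print("positive:", correct)
print("one negative:", random.choice(negatives))
```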
## Acknowledgement

Some parts of our code are based on [TMR](https://github.com/Mathux/TMR).

## Citation
```bibtex
@InProceedings{Fujiwara_2024_ECCV,
  author    = {Kent Fujiwara and Mikihiro Tanaka and Qing Yu},
  title     = {Chronologically Accurate Retrieval for Temporal Grounding of Motion-Language Models},
  booktitle = {Proc. of the European Conf. on Computer Vision (ECCV)},
  year      = {2024},
}
```

## License
[Apache License 2.0](LICENSE)

Additionally, this repository contains third-party software. Refer to [NOTICE.txt](NOTICE.txt) for more details and follow the terms and conditions of its use.