https://github.com/declare-lab/mm-align
[EMNLP 2022] This repository contains the official implementation of the paper "MM-Align: Learning Optimal Transport-based Alignment Dynamics for Fast and Accurate Inference on Missing Modality Sequences"
machine-learning multimodal-deep-learning multimodal-sentiment-analysis natural-language-processing optimal-transport
- Host: GitHub
- URL: https://github.com/declare-lab/mm-align
- Owner: declare-lab
- License: mit
- Created: 2022-10-23T16:30:19.000Z (over 2 years ago)
- Default Branch: main
- Last Pushed: 2024-03-10T07:21:05.000Z (about 1 year ago)
- Last Synced: 2025-03-27T18:21:40.582Z (about 2 months ago)
- Topics: machine-learning, multimodal-deep-learning, multimodal-sentiment-analysis, natural-language-processing, optimal-transport
- Language: Python
- Homepage:
- Size: 284 KB
- Stars: 28
- Watchers: 3
- Forks: 2
- Open Issues: 0
Metadata Files:
- Readme: README.md
- License: LICENSE
# MM-Align: Learning Optimal Transport-based Alignment Dynamics for Fast and Accurate Inference on Missing Modality Sequences
This repository contains the official implementation of the paper: [MM-Align: Learning Optimal Transport-based Alignment Dynamics for Fast and Accurate Inference on Missing Modality Sequences](https://arxiv.org/pdf/2210.12798v1.pdf), published at EMNLP 2022.
## Setup
### Conda Environment
```bash
conda env create -f environment.yml
conda activate mmalign
python -m spacy download en_core_web_sm
```

### CMU-MOSI and CMU-MOSEI
Please refer to [this repository](https://github.com/declare-lab/BBFN) to get the `.pkl` files that store the extracted features (by CMU-MMSDK with integrated COVAREP and P2FA) of the two datasets.

### MELD dataset
You can download the processed dataset (`.pkl`) from [here](https://drive.google.com/file/d/1RjrYSMpXxg_6r_nUQaysaPyMsldLpMcb/view?usp=sharing).
Alternatively, if you would like to extract the features yourself, you can download the raw dataset from [here](http://web.eecs.umich.edu/~mihalcea/downloads/MELD.Raw.tar.gz). Then you can extract the visual and audio features with [ResNet101](https://github.com/v-iashin/video_features) (FPS=25) and [Wav2Vec 2.0](https://huggingface.co/docs/transformers/model_doc/wav2vec2). Additionally, you need to manually gather the text and extracted feature vectors by their IDs and split them into `(train/dev/test).pkl` files.

Next, split the processed dataset into complete/incomplete partitions using `scripts/split_dataset.py`:
```bash
python split_dataset.py --data_path --seed --group_id --complete_ratio --split
```
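To make the complete/incomplete partitioning concrete, here is a minimal, self-contained sketch of what such a seeded split can look like. This is not the repository's `split_dataset.py`; the function name and the interpretation of `complete_ratio` (fraction of samples that keep all modalities) are assumptions for illustration only.

```python
import random

def split_complete_incomplete(sample_ids, complete_ratio, seed):
    """Illustrative partition of sample ids into a 'complete' subset
    (all modalities available) and an 'incomplete' subset (one modality
    treated as missing). Hypothetical helper, not the repo's script."""
    rng = random.Random(seed)  # seeded for reproducible partitions
    ids = list(sample_ids)
    rng.shuffle(ids)
    n_complete = int(len(ids) * complete_ratio)
    return ids[:n_complete], ids[n_complete:]

# e.g. 100 samples, half kept complete, under seed 2020
complete, incomplete = split_complete_incomplete(range(100), 0.5, seed=2020)
```

Running this with different seeds (e.g. 2020-2024) yields different disjoint partitions of the same sample pool, which is the role the seed plays in the script above.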
We provide an example script `script/run_split.sh`, which automatically generates 5 different partitions for a given dataset under seeds 2020-2024.

## Train and Test
```bash
cd src
python main.py --dataset --data_path --group_id --modals --save_name
```

The best test results are automatically saved under `results/_.tsv`.
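As a usage sketch, a run on CMU-MOSI might look like the following. Every argument value here is a placeholder chosen for illustration (the dataset name, paths, group id, modality string, and save name are not documented defaults of the repository):

```shell
cd src
python main.py \
  --dataset mosi \
  --data_path ../data/mosi \
  --group_id 0 \
  --modals ta \
  --save_name mosi_ta_run1
```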
## Citation
Please cite our paper if you find it useful for your research:
```bibtex
@inproceedings{han2022mmalign,
title={MM-Align: Learning Optimal Transport-based Alignment Dynamics for Fast and Accurate Inference on Missing Modality Sequences},
author={Han, Wei and Chen, Hui and Kan, Min-Yen and Poria, Soujanya},
booktitle={Proceedings of the 2022 Conference on Empirical Methods in Natural Language Processing},
year={2022}
}
```

## Contact
Should you have any questions, feel free to contact me at [[email protected]](mailto:[email protected]).