https://github.com/tran-khoa/joint-training-cascaded-st

Code for the paper "Does Joint Training Really Help Cascaded Speech Translation?" (EMNLP 2022)

Host: GitHub
URL: https://github.com/tran-khoa/joint-training-cascaded-st
Owner: tran-khoa
License: mit
Created: 2022-10-07T11:49:11.000Z (over 2 years ago)
Default Branch: main
Last Pushed: 2022-10-26T09:11:09.000Z (over 2 years ago)
Last Synced: 2025-04-24T01:47:45.051Z (about 2 months ago)
Topics: emnlp2022, fairseq, nlp, speech-translation
Language: Python
Homepage:
Size: 5.22 MB
Stars: 4
Watchers: 1
Forks: 0
Open Issues: 0
Metadata Files:
- Readme: README.md
- Contributing: CONTRIBUTING.md
- License: LICENSE
- Code of conduct: CODE_OF_CONDUCT.md


# Does Joint Training Really Help Cascaded Speech Translation?

This repository contains the code for the paper "Does Joint Training Really Help Cascaded Speech Translation?" ([arXiv](https://arxiv.org/abs/2210.13700)), published at [EMNLP 2022](https://2022.emnlp.org/). The implementation is based on [fairseq](https://github.com/facebookresearch/fairseq).

## Cite This Work

To cite this work, please use the following BibTeX entry:

```
@InProceedings{tran22:joint_training_cascaded_speech_translation,
  author={Tran, Viet Anh Khoa and Thulke, David and Gao, Yingbo and Herold, Christian and Ney, Hermann},
  title={Does Joint Training Really Help Cascaded Speech Translation?},
  booktitle={Conference on Empirical Methods in Natural Language Processing},
  year=2022,
  address={Abu Dhabi, United Arab Emirates},
  month=nov,
  booktitlelink={https://2022.emnlp.org/},
}
```

## Requirements and Installation (adapted from fairseq)

* [PyTorch](http://pytorch.org/) version 1.7.1

* torchaudio 0.7.2

* Python version >= 3.7

* **To install fairseq** and develop locally:

``` bash
git clone https://github.com/tran-khoa/joint-training-cascaded-st
cd joint-training-cascaded-st
pip install --editable ./
cd projects/speech_translation
pip install -r requirements.txt

# on macOS, use this in place of `pip install --editable ./` in the repository root:
# CFLAGS="-stdlib=libc++" pip install --editable ./
```

* **For faster training** install NVIDIA's [apex](https://github.com/NVIDIA/apex) library:

``` bash
git clone https://github.com/NVIDIA/apex
cd apex
pip install -v --no-cache-dir --global-option="--cpp_ext" --global-option="--cuda_ext" \
  --global-option="--deprecated_fused_adam" --global-option="--xentropy" \
  --global-option="--fast_multihead_attn" ./
```
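After installation, it can be useful to confirm that the environment matches the versions listed above. The following is a minimal sketch, not part of this repository: the names `PINNED`, `installed_version`, and `check_environment` are hypothetical, and the script assumes Python 3.8 or newer because it relies on `importlib.metadata`.

```python
import sys
from importlib import metadata  # assumption: Python 3.8+, newer than the 3.7 minimum above

# Versions copied from the requirements list in this README.
PINNED = {"torch": "1.7.1", "torchaudio": "0.7.2"}
MIN_PYTHON = (3, 7)


def installed_version(package):
    """Return the installed version string, or None if the package is missing."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


def check_environment(lookup=installed_version, python_version=sys.version_info[:2]):
    """Collect human-readable problems; an empty list means everything matches."""
    problems = []
    if tuple(python_version) < MIN_PYTHON:
        problems.append(
            "Python %d.%d found, expected >= %d.%d" % (tuple(python_version) + MIN_PYTHON)
        )
    for package, wanted in PINNED.items():
        found = lookup(package)
        if found is None:
            problems.append(f"{package} is not installed (expected {wanted})")
        # Drop local build tags such as "+cu110" before comparing.
        elif found.split("+")[0] != wanted:
            problems.append(f"{package} {found} found, expected {wanted}")
    return problems


if __name__ == "__main__":
    for line in check_environment() or ["environment matches the README requirements"]:
        print(line)
```

The `lookup` and `python_version` parameters exist only so that the check can be exercised without the packages installed; calling `check_environment()` with no arguments inspects the current interpreter.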

## Running Experiments

The implementation is located in `projects/speech_translation`. 

Please refer to the scripts in `projects/speech_translation/experiments`.

In the code, the term `joint-seq` refers to `Top-K-Train` in the paper, and `tight` refers to `Tight-Integration`, as introduced in [Tight integrated end-to-end training for cascaded speech translation](https://ieeexplore.ieee.org/abstract/document/9383462).
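The naming correspondence above can be written down as a small lookup, which may help when browsing the experiment scripts. This is an illustrative sketch only: `PAPER_NAMES`, `paper_name`, and `label_experiments` are hypothetical helpers, and it assumes (without confirmation from the repository) that file names under `projects/speech_translation/experiments` contain the code terms as plain substrings.

```python
from pathlib import Path

# Mapping stated in this README: term used in the code -> name used in the papers.
PAPER_NAMES = {"joint-seq": "Top-K-Train", "tight": "Tight-Integration"}


def paper_name(identifier):
    """Return the paper name for an identifier containing a code term, else None."""
    for code_term, name in PAPER_NAMES.items():
        if code_term in identifier:
            return name
    return None


def label_experiments(directory="projects/speech_translation/experiments"):
    """Pair each file below the experiments directory with its paper name (or None).

    Assumption: the code terms appear verbatim in the file names.
    """
    root = Path(directory)
    return {
        str(path.relative_to(root)): paper_name(path.name)
        for path in sorted(root.rglob("*"))
        if path.is_file()
    }


if __name__ == "__main__":
    for script, name in label_experiments().items():
        print(f"{script}: {name or 'no matching paper term'}")
```

Because the match is a plain substring test, an unrelated name that happens to contain `tight` would also be labelled; treat the output as a reading aid rather than an authoritative index.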

## License (adapted from fairseq)

fairseq(-py) is MIT-licensed.

The license applies to the pre-trained models as well.
