https://github.com/thunlp-mt/plm4mt

Code for our work "MSP: Multi-Stage Prompting for Making Pre-trained Language Models Better Translators" in ACL 2022

Host: GitHub
URL: https://github.com/thunlp-mt/plm4mt
Owner: THUNLP-MT
License: bsd-3-clause
Created: 2022-03-17T08:16:42.000Z (over 4 years ago)
Default Branch: main
Last Pushed: 2022-03-18T13:01:23.000Z (over 4 years ago)
Last Synced: 2025-03-28T10:54:18.402Z (over 1 year ago)
Language: Python
Homepage:
Size: 46.9 KB
Stars: 20
Watchers: 5
Forks: 5
Open Issues: 1
Metadata Files:
- Readme: README.md
- License: LICENSE


# PLM4MT

This is the code for our ACL 2022 work [MSP: Multi-Stage Prompting for Making Pre-trained Language Models Better Translators](http://arxiv.org/abs/2110.06609). The implementation is built on top of the open-source NMT toolkit [THUMT](https://github.com/THUNLP-MT/THUMT).

## Contents

* [Prerequisites](#prerequisites)
* [mGPT](#mgpt)
* [Format](#format)
* [Training](#training)
* [Decoding](#decoding)
* [Postprocessing](#postprocessing)
* [License](#license)
* [Citation](#citation)

## Prerequisites

* Python >= 3.7
* tensorflow-cpu >= 2.0
* torch >= 1.7
* transformers

Please read the [THUMT documentation](https://github.com/THUNLP-MT/THUMT/blob/master/docs/index.md) before using this repository.

## mGPT

You can download the mGPT checkpoint from [this URL](https://huggingface.co/THUMT/mGPT).

## Format

We use `` to separate a source and a target sentence. For the WMT14 En-De dataset, the training file contains lines with the following format:

```
 Graphical artwork, corporate identity and corporate design.  Grafische Gestaltung, Layout, Corporate Identity und Corporate Design.
```

Here `` is a tag to indicate the source language, which can be omitted.

For inference, the test set contains lines like:

```
 Gutach: Increased safety for pedestrians 
```
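As a rough illustration of the format described above, the following Python sketch assembles training and test lines from sentence pairs. The separator and language-tag strings are passed in as arguments because their literal values are not shown here. `format_training_line`, `format_test_line`, `SEP`, and `TAG` are hypothetical names, the `[SEP]` and `[EN]` values are stand-ins, and placing the separator at the end of a test line is an assumption.

```python
def format_training_line(source, target, sep, lang_tag=""):
    """Join an optional language tag, the source, the separator, and the target."""
    parts = [lang_tag, source.strip(), sep, target.strip()]
    # Skip empty parts so that an omitted tag leaves no leading space.
    return " ".join(p for p in parts if p)


def format_test_line(source, sep, lang_tag=""):
    """Build an inference line: optional tag, source sentence, then the separator."""
    parts = [lang_tag, source.strip(), sep]
    return " ".join(p for p in parts if p)


if __name__ == "__main__":
    SEP = "[SEP]"  # hypothetical stand-in for the real separator token
    TAG = "[EN]"   # hypothetical stand-in for the source-language tag
    print(format_training_line(
        "Graphical artwork, corporate identity and corporate design.",
        "Grafische Gestaltung, Layout, Corporate Identity und Corporate Design.",
        SEP,
        TAG,
    ))
    print(format_test_line("Gutach: Increased safety for pedestrians", SEP, TAG))
```

Substitute the actual separator and tag strings used by the model before writing files with these helpers.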

## Training

Use the following command to train a prompt for translation:

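```bash
# Directory containing the thumt package
CODES=
# Pre-trained model checkpoint passed to --ptm
CKPT=
export PYTHONPATH=$CODES:$PYTHONPATH
export USE_TF=0
export USE_TORCH=1
# --parameters is quoted and kept on one line so the shell passes it as a single argument
python $CODES/thumt/bin/trainer.py \
    --half \
    --input  \
    --model  \
    --ptm $CKPT \
    --parameters="device_list=[0,1,2,3,4,5,6,7],train_steps=40000,update_cycle=16,batch_size=256,save_checkpoint_steps=2000,max_length=256" \
    --hparam_set base
```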

Here `model_name`, the value passed to `--model`, has the following three options:

* `mgpt_prompt`: mGPT with prompt tuning

* `mgpt_prefix`: mGPT with prefix-tuning

* `mgpt_msp`: mGPT with multi-stage prompting
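The `--parameters` option in the training command takes a single comma-separated string of `name=value` pairs. A minimal sketch of assembling that string in Python, for example when launching runs from a script, might look like the following. `build_parameters` is a hypothetical helper and not part of THUMT; the values simply mirror the training command.

```python
def build_parameters(params):
    """Render a dict as a comma-separated name=value string with no spaces."""
    items = []
    for name, value in params.items():
        if isinstance(value, (list, tuple)):
            # Lists such as device_list are written as [0,1,2] without spaces.
            value = "[" + ",".join(str(v) for v in value) + "]"
        items.append(f"{name}={value}")
    return ",".join(items)


if __name__ == "__main__":
    training = {
        "device_list": list(range(8)),
        "train_steps": 40000,
        "update_cycle": 16,
        "batch_size": 256,
        "save_checkpoint_steps": 2000,
        "max_length": 256,
    }
    print("--parameters=" + build_parameters(training))
```

Because the resulting string contains no whitespace, it reaches the trainer as one argument when passed through `subprocess.run` with an argument list.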

## Decoding

The following command decodes an input file:

```bash
CODES=
export PYTHONPATH=:$PYTHONPATH
# --parameters is quoted and kept on one line so the shell passes it as a single argument
python $CODES/thumt/bin/translator.py \
  --input  \
  --ptm  \
  --output  \
  --model  \
  --half --prefix  \
  --parameters="device_list=[0,1,2,3],decode_alpha=0.0,decode_batch_size=4,prompt_length=128"
```

## Postprocessing

We use `tools/punc.cpp` to replace punctuation marks for Chinese. Compile it with the following command:

```bash
g++ -std=c++11 -o punc tools/punc.cpp
```

Then use the following command to replace the punctuation marks:

```bash
cat  | ./punc | 
```
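For readers who want to see the general idea of this step without compiling the tool, here is a small Python sketch of character-level punctuation replacement. It is not a port of `tools/punc.cpp`: the mapping table is purely illustrative, it assumes the goal is to turn ASCII punctuation into full-width Chinese counterparts, and `ILLUSTRATIVE_MAP` and `replace_punctuation` are hypothetical names.

```python
# Illustrative mapping only; the actual rules live in tools/punc.cpp.
ILLUSTRATIVE_MAP = {
    ",": "，",
    "?": "？",
    "!": "！",
    ":": "：",
    ";": "；",
    "(": "（",
    ")": "）",
}

# str.maketrans builds a translation table from single-character keys.
_TABLE = str.maketrans(ILLUSTRATIVE_MAP)


def replace_punctuation(text):
    """Replace each mapped ASCII punctuation mark with its full-width form."""
    return text.translate(_TABLE)


if __name__ == "__main__":
    for line in ["你好,世界!", "真的吗?(是的)"]:
        print(replace_punctuation(line))
```

Use the compiled `punc` binary for actual postprocessing, since its rules may differ from this table.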

## License

This project is released under the [BSD-3-Clause](https://opensource.org/licenses/BSD-3-Clause) license, which allows free use for research purposes.

## Citation

```
@article{tan2021msp,
  title={{MSP}: Multi-stage prompting for making pre-trained language models better translators},
  author={Tan, Zhixing and Zhang, Xiangwen and Wang, Shuo and Liu, Yang},
  journal={arXiv preprint arXiv:2110.06609},
  year={2021}
}
```
