Ecosyste.ms: Awesome

An open API service indexing awesome lists of open source software.

Awesome Lists | Featured Topics | Projects

https://github.com/mlpnlp/mlpnlp-nmt

This is a sample code of "LSTM encoder-decoder with attention mechanism" mainly for understanding a recently developed machine translation framework based on deep neural networks.
https://github.com/mlpnlp/mlpnlp-nmt

Last synced: 9 days ago
JSON representation

This is a sample code of "LSTM encoder-decoder with attention mechanism" mainly for understanding a recently developed machine translation framework based on deep neural networks.

Host: GitHub
URL: https://github.com/mlpnlp/mlpnlp-nmt
Owner: mlpnlp
License: other
Created: 2017-08-11T23:56:36.000Z (over 7 years ago)
Default Branch: master
Last Pushed: 2019-03-14T15:10:38.000Z (over 5 years ago)
Last Synced: 2024-08-02T12:21:52.066Z (3 months ago)
Language: Python
Homepage:
Size: 1.12 MB
Stars: 42
Watchers: 5
Forks: 16
Open Issues: 1
Metadata Files:
- Readme: README.md
- License: LICENSE

Awesome Lists containing this project

README

        # mlpnlp-nmt

This is a sample code of "LSTM encoder-decoder with attention mechanism" mainly for understanding a recently developed machine translation framework based on deep neural networks.

# How to use

Please see the following example.

## Requirement

* Assumes "chainer" installed!! for installation, see https://chainer.org

## Data 

* Sample data from WMT16 page http://www.statmt.org/wmt16/translation-task.html

## Preprocess (vocab file) examples

```bash

for f in sample_data/newstest2012-4p.{en,de} ;do \

    echo ${f} ; \

    cat ${f} | sed '/^$/d' | perl -pe 's/^\s+//; s/\s+\n$/\n/; s/ +/\n/g'  | \

    LC_ALL=C sort | LC_ALL=C uniq -c | LC_ALL=C sort -r -g -k1 | \

    perl -pe 's/^\s+//; ($a1,$a2)=split;

       if( $a1 >= 3 ){ $_="$a2\t$a1\n" }else{ $_="" } ' > ${f}.vocab_t3_tab ;\

done

```

or

```bash

T=3; for f in sample_data/newstest2012-4p.{en,de} ;do \

    echo ${f} ; \

    cat ${f} | python count_freq.py ${T} > ${f}.vocab_t${T}_tab ; \

done

```

## Training 

Note that please run with ``GPU=-1`` option for no GPU environment

```bash

SLAN=de; TLAN=en; GPU=0;  EP=13 ;  \

MODEL=filename_of_sample_model.model ; \

python -u ./LSTMEncDecAttn.py -V2 \

   -T                      train \

   --gpu-enc               ${GPU} \

   --gpu-dec               ${GPU} \

   --enc-vocab-file        sample_data/newstest2012-4p.${SLAN}.vocab_t3_tab \

   --dec-vocab-file        sample_data/newstest2012-4p.${TLAN}.vocab_t3_tab \

   --enc-data-file         sample_data/newstest2012-4p.${SLAN} \

   --dec-data-file         sample_data/newstest2012-4p.${TLAN} \

   --enc-devel-data-file   sample_data/newstest2015.h100.${SLAN} \

   --dec-devel-data-file   sample_data/newstest2015.h100.${TLAN} \

   -D                          512 \

   -H                          512 \

   -N                          2 \

   --optimizer                 SGD \

   --lrate                     1.0 \

   --batch-size                32 \

   --out-each                  0 \

   --epoch                     ${EP} \

   --eval-accuracy             0 \

   --dropout-rate              0.3 \

   --attention-mode            1 \

   --gradient-clipping         5 \

   --initializer-scale         0.1 \

   --initializer-type          uniform \

   --merge-encoder-fwbw        0 \

   --use-encoder-bos-eos       0 \

   --use-decoder-inputfeed     1 \

   -O                          ${MODEL} \

```

## Evaluation

```bash

SLAN=de; GPU=0;  EP=13 ; BEAM=5 ;  \

MODEL=filename_of_sample_model.model ; \

python -u ./LSTMEncDecAttn.py \

   -T                  test \

   --gpu-enc           ${GPU} \

   --gpu-dec           ${GPU} \

   --enc-data-file     sample_data/newstest2015.h101-200.${SLAN} \

   --init-model        ${MODEL}.epoch${EP} \

   --setting           ${MODEL}.setting    \

   --beam-size         ${BEAM} \

   --max-length        150 \

   > ${MODEL}.epoch${EP}.decode_MAX${MAXLEN}_BEAM${BEAM}.txt

```