fairseq: Convolutional Sequence to Sequence Learning (Gehring et al. 2017) by Chainer
- Host: GitHub
- URL: https://github.com/soskek/convolutional_seq2seq
- Owner: soskek
- License: bsd-3-clause
- Created: 2017-05-30T12:48:18.000Z (almost 8 years ago)
- Default Branch: master
- Last Pushed: 2017-06-15T23:16:42.000Z (almost 8 years ago)
- Last Synced: 2025-03-28T20:21:17.523Z (about 1 month ago)
- Topics: chainer, deep-learning, deep-neural-networks, machine-learning, seq2seq
- Language: Python
- Homepage:
- Size: 27.3 KB
- Stars: 67
- Watchers: 5
- Forks: 16
- Open Issues: 0
- Metadata Files:
  - Readme: README.md
  - License: LICENSE
README
# Convolutional Sequence to Sequence Learning
[Chainer](https://github.com/chainer/chainer/)-based Python implementation of a convolutional seq2seq model. This is derived from Chainer's official [seq2seq example](https://github.com/chainer/chainer/tree/seq2seq-europal/examples/seq2seq).
See [Convolutional Sequence to Sequence Learning](https://arxiv.org/abs/1705.03122), Jonas Gehring, Michael Auli, David Grangier, Denis Yarats, Yann N. Dauphin, arXiv, 2017. [blog post](https://code.facebook.com/posts/1978007565818999/a-novel-approach-to-neural-machine-translation/), [Torch code](https://github.com/facebookresearch/fairseq).
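The core building block of the model is the paper's gated convolution: a 1-D convolution whose output channels are split in half, with one half gating the other through a sigmoid (a gated linear unit, GLU), plus a residual connection. Below is a minimal Chainer sketch written for illustration; it is not code from this repo, and it omits details such as the decoder's causal masking and the attention mechanism.

```python
# Illustrative sketch (not from this repo) of the paper's GLU conv block.
import chainer
import chainer.functions as F
import chainer.links as L
import numpy as np

class GLUConvBlock(chainer.Chain):
    def __init__(self, n_units, ksize=3):
        super().__init__()
        with self.init_scope():
            # The convolution emits 2 * n_units channels: one half carries
            # the values, the other half the gates. An odd ksize with
            # pad=ksize//2 keeps the sequence length unchanged.
            self.conv = L.ConvolutionND(
                1, n_units, 2 * n_units, ksize, pad=ksize // 2)

    def __call__(self, x):
        # x: (batch, n_units, length)
        h = self.conv(x)
        a, b = F.split_axis(h, 2, axis=1)  # values / gates
        return x + a * F.sigmoid(b)        # GLU plus residual connection

# Shape check on random data.
block = GLUConvBlock(8)
x = np.zeros((2, 8, 10), dtype=np.float32)
print(block(x).shape)  # (2, 8, 10)
```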
## Requirements
- Python 3.6.0+
- [Chainer](https://github.com/chainer/chainer/) 2.0.0+ (version 2.0.0 or later is strictly required)
- [numpy](https://github.com/numpy/numpy) 1.12.1+
- [cupy](https://github.com/cupy/cupy) 1.0.0+ (if using a GPU)
- and their dependencies
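A quick way to confirm that the environment meets these version requirements (an illustrative snippet, not part of the repo):

```python
# Sanity-check installed versions against the requirements above.
import chainer
import numpy

print('chainer', chainer.__version__)  # needs 2.0.0+
print('numpy', numpy.__version__)      # needs 1.12.1+
try:
    import cupy
    print('cupy', cupy.__version__)    # needed only for GPU runs
except ImportError:
    print('cupy not installed (CPU-only)')
```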
## Prepare Dataset
You can use any parallel corpus.
For example, run `download_wmt.sh`, which downloads and decompresses the [training dataset](http://www.statmt.org/europarl/v7/fr-en.tgz) and [development dataset](http://www.statmt.org/wmt15/dev-v2.tgz) from [WMT](http://www.statmt.org/wmt15/translation-task.html#download)/[Europarl](http://www.statmt.org/europarl/) into your current directory. These files and their paths are set as defaults in the training script `seq2seq.py`.
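Training and validation data are pairs of files with line-by-line sentence alignment: line *i* of the source file translates line *i* of the target file. A minimal sketch of reading such a pair (the file names come from the WMT Europarl archive and are used here only for illustration; `seq2seq.py` has its own loader):

```python
# Read a line-aligned parallel corpus into (source tokens, target tokens)
# pairs. File names are illustrative; seq2seq.py's loader may differ.
def load_parallel(source_path, target_path):
    with open(source_path, encoding='utf-8') as fs, \
         open(target_path, encoding='utf-8') as ft:
        for src, tgt in zip(fs, ft):
            # Line i of the source is aligned with line i of the target.
            yield src.strip().split(), tgt.strip().split()

pairs = list(load_parallel('europarl-v7.fr-en.fr', 'europarl-v7.fr-en.en'))
print(len(pairs))
```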
## How to Run
```
PYTHONIOENCODING=utf-8 python -u seq2seq.py -g=0 -i DATA_DIR -o SAVE_DIR
```
During training, loss, perplexity, word accuracy, and elapsed time are logged at a fixed interval, and validation tests (perplexity, plus BLEU for generation) run every half epoch. A generation test is also performed and printed so you can check training progress.
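As a reminder, the reported perplexity is simply the exponential of the per-word cross-entropy loss. A toy illustration (the numbers are made up, not repo output):

```python
# Perplexity is exp(per-word cross-entropy in nats); toy value only.
import numpy as np

loss = 2.0            # average cross-entropy per word
print(np.exp(loss))   # perplexity ~= 7.389
```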
### Arguments
- `-g`: your GPU ID. For CPU use, set `-1`.
- `-i DATA_DIR`, `-s SOURCE`, `-t TARGET`, `-svalid SVALID`, `-tvalid TVALID`:
  the `DATA_DIR` directory must contain the training dataset pair `SOURCE` and `TARGET` and the validation dataset pair `SVALID` and `TVALID`. Each pair should be a parallel corpus with line-by-line sentence alignment.
- `-o SAVE_DIR`: the JSON log report and a model snapshot are saved in the `SAVE_DIR` directory (created automatically if it does not exist).
- `-e`: maximum number of training epochs.
- `-b`: minibatch size.
- `-u`: size of units and word embeddings.
- `-l`: number of layers in both the encoder and the decoder.
- `--source-vocab`: maximum vocabulary size for the source language.
- `--target-vocab`: maximum vocabulary size for the target language.
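Putting the flags together, a full invocation might look like the following (the file names and hyperparameter values are only illustrative):

```
PYTHONIOENCODING=utf-8 python -u seq2seq.py -g=0 -i DATA_DIR -o SAVE_DIR \
  -s train.fr -t train.en -svalid dev.fr -tvalid dev.en \
  -e 50 -b 48 -u 512 -l 6 --source-vocab 40000 --target-vocab 40000
```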