https://github.com/ashly1991/transformer-nmt-tf2

Transformer neural machine translation in TensorFlow 2 with tensorflow-text; tutorial-based build with masks, positional encodings, and experiments.

attention encoder-decoder jupyter-notebook neural-machine-translation nmt tensorflow tensorflow-text transformer

# Transformer Neural Machine Translation (TensorFlow 2)

Self-contained implementation of a **Transformer** for **neural machine translation** in TensorFlow 2. The project covers tokenization, positional encodings, masking (padding & look-ahead), multi-head self-attention, encoder–decoder stacks, training, and inference/decoding.

## Highlights
- **Tokenization & vocab** (with `tensorflow-text`) and **positional encodings**.
- **Masks**: padding mask for attention and the loss; look-ahead (causal) mask for decoder self-attention.
- **Transformer blocks**: scaled dot-product attention, **multi-head attention**, FFN, residuals + layer norm.
- **Training loop** with cross-entropy + accuracy; masked loss to ignore padding.
- **Inference** (greedy by default; extendable to beam search).
- **Reproducibility**: seeds set in the notebook; notes on deterministic decoding.
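
As a rough illustration of how the masks feed into scaled dot-product attention, here is a minimal NumPy sketch. It is not the notebook's TensorFlow code: the padding id `PAD_ID = 0`, the function names, and the "True means may attend" mask convention are all assumptions made for this example.

```python
import numpy as np

PAD_ID = 0  # assumption: token id 0 is reserved for padding


def padding_mask(token_ids):
    """True where a position holds a real token, False where it is padding."""
    return np.asarray(token_ids) != PAD_ID


def look_ahead_mask(size):
    """Lower-triangular matrix: position i may attend to positions 0..i only."""
    return np.tril(np.ones((size, size), dtype=bool))


def scaled_dot_product_attention(q, k, v, mask=None):
    """Compute softmax(q k^T / sqrt(d_k)) v with disallowed positions masked.

    q: (..., len_q, d_k), k: (..., len_k, d_k), v: (..., len_k, d_v)
    mask: boolean, broadcastable to (..., len_q, len_k); True = may attend.
    """
    d_k = q.shape[-1]
    scores = q @ np.swapaxes(k, -1, -2) / np.sqrt(d_k)
    if mask is not None:
        # A large negative score becomes a (near-)zero weight after softmax.
        scores = np.where(mask, scores, -1e9)
    scores = scores - scores.max(axis=-1, keepdims=True)  # numerical stability
    weights = np.exp(scores)
    weights /= weights.sum(axis=-1, keepdims=True)
    return weights @ v, weights


# Decoder-style self-attention over 4 positions, the last one being padding.
rng = np.random.default_rng(0)
tokens = np.array([5, 7, 9, 0])
x = rng.normal(size=(4, 8))
mask = look_ahead_mask(4) & padding_mask(tokens)[None, :]
out, weights = scaled_dot_product_attention(x, x, x, mask)
```

In this sketch every row of `weights` sums to 1, entries above the diagonal are zero (no attention to future positions), and the column for the padded position is zero.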

## What I learned (from this build)
- How a Transformer uses self-attention, multi-head attention, and positional encodings to model sequences.
- Why positional encodings are needed: self-attention has no built-in notion of token order (without them, permuting the input tokens just permutes the outputs).
- How masks (padding & look-ahead) affect attention and the loss.
- Encoder–decoder structure; **teacher forcing** at training vs **auto-regressive** decoding at inference.
- Practical setup (tokenization, vocabularies, training loop, decoding).
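
To make the positional-encoding point concrete, below is a small NumPy sketch of sinusoidal encodings. It assumes the interleaved sine/cosine layout from the original Transformer paper; the notebook may use a different layout (for example, concatenating the sine and cosine halves), and the function name is hypothetical.

```python
import numpy as np


def positional_encoding(length, d_model):
    """Sinusoidal encodings of shape (length, d_model); d_model must be even."""
    positions = np.arange(length)[:, None]   # (length, 1)
    i = np.arange(d_model // 2)[None, :]     # (1, d_model / 2)
    angles = positions / np.power(10000.0, 2 * i / d_model)
    pe = np.zeros((length, d_model))
    pe[:, 0::2] = np.sin(angles)  # even feature indices
    pe[:, 1::2] = np.cos(angles)  # odd feature indices
    return pe


pe = positional_encoding(50, 16)
# Each position gets a distinct vector, which is added to the token embedding
# at that position so the model can tell identical tokens apart by location.
```

Because each row of `pe` is different, the same token embedding yields a different input vector at each position, which is what lets attention layers recover word order.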

## How to run
```bash
python -m venv .venv && source .venv/bin/activate # Windows: .venv\Scripts\activate
pip install -r requirements.txt
jupyter lab transformer-nmt.ipynb
```

## Requirements
`tensorflow` and `tensorflow-text` are pinned to matching versions for stability with this implementation; the remaining packages are unpinned:
```
tensorflow==2.14.0
tensorflow-text==2.14.0
tensorflow-datasets
numpy
matplotlib
jupyterlab
```
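
If the environment was assembled by hand, a quick check like the following can confirm that the installed `tensorflow` and `tensorflow-text` share the same major.minor version. This is only a sketch using the standard library; the helper names are hypothetical, and on some platforms TensorFlow is installed under a different distribution name, in which case `check_installed()` returns `None`.

```python
from importlib.metadata import PackageNotFoundError, version


def major_minor(ver):
    """Turn a version string such as '2.14.0' into the tuple (2, 14)."""
    major, minor = ver.split(".")[:2]
    return int(major), int(minor)


def versions_match(tf_version, text_version):
    """True if both version strings share the same major.minor."""
    return major_minor(tf_version) == major_minor(text_version)


def check_installed():
    """True/False for the installed packages, or None if one is not found."""
    try:
        return versions_match(version("tensorflow"), version("tensorflow-text"))
    except PackageNotFoundError:
        return None
```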

## Notes
- Greedy decoding is deterministic with fixed weights and dropout disabled. To ensure repeatable translations, set seeds and avoid sampling at inference.
- To use a different language pair, swap the tokenizers/vocabularies and adjust the final projection size to the new target vocabulary; the core architecture stays the same.
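
The greedy decoding note can be sketched as a plain loop. This is an illustration under stated assumptions, not the notebook's code: `START_ID` and `END_ID` are made-up token ids, and `toy_model` is a hypothetical stand-in for a forward pass of the trained Transformer with dropout disabled.

```python
import numpy as np

START_ID, END_ID = 1, 2  # assumption: ids of the start and end tokens


def greedy_decode(next_token_logits, source_ids, max_len=20):
    """Auto-regressive greedy decoding.

    next_token_logits(source_ids, target_so_far) returns a 1-D array of
    vocabulary logits for the next position.
    """
    output = [START_ID]
    for _ in range(max_len):
        logits = next_token_logits(source_ids, output)
        next_id = int(np.argmax(logits))  # argmax instead of sampling
        output.append(next_id)
        if next_id == END_ID:
            break
    return output


def toy_model(source_ids, target_so_far):
    """Hypothetical stand-in model: copies the source, then emits END_ID."""
    vocab_size = 10
    step = len(target_so_far) - 1  # number of tokens generated so far
    logits = np.zeros(vocab_size)
    next_id = source_ids[step] if step < len(source_ids) else END_ID
    logits[next_id] = 1.0
    return logits


first = greedy_decode(toy_model, [5, 7, 9])
second = greedy_decode(toy_model, [5, 7, 9])
```

Both calls return the same sequence because each step takes the argmax of the logits and nothing is sampled. Unlike training with teacher forcing, each step here consumes the model's own previous outputs rather than the reference translation.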

## License
MIT — see `LICENSE`.