https://github.com/aveek-saha/transformer
A TensorFlow 2.0 Implementation of the Transformer: Attention Is All You Need
attention-is-all-you-need attention-mechanism attention-network keras tensorflow2 transformer
- Host: GitHub
- URL: https://github.com/aveek-saha/transformer
- Owner: Aveek-Saha
- License: apache-2.0
- Created: 2020-08-15T05:40:52.000Z (about 5 years ago)
- Default Branch: master
- Last Pushed: 2020-08-25T07:36:21.000Z (about 5 years ago)
- Last Synced: 2025-02-15T20:24:27.379Z (8 months ago)
- Topics: attention-is-all-you-need, attention-mechanism, attention-network, keras, tensorflow2, transformer
- Language: Python
- Homepage:
- Size: 39.1 KB
- Stars: 1
- Watchers: 3
- Forks: 0
- Open Issues: 0
Metadata Files:
- Readme: README.md
- License: LICENSE
README
# Transformer
A TensorFlow 2.x implementation of the Transformer from [`Attention Is All You Need`](https://arxiv.org/pdf/1706.03762.pdf) (Ashish Vaswani, Noam Shazeer, Niki Parmar, Jakob Uszkoreit, Llion Jones, Aidan N. Gomez, Lukasz Kaiser, Illia Polosukhin, arXiv, 2017).
This is my attempt to understand and recreate the Transformer from the research paper. It is for my own understanding of the subject and is by no means perfect. To understand and implement the Transformer, I relied on various tutorials and code guides, which are linked in the Resources section.
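For orientation, the core operation of the paper, scaled dot-product attention, can be sketched in a few lines of NumPy. This is a minimal illustration and not the code in this repository: the function name `scaled_dot_product_attention`, the boolean `mask` convention (True means "may attend"), and the tensor shapes are assumptions chosen for the example.

```python
import numpy as np


def scaled_dot_product_attention(q, k, v, mask=None):
    """Compute softmax(q @ k^T / sqrt(d_k)) @ v.

    q: (..., seq_len_q, d_k)
    k: (..., seq_len_k, d_k)
    v: (..., seq_len_k, d_v)
    mask: optional boolean array broadcastable to (..., seq_len_q, seq_len_k);
          True marks positions that may be attended to (an assumed convention).
    """
    d_k = q.shape[-1]
    # Similarity of every query with every key, scaled by sqrt(d_k).
    scores = q @ np.swapaxes(k, -1, -2) / np.sqrt(d_k)
    if mask is not None:
        # Push masked positions towards -inf so they get ~0 weight.
        scores = np.where(mask, scores, -1e9)
    # Numerically stable softmax over the key axis.
    scores = scores - scores.max(axis=-1, keepdims=True)
    weights = np.exp(scores)
    weights = weights / weights.sum(axis=-1, keepdims=True)
    return weights @ v, weights


# Illustrative shapes: batch of 2, 4 queries, 6 keys, d_k = 8, d_v = 16.
rng = np.random.RandomState(0)
q = rng.normal(size=(2, 4, 8))
k = rng.normal(size=(2, 6, 8))
v = rng.normal(size=(2, 6, 16))

output, weights = scaled_dot_product_attention(q, k, v)
print(output.shape, weights.shape)  # (2, 4, 16) (2, 4, 6)
```

Passing a lower-triangular boolean mask, for example `np.tril(np.ones((n, n), dtype=bool))`, gives the causal masking used in the decoder, where each position can attend only to itself and earlier positions.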
## Requirements
- tensorflow==2.1.0
- numpy==1.16.5
- tensorflow_datasets==3.2.1
## How to run
`python train.py`
## Resources
- The original paper: https://arxiv.org/pdf/1706.03762.pdf
- Input and training pipeline: https://www.tensorflow.org/tutorials/text/transformer
- A useful article explaining the paper: https://mlexplained.com/2017/12/29/attention-is-all-you-need-explained/
- Another useful article explaining the paper: http://jalammar.github.io/illustrated-transformer/
- A TensorFlow 1.x Transformer implementation: https://github.com/Kyubyong/transformer/
- The official implementation in TensorFlow: https://github.com/tensorflow/tensor2tensor