{"id":13614667,"url":"https://github.com/thushv89/attention_keras","last_synced_at":"2025-04-13T18:33:38.710Z","repository":{"id":37601640,"uuid":"176034307","full_name":"thushv89/attention_keras","owner":"thushv89","description":"Keras Layer implementation of Attention for Sequential models","archived":false,"fork":false,"pushed_at":"2023-03-25T01:43:28.000Z","size":338,"stargazers_count":441,"open_issues_count":11,"forks_count":267,"subscribers_count":12,"default_branch":"master","last_synced_at":"2024-08-02T20:46:26.423Z","etag":null,"topics":["deep-learning","keras","lstm","rnn","tensorflow"],"latest_commit_sha":null,"homepage":"https://towardsdatascience.com/light-on-math-ml-attention-with-keras-dc8dbc1fad39","language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/thushv89.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"license.md","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null}},"created_at":"2019-03-16T23:07:20.000Z","updated_at":"2024-04-16T05:12:18.000Z","dependencies_parsed_at":"2023-01-21T12:49:11.422Z","dependency_job_id":null,"html_url":"https://github.com/thushv89/attention_keras","commit_stats":null,"previous_names":[],"tags_count":1,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/thushv89%2Fattention_keras","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/thushv89%2Fattention_keras/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/thushv89%2Fattention_keras/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/thushv89%2Fattention_keras/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners
/thushv89","download_url":"https://codeload.github.com/thushv89/attention_keras/tar.gz/refs/heads/master","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":223600419,"owners_count":17171665,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["deep-learning","keras","lstm","rnn","tensorflow"],"created_at":"2024-08-01T20:01:04.272Z","updated_at":"2024-11-07T22:31:56.313Z","avatar_url":"https://github.com/thushv89.png","language":"Python","funding_links":["https://www.buymeacoffee.com/thushv89"],"categories":["Attention","Python"],"sub_categories":["Repositories"],"readme":"# TensorFlow (Keras) Attention Layer for RNN-based models\n\n![![Build Status (CircleCI)](https://circleci.com/gh/circleci/circleci-docs.svg?style=sheild)](https://img.shields.io/circleci/build/gh/thushv89/attention_keras)\n\n## Versions\n- TensorFlow: 2.9.1 (Tested)\n- TensorFlow: 1.15.0 (Soon to be deprecated)\n\n## Introduction\n\nThis is a Keras layer implementation of attention for sequential models (only [Bahdanau Attention](https://arxiv.org/pdf/1409.0473.pdf) is supported right now).\n\n## Project structure\n\n```\ndata (Download data and place it here)\n |--- small_vocab_en.txt\n |--- small_vocab_fr.txt\nsrc\n |--- layers\n       |--- attention.py (Attention implementation)\n |--- examples\n       |--- nmt\n             |--- model.py (NMT model defined with Attention)\n             |--- train.py (Code for training/inferring/plotting attention with NMT model)\n       |--- nmt_bidirectional\n             |--- model.py (NMT bidirectional model defined with Attention)\n             |--- train.py (Code 
for training/inferring/plotting attention with NMT model)\n\n```\n## How to use\n\nUse it just like any other `tensorflow.python.keras.layers` object.\n\n```python\nfrom attention_keras.src.layers.attention import AttentionLayer\n\nattn_layer = AttentionLayer(name='attention_layer')\nattn_out, attn_states = attn_layer([encoder_outputs, decoder_outputs])\n\n```\n\nHere,\n\n- `encoder_outputs` - Sequence of encoder outputs returned by the RNN/LSTM/GRU (i.e. with `return_sequences=True`)\n- `decoder_outputs` - The same for the decoder\n- `attn_out` - Output context vector sequence for the decoder. Concatenate this with the decoder output (refer to `src/examples/nmt/model.py` for more details)\n- `attn_states` - Energy values, useful if you want to generate the attention heat map (refer to `src/examples/nmt/train.py` for usage)\n\n## Visualizing Attention weights\n\nAn example of visualizing attention weights can be seen in `src/examples/nmt/train.py`.\n\nAfter the model is trained, the attention result should look like the image below.\n\n![Attention heatmap](https://github.com/thushv89/attention_keras/blob/master/results/attention.png)\n\n## Running the NMT example\n\n### Prerequisites\n* To run the example, download `small_vocab_en.txt` and `small_vocab_fr.txt` from the [Udacity deep learning repository](https://github.com/udacity/deep-learning/tree/master/language-translation/data) and place them in the `data` folder.\n\n### Using the docker image\n* If you would like to run this in the docker environment, simply running `run.sh` will take you inside the docker container.\n* Example usage: `run.sh -v \u003cTF_VERSION\u003e [-g]`\n  * `-v` specifies the TensorFlow version (defaults to `latest`)\n  * `-g`, if specified, uses the GPU-compatible Docker image\n\n### Using a virtual environment\n* If you would like to use a virtual environment, first create and activate the virtual environment. 
\n* Then, install the dependencies with one of:\n  * `pip install -r requirements.txt -r requirements_tf_cpu.txt` (For CPU)\n  * `pip install -r requirements.txt -r requirements_tf_gpu.txt` (For GPU)\n\n### Running the code\n* Go to the \u003cproject dir\u003e. Run every example from the \u003cproject dir\u003e folder (the main folder). Otherwise, you will run into problems with finding/writing data.\n* Run `python3 src/examples/nmt/train.py`. Set `debug=True` if you need a simpler and faster run.\n* If the run is successful, you should have models saved in the model dir and `attention.png` in the `results` dir.\n\n## If you would like to show support\n\nIf you'd like to show your appreciation, you can [buy me a coffee](https://www.buymeacoffee.com/thushv89). No stress! It's totally optional. The support I receive would definitely be an added benefit that helps me maintain the repository and continue my other contributions.\n\n___\n\nIf you have improvements (e.g. other attention mechanisms), contributions are welcome!\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fthushv89%2Fattention_keras","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fthushv89%2Fattention_keras","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fthushv89%2Fattention_keras/lists"}