https://github.com/ivanbongiorni/attention-chatbot
Chatbot for Twitter Customer Support. A Seq2seq Neural Network with Multiplicative Attention mechanism implemented in TensorFlow 2.
- Host: GitHub
- URL: https://github.com/ivanbongiorni/attention-chatbot
- Owner: IvanBongiorni
- Created: 2020-03-08T15:11:41.000Z (almost 5 years ago)
- Default Branch: master
- Last Pushed: 2021-12-29T15:27:18.000Z (about 3 years ago)
- Last Synced: 2024-10-10T17:23:34.309Z (3 months ago)
- Topics: attention, attention-mechanism, chatbot, deeplearning, machine-learning, rnn, seq2seq, tensorflow, tensorflow2, twitter
- Language: Python
- Homepage:
- Size: 10 MB
- Stars: 5
- Watchers: 2
- Forks: 0
- Open Issues: 0
Metadata Files:
- Readme: README.md
README
# WARNING: WORK IN PROGRESS
This repository is not ready yet. Please don't clone or use it right now. A functioning prototype should be ready in the next few weeks.

# Attention-based Chatbot for Twitter Customer Support
This is a Chatbot for Twitter Customer Support.
It is implemented as a **Seq2seq RNN with Multiplicative Attention** mechanism.

Data have been taken from Kaggle's [Customer Support on Twitter](https://www.kaggle.com/thoughtvector/customer-support-on-twitter) dataset.
This dataset comprises tweet exchanges from multiple companies.
Each company requires a different model implementation.

## How it works
The model implemented is a **Seq2Seq** Neural Network with **LSTM** layers and **Luong's Multiplicative Attention**.
It is written in **TensorFlow 2.1** and optimized with **Autograph**.
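
As a rough illustration of the multiplicative attention named above, the NumPy sketch below computes one decoder step of Luong's "general" score, `score(h_t, h_s) = h_t^T W_a h_s`. It is not taken from this repository's `model.py`: the function name `luong_general_attention`, the tensor shapes, and the random weights are assumptions made for the example, and the actual model is written in TensorFlow rather than NumPy.

```python
import numpy as np


def luong_general_attention(decoder_state, encoder_outputs, W_a):
    """One step of Luong-style multiplicative ("general") attention.

    decoder_state:   (hidden,)          current decoder hidden state h_t
    encoder_outputs: (src_len, hidden)  encoder hidden states h_s
    W_a:             (hidden, hidden)   weight matrix (learned in a real model)
    """
    # score(h_t, h_s) = h_t^T W_a h_s for every source position s
    scores = (decoder_state @ W_a) @ encoder_outputs.T   # (src_len,)

    # Softmax over source positions (shifted by the max for numerical stability)
    scores = scores - scores.max()
    weights = np.exp(scores) / np.exp(scores).sum()      # (src_len,)

    # Context vector: attention-weighted average of the encoder states
    context = weights @ encoder_outputs                  # (hidden,)
    return context, weights


# Hypothetical sizes and random values, for illustration only
rng = np.random.default_rng(0)
hidden, src_len = 4, 5
encoder_outputs = rng.normal(size=(src_len, hidden))
decoder_state = rng.normal(size=hidden)
W_a = rng.normal(size=(hidden, hidden))

context, weights = luong_general_attention(decoder_state, encoder_outputs, W_a)
print("attention weights:", weights.round(3))
print("context vector shape:", context.shape)
```

In a trained model `W_a` would be learned together with the LSTM weights, and the context vector would be combined with the decoder state before predicting the next character.
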
The model is character-based, i.e. single characters are tokenized and predicted.
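
To make the character-level approach concrete, here is a minimal sketch of how a character-to-index dictionary could be built and used. The helper names (`build_char2idx`, `encode`, `decode`) and the `<pad>` and `<unk>` tokens are hypothetical and are not taken from this repository's `tools.py`.

```python
def build_char2idx(texts, pad_token="<pad>", unk_token="<unk>"):
    """Map every character seen in `texts` to an integer index.

    Indices 0 and 1 are reserved for padding and unknown characters
    (an assumption of this sketch, not necessarily the repository's choice).
    """
    char2idx = {pad_token: 0, unk_token: 1}
    for ch in sorted(set("".join(texts))):
        char2idx[ch] = len(char2idx)
    return char2idx


def encode(text, char2idx, unk_token="<unk>"):
    """Turn a string into a list of character indices."""
    unk = char2idx[unk_token]
    return [char2idx.get(ch, unk) for ch in text]


def decode(indices, char2idx, pad_token="<pad>"):
    """Turn a list of indices back into a string, skipping padding."""
    idx2char = {idx: ch for ch, idx in char2idx.items()}
    pad = char2idx[pad_token]
    return "".join(idx2char[i] for i in indices if i != pad)


# Toy tweets, for illustration only
tweets = ["hi there", "thanks!"]
char2idx = build_char2idx(tweets)

encoded = encode("thanks!", char2idx)
print(encoded)
print(decode(encoded, char2idx))
```

Characters that never appeared when the dictionary was built fall back to the `<unk>` index, so inputs containing unseen characters can still be encoded.
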
Training is implemented with **teacher forcing**.

## Structure of the Repository
Folders:
- `/data_raw`: uncompressed raw dataset must be pasted here.
- `/data_processed`: all pre-processed observations will be saved in `/Training`, `/Validation` and `/Test` sub-folders.
It also contains `.yaml` dictionaries that map each token (character) to its numeric index; their naming convention is `char2idx_{company}.yaml`.
- `/dataprep`: contains all data preprocessing scripts, named `dataprep_{company}.py`.
- `/tools`: helper functions used during data preparation.
One main `tools.py` module contains functions used for all models.
For more company-specific tools other modules are available as `tools_{company}.py`.
- `/saved_models`: where trained models are saved and/or loaded after launching `train.py`.
- `/talk`: contains scripts to be called from the terminal to chat with a trained model. The naming convention is again `talk_{company}.py`.

Files:
- `config.yaml`: main configuration file. Every hyperparameter and model choice can be decided here.
- `model.py`: model implementation.
- `train.py`: model training. The company's model and the data to train on can be chosen in `config.yaml`.

## Modules
```
langdetect==1.0.8
tensorflow==2.1.0
numpy==1.18.1
pandas==1.0.1
```

## Bibliography
Papers:
- *Luong, Minh-Thang, Hieu Pham, and Christopher D. Manning. "Effective approaches to attention-based neural machine translation." arXiv preprint arXiv:1508.04025 (2015).*

Other useful resources:
- The official TensorFlow tutorial on [Neural machine translation with attention](https://www.tensorflow.org/tutorials/text/nmt_with_attention).
- [Attention-based Sequence-to-Sequence in Keras](https://wanasit.github.io/attention-based-sequence-to-sequence-in-keras.html) on Wanasit Tanakitrungruang's [blog](https://wanasit.github.io/).