Ecosyste.ms: Awesome
An open API service indexing awesome lists of open source software.
https://github.com/ivanbongiorni/attention-chatbot
Chatbot for Twitter Customer Support. A Seq2seq Neural Network with Multiplicative Attention mechanism implemented in TensorFlow 2.
- Host: GitHub
- URL: https://github.com/ivanbongiorni/attention-chatbot
- Owner: IvanBongiorni
- Created: 2020-03-08T15:11:41.000Z (over 4 years ago)
- Default Branch: master
- Last Pushed: 2021-12-29T15:27:18.000Z (almost 3 years ago)
- Last Synced: 2024-10-10T17:23:34.309Z (27 days ago)
- Topics: attention, attention-mechanism, chatbot, deeplearning, machine-learning, rnn, seq2seq, tensorflow, tensorflow2, twitter
- Language: Python
- Homepage:
- Size: 10 MB
- Stars: 5
- Watchers: 2
- Forks: 0
- Open Issues: 0
Metadata Files:
- Readme: README.md
README
# WARNING: WORK IN PROGRESS
This repository is not ready yet. Please don't clone or use it right now. A functioning prototype should be ready in the coming weeks.

# Attention-based Chatbot for Twitter Customer Support
This is a Chatbot for Twitter Customer Support.
It is implemented as a **Seq2seq RNN with Multiplicative Attention** mechanism.

Data have been taken from Kaggle's [Customer Support on Twitter](https://www.kaggle.com/thoughtvector/customer-support-on-twitter) dataset.
This dataset contains tweet exchanges from multiple companies.
Each company requires a different model implementation.
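As an illustration of that per-company split, here is a minimal sketch of how (customer tweet, support reply) pairs could be extracted for a single company from the raw Kaggle CSV. The column names (`author_id`, `inbound`, `text`, `tweet_id`, `in_reply_to_status_id`) are those of the Kaggle dataset; the file path and the company handle are placeholders, and the repository's actual `dataprep_{company}.py` scripts may proceed differently.

```python
import pandas as pd

# Raw Kaggle dump: one row per tweet (path and company handle are placeholders).
df = pd.read_csv("data_raw/twcs.csv")
inbound = df["inbound"].astype(str) == "True"   # customer-authored tweets

company = "AppleSupport"
replies = (
    df[(df["author_id"] == company) & ~inbound]
    .dropna(subset=["in_reply_to_status_id"])
    .assign(in_reply_to_status_id=lambda d: d["in_reply_to_status_id"].astype("int64"))
)

# Join each support reply to the customer tweet it answers.
pairs = replies.merge(
    df[inbound],
    left_on="in_reply_to_status_id",
    right_on="tweet_id",
    suffixes=("_reply", "_customer"),
)

# (question, answer) text pairs, ready for character-level preprocessing.
qa = pairs[["text_customer", "text_reply"]]
```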
## How it works
The model implemented is a **Seq2Seq** Neural Network with **LSTM** layers and **Luong's Multiplicative Attention**.
It is written in **TensorFlow 2.1** and optimized with **Autograph**.
The model is character-based, i.e. single characters are tokenized and predicted.
Training is implemented with **teacher forcing**.
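Below is a minimal, self-contained sketch of that architecture, not the repository's actual `model.py`: an LSTM encoder, a decoder with Luong-style multiplicative ("general") attention, and one teacher-forced forward pass. All sizes are placeholder values; the real hyperparameters live in `config.yaml`.

```python
import tensorflow as tf

# Placeholder sizes; the real values would come from config.yaml.
VOCAB_SIZE, EMBED_DIM, UNITS = 120, 64, 256

class Encoder(tf.keras.Model):
    def __init__(self):
        super().__init__()
        self.embed = tf.keras.layers.Embedding(VOCAB_SIZE, EMBED_DIM)
        self.lstm = tf.keras.layers.LSTM(UNITS, return_sequences=True, return_state=True)

    def call(self, chars):
        x = self.embed(chars)                        # (batch, T_in, EMBED_DIM)
        out, h, c = self.lstm(x)                     # out: (batch, T_in, UNITS)
        return out, h, c

class LuongAttention(tf.keras.layers.Layer):
    """Multiplicative ('general') attention: score = dec_out @ W @ enc_out."""
    def __init__(self):
        super().__init__()
        self.W = tf.keras.layers.Dense(UNITS, use_bias=False)

    def call(self, dec_out, enc_out):
        scores = tf.matmul(dec_out, self.W(enc_out), transpose_b=True)  # (batch, T_out, T_in)
        weights = tf.nn.softmax(scores, axis=-1)
        context = tf.matmul(weights, enc_out)        # (batch, T_out, UNITS)
        return context

class Decoder(tf.keras.Model):
    def __init__(self):
        super().__init__()
        self.embed = tf.keras.layers.Embedding(VOCAB_SIZE, EMBED_DIM)
        self.lstm = tf.keras.layers.LSTM(UNITS, return_sequences=True, return_state=True)
        self.attention = LuongAttention()
        self.out = tf.keras.layers.Dense(VOCAB_SIZE)

    def call(self, chars, enc_out, state):
        x = self.embed(chars)
        dec_out, h, c = self.lstm(x, initial_state=state)
        context = self.attention(dec_out, enc_out)
        # Concatenate attention context and decoder states before the final projection.
        logits = self.out(tf.concat([context, dec_out], axis=-1))
        return logits, [h, c]

# Teacher forcing: the decoder is fed the ground-truth answer shifted right.
encoder, decoder = Encoder(), Decoder()
question = tf.random.uniform((8, 50), maxval=VOCAB_SIZE, dtype=tf.int32)
answer_in = tf.random.uniform((8, 60), maxval=VOCAB_SIZE, dtype=tf.int32)
enc_out, h, c = encoder(question)
logits, _ = decoder(answer_in, enc_out, [h, c])      # (8, 60, VOCAB_SIZE)
```

In Luong's multiplicative scoring the decoder states are compared to the encoder states through a learned matrix (the `Dense` layer `W` above), so the alignment weights are a softmax over the product of the decoder outputs, `W`, and the transposed encoder outputs.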
## Structure of the Repository
Folders:
- `/data_raw`: the uncompressed raw dataset must be placed here.
- `/data_processed`: all pre-processed observations will be saved in `/Training`, `/Validation` and `/Test` sub-folders.
It also contains `.yaml` dictionaries that map each token (character) to an index; their naming convention is `char2idx_{company}.yaml` (see the sketch after this list).
- `/dataprep`: contains all data preprocessing scripts, named `dataprep_{company}.py`.
- `/tools`: utility functions used during data preprocessing.
One main `tools.py` module contains functions shared by all models.
Company-specific tools live in additional modules named `tools_{company}.py`.
- `/saved_models`: where trained models are saved and/or loaded after launching `train.py`.
- `/talk`: scripts to be called from the terminal to chat with a trained model; the naming convention is again `talk_{company}.py`.
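As referenced in the `/data_processed` item above, here is a hedged sketch of how a `char2idx_{company}.yaml` dictionary could be built, saved, and reused to encode a tweet character by character. The file name, paths, and toy corpus are illustrative only.

```python
import os
import yaml
import numpy as np

corpus = ["@AppleSupport my phone won't charge", "Have you tried another cable?"]

# Build the character-to-index dictionary from every character seen in the corpus.
chars = sorted(set("".join(corpus)))
char2idx = {c: i for i, c in enumerate(chars)}

os.makedirs("data_processed", exist_ok=True)
with open("data_processed/char2idx_apple.yaml", "w") as f:
    yaml.dump(char2idx, f)

# Later (e.g. in a talk_{company}.py script) reload it and vectorize a tweet.
with open("data_processed/char2idx_apple.yaml") as f:
    char2idx = yaml.safe_load(f)

encoded = np.array([char2idx[c] for c in corpus[0]], dtype=np.int32)
```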
Files:
- `config.yaml`: main configuration file. Every hyperparameter and model choice can be set here (a minimal example is sketched after this list).
- `model.py`: model implementation.
- `train.py`: model training. The company's model and the data to train on are chosen in `config.yaml`.
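To illustrate how `train.py` could pick up these choices, here is a small sketch that parses a YAML configuration; the key names are hypothetical and may not match the actual `config.yaml`.

```python
import yaml

# Hypothetical configuration - the real config.yaml keys may differ.
example = """
company: apple
vocab_size: 120
embedding_dim: 64
lstm_units: 256
batch_size: 64
epochs: 20
learning_rate: 0.001
"""

params = yaml.safe_load(example)
print(params["company"], params["lstm_units"])   # -> apple 256
```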
## Modules
```
langdetect==1.0.8
tensorflow==2.1.0
numpy==1.18.1
pandas==1.0.1
```

## Bibliography
Papers:
- *Luong, Minh-Thang, Hieu Pham, and Christopher D. Manning. "Effective approaches to attention-based neural machine translation." arXiv preprint arXiv:1508.04025 (2015).*

Other useful resources:
- The official TensorFlow tutorial on [Neural machine translation with attention](https://www.tensorflow.org/tutorials/text/nmt_with_attention).
- [Attention-based Sequence-to-Sequence in Keras](https://wanasit.github.io/attention-based-sequence-to-sequence-in-keras.html) on Wanasit Tanakitrungruang's [blog](https://wanasit.github.io/).