Ecosyste.ms: Awesome
An open API service indexing awesome lists of open source software.
https://github.com/tqtg/hierarchical-attention-networks
TensorFlow implementation of the paper "Hierarchical Attention Networks for Document Classification"
attention-mechanism document-classification hierarchical-attention-networks sentiment-analysis tensorflow text-classification
- Host: GitHub
- URL: https://github.com/tqtg/hierarchical-attention-networks
- Owner: tqtg
- License: MIT
- Created: 2018-11-30T03:50:04.000Z (about 6 years ago)
- Default Branch: master
- Last Pushed: 2019-05-28T04:14:14.000Z (over 5 years ago)
- Last Synced: 2024-08-01T17:33:08.677Z (4 months ago)
- Topics: attention-mechanism, document-classification, hierarchical-attention-networks, sentiment-analysis, tensorflow, text-classification
- Language: Python
- Homepage: https://www.cs.cmu.edu/~hovy/papers/16HLT-hierarchical-attention-networks.pdf
- Size: 1.07 MB
- Stars: 85
- Watchers: 7
- Forks: 25
- Open Issues: 1
Metadata Files:
- Readme: README.md
- License: LICENSE
Awesome Lists containing this project
- awesome-tensorflow - Hierarchical Attention Networks - TensorFlow implementation of ["Hierarchical Attention Networks for Document Classification"](https://www.cs.cmu.edu/~hovy/papers/16HLT-hierarchical-attention-networks.pdf) (Models/Projects)
- fucking-awesome-tensorflow - Hierarchical Attention Networks - TensorFlow implementation of 🌎 ["Hierarchical Attention Networks for Document Classification"](www.cs.cmu.edu/~hovy/papers/16HLT-hierarchical-attention-networks.pdf) (Models/Projects)
README
# Hierarchical Attention Networks for Document Classification
This is an implementation of the paper [Hierarchical Attention Networks for Document Classification](https://www.cs.cmu.edu/~hovy/papers/16HLT-hierarchical-attention-networks.pdf), NAACL 2016.
![HAN model architecture](img/model.png)
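The model runs GRU encoders with attention first over the words of each sentence and then over the resulting sentence vectors. As a rough, self-contained sketch of such an attention layer in TensorFlow 1.x style (function and variable names here are illustrative, not this repository's actual code):

```python
import tensorflow as tf

def attention(inputs, att_dim):
    """Attention pooling over encoder outputs, as in the word/sentence
    attention of the paper. `inputs` has shape [batch, time, hidden].
    Wrap calls in tf.variable_scope(...) to use it at both levels."""
    # u_t = tanh(W h_t + b): project hidden states into the attention space
    u = tf.layers.dense(inputs, att_dim, activation=tf.tanh)
    # learned context vector (u_w in the paper)
    context = tf.get_variable("context_vector", shape=[att_dim],
                              initializer=tf.random_normal_initializer(stddev=0.1))
    # alpha_t = softmax(u_t . u_w)
    scores = tf.tensordot(u, context, axes=1)      # [batch, time]
    alphas = tf.nn.softmax(scores)                 # attention weights
    # s = sum_t alpha_t * h_t
    summary = tf.reduce_sum(inputs * tf.expand_dims(alphas, -1), axis=1)
    return summary, alphas
```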
## Requirements
- Python 3
- TensorFlow > 1.0
- Pandas
- NLTK
- tqdm
- [GloVe pre-trained word embeddings](http://nlp.stanford.edu/data/glove.6B.zip)

## Data
We use the [data](http://ir.hit.edu.cn/~dytang/paper/emnlp2015/emnlp-2015-data.7z) provided by [Tang et al. 2015](http://ir.hit.edu.cn/~dytang/paper/emnlp2015/emnlp2015.pdf), including 4 datasets:
- IMDB
- Yelp 2013
- Yelp 2014
- Yelp 2015

**Note:**
The original data seems to have an [issue](https://github.com/tqtg/hierarchical-attention-networks/issues/1) with unzipping. I re-uploaded the [data](https://drive.google.com/file/d/1OQ_ggjlNUWiTg_zFXc0_OpYXpJRwJP3y) to Google Drive for faster downloading. Please request access permission.

## Usage
First, download the [datasets](#data) and unzip them into the `data` folder.
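For reference, one way to extract the archive into `data/` (the archive name is taken from the link above; the Google Drive copy may be packaged differently):

```bash
# assumes p7zip is installed (e.g. `apt-get install p7zip-full`)
mkdir -p data
7z x emnlp-2015-data.7z -odata/
```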
Then, run the script to prepare the data *(defaults to the Yelp-2015 dataset)*:

```bash
python data_prepare.py
```
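As a rough sketch of the kind of hierarchical preprocessing this step performs (assumed here, using NLTK tokenizers and a simple word-index vocabulary; the actual `data_prepare.py` may differ):

```python
import nltk
from collections import Counter

# Assumed preprocessing: each document -> list of sentences -> list of word ids.
# Requires the NLTK tokenizer models: nltk.download('punkt')

def build_vocab(docs, max_size=50000):
    counter = Counter(w for d in docs
                      for s in nltk.sent_tokenize(d)
                      for w in nltk.word_tokenize(s.lower()))
    words = ["<pad>", "<unk>"] + [w for w, _ in counter.most_common(max_size)]
    return {w: i for i, w in enumerate(words)}

def doc_to_ids(doc, vocab):
    return [[vocab.get(w, vocab["<unk>"]) for w in nltk.word_tokenize(s.lower())]
            for s in nltk.sent_tokenize(doc)]
```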
Train and evaluate the model:
*(make sure the [GloVe embeddings](#requirements) are ready before training)*
```bash
wget http://nlp.stanford.edu/data/glove.6B.zip
unzip glove.6B.zip
```
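The training script presumably maps the vocabulary onto these vectors. A minimal sketch of building an embedding matrix from the 200-dimensional GloVe file (the file name comes from the zip above; `load_glove` and the `vocab` word-to-index dict are assumptions, not this repository's actual code):

```python
import numpy as np

def load_glove(path, vocab, emb_dim=200):
    # Words missing from GloVe keep a small random vector.
    emb = np.random.uniform(-0.1, 0.1, (len(vocab), emb_dim)).astype(np.float32)
    with open(path, encoding="utf-8") as f:
        for line in f:
            parts = line.rstrip().split(" ")
            word, vec = parts[0], parts[1:]
            if word in vocab:
                emb[vocab[word]] = np.asarray(vec, dtype=np.float32)
    return emb

# e.g. emb_matrix = load_glove("glove.6B.200d.txt", vocab)
```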
```bash
python train.py
```

Print training arguments:
```bash
python train.py --help
```
```
optional arguments:
  -h, --help            show this help message and exit
  --cell_dim CELL_DIM
                        Hidden dimensions of GRU cells (default: 50)
  --att_dim ATTENTION_DIM
                        Dimensionality of attention spaces (default: 100)
  --emb_dim EMBEDDING_DIM
                        Dimensionality of word embedding (default: 200)
  --learning_rate LEARNING_RATE
                        Learning rate (default: 0.0005)
  --max_grad_norm MAX_GRAD_NORM
                        Maximum value of the global norm of the gradients for clipping (default: 5.0)
  --dropout_rate DROPOUT_RATE
                        Probability of dropping neurons (default: 0.5)
  --num_classes NUM_CLASSES
                        Number of classes (default: 5)
  --num_checkpoints NUM_CHECKPOINTS
                        Number of checkpoints to store (default: 1)
  --num_epochs NUM_EPOCHS
                        Number of training epochs (default: 20)
  --batch_size BATCH_SIZE
                        Batch size (default: 64)
  --display_step DISPLAY_STEP
                        Number of steps to display log into TensorBoard (default: 20)
  --allow_soft_placement ALLOW_SOFT_PLACEMENT
                        Allow device soft device placement
```
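Any of these flags can be combined on the command line; for example (values here are illustrative, not tuned recommendations):

```bash
python train.py --num_epochs 5 --batch_size 32 --learning_rate 0.0005
```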
## Results

With the *Yelp-2015* dataset, after 5 epochs, we achieved:
- **69.79%** accuracy on the *dev set*
- **69.62%** accuracy on the *test set*

No systematic hyper-parameter tuning was performed. The result reported in the paper is **71.0%** for *Yelp-2015*.
![Training log](img/train_log.png)