Ecosyste.ms: Awesome
An open API service indexing awesome lists of open source software.
https://github.com/tqtg/hierarchical-attention-networks
TensorFlow implementation of the paper "Hierarchical Attention Networks for Document Classification"
attention-mechanism document-classification hierarchical-attention-networks sentiment-analysis tensorflow text-classification
- Host: GitHub
- URL: https://github.com/tqtg/hierarchical-attention-networks
- Owner: tqtg
- License: MIT
- Created: 2018-11-30T03:50:04.000Z (about 6 years ago)
- Default Branch: master
- Last Pushed: 2019-05-28T04:14:14.000Z (over 5 years ago)
- Last Synced: 2024-08-01T17:33:08.677Z (4 months ago)
- Topics: attention-mechanism, document-classification, hierarchical-attention-networks, sentiment-analysis, tensorflow, text-classification
- Language: Python
- Homepage: https://www.cs.cmu.edu/~hovy/papers/16HLT-hierarchical-attention-networks.pdf
- Size: 1.07 MB
- Stars: 85
- Watchers: 7
- Forks: 25
- Open Issues: 1
Metadata Files:
- Readme: README.md
- License: LICENSE
Awesome Lists containing this project
- awesome-tensorflow - Hierarchical Attention Networks - TensorFlow implementation of ["Hierarchical Attention Networks for Document Classification"](https://www.cs.cmu.edu/~hovy/papers/16HLT-hierarchical-attention-networks.pdf) (Models/Projects)
- fucking-awesome-tensorflow - Hierarchical Attention Networks - TensorFlow implementation of 🌎 ["Hierarchical Attention Networks for Document Classification"](www.cs.cmu.edu/~hovy/papers/16HLT-hierarchical-attention-networks.pdf) (Models/Projects)
README
# Hierarchical Attention Networks for Document Classification
This is an implementation of the paper [Hierarchical Attention Networks for Document Classification](https://www.cs.cmu.edu/~hovy/papers/16HLT-hierarchical-attention-networks.pdf), NAACL 2016.
![HAN model architecture](img/model.png)
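The model runs GRU encoders with attention first over the words of each sentence and then over the resulting sentence vectors. As a rough, self-contained sketch of such an attention layer in TensorFlow 1.x style (function and variable names here are illustrative, not this repository's actual code):

```python
import tensorflow as tf

def attention(inputs, att_dim):
    """Attention pooling over encoder outputs, as in the word/sentence
    attention of the paper. `inputs` has shape [batch, time, hidden].
    Wrap calls in tf.variable_scope(...) to use it at both levels."""
    # u_t = tanh(W h_t + b): project hidden states into the attention space
    u = tf.layers.dense(inputs, att_dim, activation=tf.tanh)
    # learned context vector (u_w in the paper)
    context = tf.get_variable("context_vector", shape=[att_dim],
                              initializer=tf.random_normal_initializer(stddev=0.1))
    # alpha_t = softmax(u_t . u_w)
    scores = tf.tensordot(u, context, axes=1)      # [batch, time]
    alphas = tf.nn.softmax(scores)                 # attention weights
    # s = sum_t alpha_t * h_t
    summary = tf.reduce_sum(inputs * tf.expand_dims(alphas, -1), axis=1)
    return summary, alphas
```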
## Requirements
- Python 3
- TensorFlow > 1.0
- Pandas
- NLTK
- tqdm
- [GloVe pre-trained word embeddings](http://nlp.stanford.edu/data/glove.6B.zip)

## Data
We use the [data](http://ir.hit.edu.cn/~dytang/paper/emnlp2015/emnlp-2015-data.7z) provided by [Tang et al. 2015](http://ir.hit.edu.cn/~dytang/paper/emnlp2015/emnlp2015.pdf), including 4 datasets:
- IMDB
- Yelp 2013
- Yelp 2014
- Yelp 2015

**Note:**
The original data seems to have an [issue](https://github.com/tqtg/hierarchical-attention-networks/issues/1) with unzipping. I re-uploaded the [data](https://drive.google.com/file/d/1OQ_ggjlNUWiTg_zFXc0_OpYXpJRwJP3y) to Google Drive for faster downloading. Please request access permission.

## Usage
First, download the [datasets](#data) and unzip them into the `data` folder.
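For reference, one way to extract the archive into `data/` (the archive name is taken from the link above; the Google Drive copy may be packaged differently):

```bash
# assumes p7zip is installed (e.g. `apt-get install p7zip-full`)
mkdir -p data
7z x emnlp-2015-data.7z -odata/
```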
Then, run the script to prepare the data *(defaults to the Yelp-2015 dataset)*:

```bash
python data_prepare.py
```
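As a rough sketch of the kind of hierarchical preprocessing this step performs (assumed here, using NLTK tokenizers and a simple word-index vocabulary; the actual `data_prepare.py` may differ):

```python
import nltk
from collections import Counter

# Assumed preprocessing: each document -> list of sentences -> list of word ids.
# Requires the NLTK tokenizer models: nltk.download('punkt')

def build_vocab(docs, max_size=50000):
    counter = Counter(w for d in docs
                      for s in nltk.sent_tokenize(d)
                      for w in nltk.word_tokenize(s.lower()))
    words = ["<pad>", "<unk>"] + [w for w, _ in counter.most_common(max_size)]
    return {w: i for i, w in enumerate(words)}

def doc_to_ids(doc, vocab):
    return [[vocab.get(w, vocab["<unk>"]) for w in nltk.word_tokenize(s.lower())]
            for s in nltk.sent_tokenize(doc)]
```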
Train and evaluate the model:
*(make sure the [GloVe embeddings](#requirements) are ready before training)*
```bash
wget http://nlp.stanford.edu/data/glove.6B.zip
unzip glove.6B.zip
```
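The training script presumably maps the vocabulary onto these vectors. A minimal sketch of building an embedding matrix from the 200-dimensional GloVe file (the file name comes from the zip above; `load_glove` and the `vocab` word-to-index dict are assumptions, not this repository's actual code):

```python
import numpy as np

def load_glove(path, vocab, emb_dim=200):
    # Words missing from GloVe keep a small random vector.
    emb = np.random.uniform(-0.1, 0.1, (len(vocab), emb_dim)).astype(np.float32)
    with open(path, encoding="utf-8") as f:
        for line in f:
            parts = line.rstrip().split(" ")
            word, vec = parts[0], parts[1:]
            if word in vocab:
                emb[vocab[word]] = np.asarray(vec, dtype=np.float32)
    return emb

# e.g. emb_matrix = load_glove("glove.6B.200d.txt", vocab)
```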
```bash
python train.py
```

Print training arguments:
```bash
python train.py --help
```
```
optional arguments:
  -h, --help            show this help message and exit
  --cell_dim CELL_DIM
                        Hidden dimensions of GRU cells (default: 50)
  --att_dim ATTENTION_DIM
                        Dimensionality of attention spaces (default: 100)
  --emb_dim EMBEDDING_DIM
                        Dimensionality of word embedding (default: 200)
  --learning_rate LEARNING_RATE
                        Learning rate (default: 0.0005)
  --max_grad_norm MAX_GRAD_NORM
                        Maximum value of the global norm of the gradients for clipping (default: 5.0)
  --dropout_rate DROPOUT_RATE
                        Probability of dropping neurons (default: 0.5)
  --num_classes NUM_CLASSES
                        Number of classes (default: 5)
  --num_checkpoints NUM_CHECKPOINTS
                        Number of checkpoints to store (default: 1)
  --num_epochs NUM_EPOCHS
                        Number of training epochs (default: 20)
  --batch_size BATCH_SIZE
                        Batch size (default: 64)
  --display_step DISPLAY_STEP
                        Number of steps to display log into TensorBoard (default: 20)
  --allow_soft_placement ALLOW_SOFT_PLACEMENT
                        Allow device soft device placement
```
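Any of these flags can be combined on the command line; for example (values here are illustrative, not tuned recommendations):

```bash
python train.py --num_epochs 5 --batch_size 32 --learning_rate 0.0005
```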
## Results

With the *Yelp-2015* dataset, after 5 epochs, we achieved:
- **69.79%** accuracy on the *dev set*
- **69.62%** accuracy on the *test set*

No systematic hyper-parameter tuning was performed. The result reported in the paper is **71.0%** for *Yelp-2015*.
![Training log](img/train_log.png)