https://github.com/AlexGidiotis/Document-Classifier-LSTM
A bidirectional LSTM with attention for multiclass/multilabel text classification.
- Host: GitHub
- URL: https://github.com/AlexGidiotis/Document-Classifier-LSTM
- Owner: AlexGidiotis
- License: mit
- Created: 2017-08-20T17:32:57.000Z (almost 8 years ago)
- Default Branch: master
- Last Pushed: 2024-08-30T23:53:12.000Z (9 months ago)
- Last Synced: 2024-11-06T02:38:46.922Z (7 months ago)
- Topics: arxiv, attention-mechanism, hierarchical-attention-networks, keras, lstm, multilabel-multiclass, recurrent-neural-networks, tensorflow, text-classification
- Language: Python
- Homepage:
- Size: 82 KB
- Stars: 171
- Watchers: 6
- Forks: 52
- Open Issues: 8
Metadata Files:
- Readme: README.md
- License: LICENSE
# Document-Classifier-LSTM
Recurrent neural networks for multiclass, multilabel classification of texts. The models learn to tag small texts with 169 different tags from arXiv. classifier.py implements a standard BLSTM network with attention.
In hatt_classifier.py you can find the implementation of [Hierarchical Attention Networks for Document Classification](https://www.cs.cmu.edu/~hovy/papers/16HLT-hierarchical-attention-networks.pdf).
The neural networks were built using Keras and Tensorflow.
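For orientation, here is a minimal sketch of what such a BLSTM-with-attention classifier can look like in Keras; the layer sizes and variable names are illustrative assumptions, not the exact architecture in classifier.py.

```python
# A minimal sketch of a BLSTM-with-attention multilabel classifier in Keras.
# Sizes and names are illustrative, not the exact architecture in classifier.py.
import tensorflow as tf
from tensorflow.keras import Model, layers

MAX_LEN, VOCAB_SIZE, EMB_DIM, NUM_TAGS = 200, 50000, 300, 169

inputs = layers.Input(shape=(MAX_LEN,), dtype="int32")
x = layers.Embedding(VOCAB_SIZE, EMB_DIM)(inputs)
# Bidirectional LSTM keeps one hidden state per time step for attention.
h = layers.Bidirectional(layers.LSTM(128, return_sequences=True))(x)
# Simple additive attention: score each time step, softmax over time,
# then take the attention-weighted sum of the hidden states.
scores = layers.Dense(1, activation="tanh")(h)
weights = layers.Softmax(axis=1)(scores)
context = layers.Lambda(lambda t: tf.reduce_sum(t[0] * t[1], axis=1))([h, weights])
# Sigmoid outputs + binary cross-entropy: each abstract may carry several tags.
outputs = layers.Dense(NUM_TAGS, activation="sigmoid")(context)

model = Model(inputs, outputs)
model.compile(optimizer="adam", loss="binary_crossentropy")
```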
The best performing model is the attention BLSTM, which achieves a micro F1-score of 0.67 on the test set.
The Hierarchical Attention Network achieves only a 0.65 micro F1-score.
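Micro averaging pools true positives, false positives, and false negatives across all 169 labels before computing precision and recall. A toy scikit-learn illustration of the metric (not the repository's evaluation code):

```python
# Toy illustration of micro-averaged F1 for multilabel predictions.
import numpy as np
from sklearn.metrics import f1_score

y_true = np.array([[1, 0, 1], [0, 1, 0]])
y_pred = np.array([[1, 0, 0], [0, 1, 1]])
# Pooled over all labels: TP=2, FP=1, FN=1 -> precision = recall = 2/3.
print(f1_score(y_true, y_pred, average="micro"))  # 0.666...
```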
I am using 500k paper abstracts from arXiv. To download your own data, refer to the [arxiv OAI api](https://arxiv.org/help/bulk_data).
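If you want to harvest abstracts yourself, a minimal sketch of a single OAI-PMH request is below; the endpoint and oai_dc format are standard OAI-PMH, but check the arXiv bulk-data docs for resumption tokens and rate limits before harvesting at scale.

```python
# A sketch of one OAI-PMH request against arXiv, using the third-party
# "requests" library. Real harvesting must follow resumption tokens.
import xml.etree.ElementTree as ET
import requests

OAI_URL = "http://export.arxiv.org/oai2"
NS = {"oai": "http://www.openarchives.org/OAI/2.0/",
      "dc": "http://purl.org/dc/elements/1.1/"}

resp = requests.get(OAI_URL, params={"verb": "ListRecords",
                                     "metadataPrefix": "oai_dc",
                                     "set": "cs"})
root = ET.fromstring(resp.text)
for record in root.iter("{http://www.openarchives.org/OAI/2.0/}record"):
    abstract = record.find(".//dc:description", NS)
    subjects = [s.text for s in record.findall(".//dc:subject", NS)]
    if abstract is not None:
        print(subjects, abstract.text[:80])
```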
Pretrained word embeddings can be used, either GloVe or Word2Vec. You can download [GoogleNews-vectors-negative300.bin](https://code.google.com/archive/p/word2vec) or the [GloVe embeddings](https://nlp.stanford.edu/projects/glove).
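To feed either set of vectors into the network, they are typically loaded into a matrix indexed by the tokenizer's word ids. A sketch, assuming the 6B/300d GloVe file name; a Word2Vec .bin file would instead be read with gensim's KeyedVectors:

```python
# A sketch of loading GloVe vectors into an embedding matrix keyed by the
# tokenizer's word index. File name is an assumption about which download you use.
import numpy as np

EMB_DIM = 300
vectors = {}
with open("glove.6B.300d.txt", encoding="utf-8") as f:
    for line in f:
        parts = line.rstrip().split(" ")
        vectors[parts[0]] = np.asarray(parts[1:], dtype="float32")

def build_matrix(word_index, vocab_size):
    """word_index maps token -> integer id assigned by the tokenizer."""
    matrix = np.zeros((vocab_size, EMB_DIM))
    for word, i in word_index.items():
        if i < vocab_size and word in vectors:
            matrix[i] = vectors[word]
    return matrix  # pass as initial weights to the Keras Embedding layer
```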
## Usage:
1) To train your own model, first prepare your data set using the data_prep.py script. The preprocessing converts text to lower case, tokenizes it, and removes very short words (see the preprocessing sketch after this list). The preprocessed files and label files should be saved in a /data folder.
2) You can now run classifier.py or hatt_classifier.py to build and train the models.
3) The trained models are exported to JSON and the weights to HDF5 (.h5) for later use (see the reloading sketch after this list).
4) You can use utils.visualize_attention to visualize the attention weights.
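As referenced in step 1, here is a minimal sketch of that preprocessing; the length threshold is illustrative, and data_prep.py's exact rules may differ.

```python
# Lowercase, tokenize with NLTK, drop very short tokens.
from nltk.tokenize import word_tokenize  # run nltk.download("punkt") once, if needed

def preprocess(text, min_len=3):
    tokens = word_tokenize(text.lower())
    return [t for t in tokens if len(t) >= min_len]

print(preprocess("A bidirectional LSTM with attention for text classification."))
# ['bidirectional', 'lstm', 'with', 'attention', 'for', 'text', 'classification']
```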
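And for step 3, reloading an exported model might look like the following; the file names are assumptions, and any custom attention layer has to be supplied via custom_objects.

```python
# A sketch of reloading a model exported to JSON (architecture) and h5 (weights).
from tensorflow.keras.models import model_from_json

with open("model.json") as f:
    model = model_from_json(f.read())  # add custom_objects={...} if needed
model.load_weights("model.h5")
model.compile(optimizer="adam", loss="binary_crossentropy")
```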
## Requirements
- Python
- NLTK
- NumPy
- Pandas
- SciPy
- OpenCV
- scikit-learn
- [Tensorflow](https://github.com/tensorflow/tensorflow)
- [Keras](https://github.com/fchollet/keras)

Run `pip install -r requirements.txt` to install the requirements.