https://github.com/AnubhavGupta3377/Text-Classification-Models-Pytorch

# Text-Classification-Models-Pytorch
Implementation of state-of-the-art text classification models in PyTorch

## Implemented Models
- **fastText:** fastText Model from [Bag of Tricks for Efficient Text Classification](https://arxiv.org/abs/1607.01759)
- **TextCNN:** CNN for text classification proposed in [Convolutional Neural Networks for Sentence Classification](https://arxiv.org/abs/1408.5882)
- **TextRNN:** Bidirectional LSTM network for text classification
- **RCNN:** Implementation of RCNN Model proposed in [Recurrent Convolutional Neural Networks for Text Classification](https://www.aaai.org/ocs/index.php/AAAI/AAAI15/paper/download/9745/9552)
- **CharCNN:** Implementation of character-level CNNs as proposed in the paper [Character-level Convolutional Networks for Text Classification](https://papers.nips.cc/paper/5782-character-level-convolutional-networks-for-text-classification.pdf)
- **Seq2seq With Attention:** Implementation of seq2seq model with attention from [Neural Machine Translation by Jointly Learning to Align and Translate](https://arxiv.org/pdf/1409.0473.pdf) and [Text Classification Research with Attention-based Recurrent Neural Networks](http://univagora.ro/jour/index.php/ijccc/article/download/3142/pdf)
- **Transformer:** Implementation of Transformer model proposed in [Attention Is All You Need](https://arxiv.org/abs/1706.03762)
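
To give a feel for the simplest model in the list, here is a minimal sketch of the core fastText idea: average the word embeddings of a text and feed the result to a linear classifier. This is not the code from this repository. The class name `FastTextSketch`, the vocabulary size, the embedding size, and the random input batch are all assumptions chosen for illustration, and the n-gram features and hierarchical softmax from the paper are left out.

```python
import torch
import torch.nn as nn


class FastTextSketch(nn.Module):
    """Hypothetical example: mean of word embeddings followed by a linear layer."""

    def __init__(self, vocab_size, embed_dim, num_classes):
        super().__init__()
        # EmbeddingBag with mode="mean" looks up the embeddings of every token
        # in a text and averages them in a single step.
        self.embedding = nn.EmbeddingBag(vocab_size, embed_dim, mode="mean")
        self.fc = nn.Linear(embed_dim, num_classes)

    def forward(self, token_ids):
        # token_ids: integer tensor of shape (batch, seq_len)
        # returns:   logits of shape (batch, num_classes)
        return self.fc(self.embedding(token_ids))


torch.manual_seed(0)
# Illustrative sizes only; 4 classes matches the number of AG News categories.
model = FastTextSketch(vocab_size=100, embed_dim=16, num_classes=4)
batch = torch.randint(0, 100, (8, 20))  # 8 fake texts, 20 token ids each
logits = model(batch)
predictions = logits.argmax(dim=1)      # predicted class index per text
```

The other models replace the averaging step with a convolutional, recurrent, or attention-based encoder, while the final linear layer over class logits stays conceptually the same.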

## Requirements
- Python 3.5.0
- pandas 0.23.4
- NumPy 1.15.2
- spaCy 2.0.13
- PyTorch 0.4.1.post2
- torchtext 0.3.1

## Usage
1) Download the data into the "data/" directory, or use the data already provided
2) If you use your own data, convert it to the same format as the provided data
3) Download pre-trained word embeddings (GloVe/Word2Vec) into the "data/" directory
4) Go to the directory of the model you want to train
5) Run the following command:

python train.py
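
For step 3, it can help to know what a pre-trained embedding file looks like. The sketch below assumes the plain-text GloVe layout, where each line holds a token followed by its space-separated vector components. The function name `load_glove_vectors` is hypothetical and is not part of this repository, which may load embeddings differently (for example through torchtext). Word2Vec text files usually start with an extra header line that this sketch does not handle.

```python
import io


def load_glove_vectors(lines):
    """Parse GloVe-style text lines into a dict mapping token -> list of floats."""
    vectors = {}
    for line in lines:
        parts = line.rstrip().split(" ")
        if len(parts) < 2:
            continue  # skip blank or malformed lines
        vectors[parts[0]] = [float(x) for x in parts[1:]]
    return vectors


# Tiny in-memory stand-in for a real embedding file placed under "data/".
sample = io.StringIO("the 0.1 0.2 0.3\ncat 0.4 0.5 0.6\n")
embeddings = load_glove_vectors(sample)
```

With a real file, the same function can be given an open file handle, for example `with open(path, encoding="utf-8") as f: load_glove_vectors(f)`, where `path` points at the file you downloaded.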

## Model Performance
- All models were run on a 14 GB machine with 2 cores and one NVIDIA Tesla K80 GPU.
- The runtime in the table below includes the time to load and process the data as well as the time to run the model.
- Model parameters were not tuned, so better performance can likely be achieved with some parameter tuning.


| Model | AG_News Accuracy (%) | AG_News Runtime (min) | Query_Well_formedness Accuracy (%) | Query_Well_formedness Runtime (min) |
|---|---|---|---|---|
| fastText | 89.46 | 16.0 | 62.10 | 7.0 |
| TextCNN | 88.57 | 17.2 | 67.38 | 7.43 |
| TextRNN | 88.07 (Seq len = 20) / 90.43 (Flexible seq len) | 21.5 / 36.8 | 68.29 / 66.29 | 7.69 / 7.25 |
| RCNN | 90.61 | 22.73 | 66.70 | 7.21 |
| CharCNN | 87.70 | 13.08 | 68.83 | 2.49 |
| Seq2Seq_Attention | 90.26 | 19.10 | 67.84 | 7.36 |
| Transformer | 88.54 | 46.47 | 63.43 | 5.77 |
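
As a rough illustration of how the two reported quantities can be measured, the sketch below computes accuracy as a percentage and wall-clock runtime in minutes. The helper `accuracy_percent` and the toy label lists are assumptions for illustration, not the repository's evaluation code.

```python
import time


def accuracy_percent(predicted, actual):
    """Return the percentage of positions where predicted and actual labels match."""
    if len(predicted) != len(actual) or not actual:
        raise ValueError("label lists must be non-empty and of equal length")
    correct = sum(1 for p, a in zip(predicted, actual) if p == a)
    return 100.0 * correct / len(actual)


start = time.perf_counter()
# In a real run, data loading, preprocessing, training, and prediction would
# all happen between the two timer readings, as described in the notes above.
predicted = [0, 2, 1, 3, 3]  # toy predictions
actual = [0, 2, 1, 1, 3]     # toy ground-truth labels
acc = accuracy_percent(predicted, actual)  # 4 of 5 correct
elapsed_minutes = (time.perf_counter() - start) / 60.0
```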

## References
[1] [Bag of Tricks for Efficient Text Classification](https://arxiv.org/abs/1607.01759)
[2] [Convolutional Neural Networks for Sentence Classification](https://arxiv.org/abs/1408.5882)
[3] [Recurrent Convolutional Neural Networks for Text Classification](https://www.aaai.org/ocs/index.php/AAAI/AAAI15/paper/download/9745/9552)
[4] [Character-level Convolutional Networks for Text Classification](https://papers.nips.cc/paper/5782-character-level-convolutional-networks-for-text-classification.pdf)
[5] [Neural Machine Translation by Jointly Learning to Align and Translate](https://arxiv.org/pdf/1409.0473.pdf)
[6] [Text Classification Research with Attention-based Recurrent Neural Networks](http://univagora.ro/jour/index.php/ijccc/article/download/3142/pdf)
[7] [Attention Is All You Need](https://arxiv.org/abs/1706.03762)
[8] [Rethinking the Inception Architecture for Computer Vision](https://arxiv.org/pdf/1705.03122.pdf)
[9] [Identifying Well-formed Natural Language Questions](https://arxiv.org/pdf/1808.09419.pdf)