Ecosyste.ms: Awesome
An open API service indexing awesome lists of open source software.
https://github.com/yoonkim/CNN_sentence
CNNs for sentence classification
https://github.com/yoonkim/CNN_sentence
Last synced: 8 days ago
JSON representation
CNNs for sentence classification
- Host: GitHub
- URL: https://github.com/yoonkim/CNN_sentence
- Owner: yoonkim
- Created: 2014-12-04T03:33:02.000Z (almost 10 years ago)
- Default Branch: master
- Last Pushed: 2018-04-26T11:15:57.000Z (over 6 years ago)
- Last Synced: 2024-10-29T17:49:35.364Z (15 days ago)
- Language: Python
- Size: 494 KB
- Stars: 2,050
- Watchers: 86
- Forks: 826
- Open Issues: 42
-
Metadata Files:
- Readme: README.md
Awesome Lists containing this project
README
## Convolutional Neural Networks for Sentence Classification
Code for the paper [Convolutional Neural Networks for Sentence Classification](http://arxiv.org/abs/1408.5882) (EMNLP 2014).Runs the model on Pang and Lee's movie review dataset (MR in the paper).
Please cite the original paper when using the data.### Requirements
Code is written in Python (2.7) and requires Theano (0.7).Using the pre-trained `word2vec` vectors will also require downloading the binary file from
https://code.google.com/p/word2vec/### Data Preprocessing
To process the raw data, run```
python process_data.py path
```where path points to the word2vec binary file (i.e. `GoogleNews-vectors-negative300.bin` file).
This will create a pickle object called `mr.p` in the same folder, which contains the dataset
in the right format.Note: This will create the dataset with different fold-assignments than was used in the paper.
You should still be getting a CV score of >81% with CNN-nonstatic model, though.### Running the models (CPU)
Example commands:```
THEANO_FLAGS=mode=FAST_RUN,device=cpu,floatX=float32 python conv_net_sentence.py -nonstatic -rand
THEANO_FLAGS=mode=FAST_RUN,device=cpu,floatX=float32 python conv_net_sentence.py -static -word2vec
THEANO_FLAGS=mode=FAST_RUN,device=cpu,floatX=float32 python conv_net_sentence.py -nonstatic -word2vec
```This will run the CNN-rand, CNN-static, and CNN-nonstatic models respectively in the paper.
### Using the GPU
GPU will result in a good 10x to 20x speed-up, so it is highly recommended.
To use the GPU, simply change `device=cpu` to `device=gpu` (or whichever gpu you are using).
For example:
```
THEANO_FLAGS=mode=FAST_RUN,device=gpu,floatX=float32 python conv_net_sentence.py -nonstatic -word2vec
```### Example output
CPU output:
```
epoch: 1, training time: 219.72 secs, train perf: 81.79 %, val perf: 79.26 %
epoch: 2, training time: 219.55 secs, train perf: 82.64 %, val perf: 76.84 %
epoch: 3, training time: 219.54 secs, train perf: 92.06 %, val perf: 80.95 %
```
GPU output:
```
epoch: 1, training time: 16.49 secs, train perf: 81.80 %, val perf: 78.32 %
epoch: 2, training time: 16.12 secs, train perf: 82.53 %, val perf: 76.74 %
epoch: 3, training time: 16.16 secs, train perf: 91.87 %, val perf: 81.37 %
```### Other Implementations
#### TensorFlow
[Denny Britz](http://www.wildml.com) has an implementation of the model in TensorFlow:https://github.com/dennybritz/cnn-text-classification-tf
He also wrote a [nice tutorial](http://www.wildml.com/2015/12/implementing-a-cnn-for-text-classification-in-tensorflow) on it, as well as a general tutorial on [CNNs for NLP](http://www.wildml.com/2015/11/understanding-convolutional-neural-networks-for-nlp).
#### Torch
[HarvardNLP](http://harvardnlp.github.io/) group has an implementation in Torch.https://github.com/harvardnlp/sent-conv-torch
### Hyperparameters
At the time of my original experiments I did not have access to a GPU so I could not run a lot of different experiments.
Hence the paper is missing a lot of things like ablation studies and variance in performance, and some of the conclusions
were premature (e.g. regularization does not always seem to help).Ye Zhang has written a [very nice paper](http://arxiv.org/abs/1510.03820) doing an extensive analysis of model variants (e.g. filter widths, k-max pooling, word2vec vs Glove, etc.) and their effect on performance.