# Simple sentiment analysis examples
Sentiment analysis applied to different datasets such as IMDB. The perceptron implementation uses [Keras](http://keras.io/), a minimalist, highly modular neural-network library written in Python that can run on top of either TensorFlow or Theano.
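
For illustration, a single-layer perceptron classifier in Keras can be sketched as below. The input dimension and training settings are assumptions for the sake of the example, not necessarily the exact configuration used in `main_perceptron.py`.

```
# Minimal sketch of a perceptron (one dense sigmoid unit) in Keras.
# n_keywords is an assumed input dimension (one feature per lexicon keyword).
from keras.models import Sequential
from keras.layers import Dense

n_keywords = 6800
model = Sequential()
model.add(Dense(1, input_dim=n_keywords, activation='sigmoid'))
model.compile(loss='binary_crossentropy', optimizer='sgd', metrics=['accuracy'])
# model.fit(X_train, y_train, epochs=200)  # 'epochs' in Keras 2, 'nb_epoch' in Keras 1
```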

## Documentation
What is [Sentiment Analysis (Wikipedia)](https://en.wikipedia.org/wiki/Sentiment_analysis)?

Learn the basics of [Keras in 30 seconds](http://keras.io/#getting-started-30-seconds-to-keras).

## Getting Started
```
git clone https://github.com/philipperemy/Sentiment-Analysis-NLP.git
cd Sentiment-Analysis-NLP
chmod +x init.sh
./init.sh
python main_perceptron.py
```

The `init.sh` script downloads the [IMDB Sentiment Database](http://ai.stanford.edu/~amaas/data/sentiment/) from Stanford University and unzips it.
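
For reference, the download-and-unpack step roughly amounts to the Python snippet below; `init.sh` may perform it differently (e.g. with `wget` and `tar`), so treat this only as a sketch.

```
# Rough equivalent of the download-and-unpack step performed by init.sh (sketch).
import urllib.request
import tarfile

url = "http://ai.stanford.edu/~amaas/data/sentiment/aclImdb_v1.tar.gz"
urllib.request.urlretrieve(url, "aclImdb_v1.tar.gz")
with tarfile.open("aclImdb_v1.tar.gz", "r:gz") as tar:
    tar.extractall(".")  # creates the aclImdb/ directory with train/ and test/ splits
```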

We also use the list of positive and negative opinion words from the University of Illinois at Chicago, available [here](https://www.cs.uic.edu/~liub/FBS/sentiment-analysis.html#lexicon). This lexicon contains around 6,800 words, which we refer to as keywords.
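
To illustrate how such a lexicon can be turned into features, each review can be represented by which keywords it contains. The file names and helper below are hypothetical and only sketch the idea, not the repository's actual code.

```
# Hypothetical sketch: bag-of-keywords features built from the opinion lexicon.
def load_words(path):
    with open(path, encoding='latin-1') as f:
        # the lexicon files start with ';'-prefixed comment lines
        return [w.strip() for w in f if w.strip() and not w.startswith(';')]

positive_words = load_words('positive-words.txt')  # assumed file name
negative_words = load_words('negative-words.txt')  # assumed file name
keywords = positive_words + negative_words

def featurize(review_text):
    words = set(review_text.lower().split())
    # one binary feature per keyword: 1 if the keyword appears in the review
    return [1 if k in words else 0 for k in keywords]
```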

## Execution

You should see console output similar to the following.

```
Using TensorFlow backend.
=> loaded 2006 positive words
=> loaded 4783 negative words
=> processed 500 aclImdb/train/neg files.
[...]
=> processed 5000 aclImdb/train/neg files.
=> processed 500 aclImdb/train/pos files.
[...]
=> processed 5000 aclImdb/train/pos files.
I tensorflow/core/common_runtime/local_device.cc:40] Local device intra op parallelism threads: 4
I tensorflow/core/common_runtime/direct_session.cc:58] Direct session inter op parallelism threads: 4
Epoch 1/200
6666/6666 [==============================] - 22s - loss: 0.2177
[...]
```

## Results (accuracy)

Measured on 10k reviews, with a 2/3 training and 1/3 validation split (200 epochs).

These results are preliminary and could be improved substantially.
```
training= 0.803180318032
validation= 0.732913669065
```
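
For context, a 2/3 training / 1/3 validation split and accuracy evaluation could look like the snippet below, reusing the `model` from the perceptron sketch above. Variable names are hypothetical and the repository's actual split logic may differ.

```
# Hypothetical sketch of the 2/3 training / 1/3 validation split and evaluation.
import numpy as np

X = np.array(features)  # assumed: one keyword feature vector per review
y = np.array(labels)    # assumed: 1 for positive reviews, 0 for negative

perm = np.random.permutation(len(X))  # shuffle so both classes appear in each split
X, y = X[perm], y[perm]

split = int(len(X) * 2 / 3)
model.fit(X[:split], y[:split], epochs=200)
train_acc = model.evaluate(X[:split], y[:split])[1]
val_acc = model.evaluate(X[split:], y[split:])[1]
print("training=", train_acc, "validation=", val_acc)
```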

## Visualize the graph (TensorFlow feature)
Make sure the Keras backend is TensorFlow. You can check this easily with:

```
cat ~/.keras/keras.json
{"epsilon": 1e-07, "floatx": "float32", "backend": "tensorflow"}
```
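
The active backend can also be checked from Python:

```
# Prints the name of the backend Keras is currently using.
from keras import backend as K
print(K.backend())  # expected output: 'tensorflow'
```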

Finally, run the `tensorboard` command, specifying the log directory.
```
tensorboard --logdir=/Users/philipperemy/PycharmProjects/Sentiment-Analysis-NLP/deep_learning/runs/1455792487
```