https://github.com/mlh-fellowship/0.1.2-sentiment-analysis-visualization

Machine Learning Web Application. Helps to visualize a character-by-character breakdown of how sentiment analysis classifies text
bentoml keras keras-neural-networks lstm lstm-sentiment-analysis machine-learning machine-learning-algorithms sentiment-analysis-visualization visualizations


# Sentiment Analysis Visualization

[![Status](https://img.shields.io/badge/status-active-success.svg)]()
[![GitHub Issues](https://img.shields.io/github/issues/MLH-Fellowship/0.1.2-sentiment-analysis-visualization.svg)](https://github.com/MLH-Fellowship/0.1.2-sentiment-analysis-visualization/issues)
[![GitHub Pull Requests](https://img.shields.io/github/issues-pr/MLH-Fellowship/0.1.2-sentiment-analysis-visualization.svg)](https://github.com/MLH-Fellowship/0.1.2-sentiment-analysis-visualization/pulls)
[![License: MIT](https://img.shields.io/badge/License-MIT-yellow.svg)](https://opensource.org/licenses/MIT)

-------

## Pod 0.1.2
A web app that visualizes a word-by-word breakdown of how sentiment analysis classifies text.

![Frontend View](https://user-images.githubusercontent.com/23178940/83907584-858fbf00-a71a-11ea-8476-7445c0e16ffe.png)

-------

## Major goals
- [x] Research and decide on a machine learning model/architecture
- [x] Pick out 2-3 datasets we can use to train
- [x] Build a training pipeline
- [x] Train and implement the model
- [x] Serve the model using BentoML as an API
- [x] Create a web app to take in input and visualize the output

-------

## Calling the API
The service is hosted at https://sentiment-classifier-gy7t3p45oq-uc.a.run.app/.
To get a prediction, send a `POST` request to `https://sentiment-classifier-gy7t3p45oq-uc.a.run.app/predict`.

```bash
# e.g.
curl -X POST "https://sentiment-classifier-gy7t3p45oq-uc.a.run.app/predict" \
-H "accept: */*" -H "Content-Type: application/json" \
-d "{\"text\":\"Some example text.\"}"
```

Set the `Content-Type` header to `application/json` and send a JSON body in this format:
```json
{
  "text": "content"
}
```

If successful, you should get a `200 OK` status and a body along the lines of `[[0.8614905476570129], [0.7018478512763977], [0.617088258266449]]`, where each entry is the sentiment score of one word, from 0 (negative) to 1 (positive).
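The same call can be made from Python. The sketch below uses only the standard library and assumes the response has the nested-list shape shown above. The helper names (`build_request`, `pair_words_with_scores`, `predict`) are hypothetical, and pairing scores with whitespace-separated words is an assumption: the service tokenizes the text itself, so the alignment may differ for some inputs.

```python
import json
import urllib.request

PREDICT_URL = "https://sentiment-classifier-gy7t3p45oq-uc.a.run.app/predict"


def build_request(text, url=PREDICT_URL):
    """Build a POST request carrying {"text": ...} as a JSON body."""
    body = json.dumps({"text": text}).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def pair_words_with_scores(text, response_body):
    """Pair each whitespace-separated word with its score.

    Assumption: the service returns one [score] entry per word, in order.
    zip() stops at the shorter sequence if the counts differ.
    """
    scores = [row[0] for row in json.loads(response_body)]
    return list(zip(text.split(), scores))


def predict(text):
    """Send the text to the hosted endpoint and return (word, score) pairs."""
    with urllib.request.urlopen(build_request(text)) as resp:
        return pair_words_with_scores(text, resp.read().decode("utf-8"))
```

Calling `predict("Some example text.")` performs the network request; `pair_words_with_scores` can be used on its own to interpret a response body you already have.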

-------

## Training a new model
Currently, the only training pipeline implemented is for the IMDB dataset, though this may change in the future. To train a new classifier on that dataset, run:

```bash
python train.py
```
This will replace the current model in `/model`. `model.json` stores the model architecture, `weights.h5` stores trained weights, and `tokenizer.json` stores word indices.

-------

## Packaging it with BentoML
BentoML lets us serve our Keras model through an API. You can package a new API by running:

```bash
python bento_service_packager.py
> ...
> [0.07744759]
> [0.1166597 ]
> [0.18447165]
> [0.20329727]
> [0.24308157]
> [0.25030023]]
> _____
> saved model path: /Users/jzhao/bentoml/repository/SentimentClassifierService/20200604214004_F641D2
```
If you'd like to save the packaged API, copy its contents into the `bento_deploy` directory:

```bash
# replace the source with the saved model path printed by the packager
cp -r /Users/jzhao/bentoml/repository/SentimentClassifierService/20200604214004_F641D2/* bento_deploy
```

There are a few dependency nuances to be aware of before building the Docker image. To make sure the build doesn't error out, edit `bento_deploy/requirements.txt` so that it reads:

```pip
tensorflow==2.1.0
sklearn
bentoml==0.7.8
```

Then build and run the image as follows:

```bash
docker build -t bento-classifier:latest .
docker run -p 5000:5000 bento-classifier:latest
```

Then, visit `localhost:5000` to see the BentoML server!

-------

## Simple deep LSTM architecture
```python
> model.summary()
_________________________________________________________________
Layer (type)                 Output Shape              Param #
=================================================================
embedding (Embedding)        (None, 100, 64)           320000
_________________________________________________________________
lstm (LSTM)                  (None, 100, 64)           33024
_________________________________________________________________
dropout (Dropout)            (None, 100, 64)           0
_________________________________________________________________
lstm_1 (LSTM)                (None, 64)                33024
_________________________________________________________________
FC1 (Dense)                  (None, 256)               16640
_________________________________________________________________
dropout_1 (Dropout)          (None, 256)               0
_________________________________________________________________
out_layer (Dense)            (None, 1)                 257
_________________________________________________________________
activation (Activation)      (None, 1)                 0
=================================================================
Total params: 402,945
Trainable params: 402,945
Non-trainable params: 0
_________________________________________________________________
```
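The parameter counts in the summary can be checked by hand. The sketch below recomputes them with the standard formulas for embedding, LSTM, and dense layers, taking the 5k vocabulary size from the preprocessing notes and the layer widths from the output shapes. The helper function names are hypothetical and not part of the repository.

```python
def embedding_params(vocab_size, dim):
    # One dim-sized vector per vocabulary entry.
    return vocab_size * dim


def lstm_params(input_dim, units):
    # Four gates, each with input weights, recurrent weights, and a bias.
    return 4 * (input_dim * units + units * units + units)


def dense_params(input_dim, units):
    # Weight matrix plus one bias per output unit.
    return input_dim * units + units


layer_params = {
    "embedding": embedding_params(5000, 64),
    "lstm": lstm_params(64, 64),
    "lstm_1": lstm_params(64, 64),
    "FC1": dense_params(64, 256),
    "out_layer": dense_params(256, 1),
}
total_params = sum(layer_params.values())
print(total_params)  # 402945
```

Dropout and activation layers have no weights, which is why they show 0 parameters.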

-------

## Data and training process
* 85% / 15% train-test split
* dataset is balanced (25k positive, 25k negative)
* RMSProp with 1e-3 Learning Rate and early stopping with patience of 2 epochs
* preprocessing
    * to lowercase
    * removed punctuation
    * removed `<br />` tags
    * tokenized with vocab size of 5k
    * max sequence length of 100
* achieved 82.2% accuracy
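
The preprocessing steps above can be sketched in plain Python. This is an illustration only, not the code in `train.py`: the function names, the order of the steps, reserving index 0 for padding, dropping out-of-vocabulary words, and padding at the end are all assumptions.

```python
import re
import string
from collections import Counter

VOCAB_SIZE = 5000  # vocab size of 5k
MAX_LEN = 100      # max sequence length


def clean(text):
    """Lowercase, strip <br /> tags, remove punctuation, split on whitespace."""
    text = text.lower()
    # Remove tags before punctuation so "<br />" does not leave a stray "br".
    text = re.sub(r"<br\s*/?>", " ", text)
    text = text.translate(str.maketrans("", "", string.punctuation))
    return text.split()


def build_vocab(texts, vocab_size=VOCAB_SIZE):
    """Map the most frequent words to indices 1..vocab_size-1 (0 is padding)."""
    counts = Counter(word for t in texts for word in clean(t))
    most_common = counts.most_common(vocab_size - 1)
    return {word: i + 1 for i, (word, _) in enumerate(most_common)}


def encode(text, vocab, max_len=MAX_LEN):
    """Convert text to a fixed-length list of word indices."""
    ids = [vocab[w] for w in clean(text) if w in vocab]  # drop unknown words
    ids = ids[:max_len]                                   # truncate long inputs
    return ids + [0] * (max_len - len(ids))               # pad short inputs
```

For example, `encode("Great movie!", vocab)` returns a list of 100 integers whose first two entries are the indices of `great` and `movie`, followed by zeros.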