Movie Review Sentiment Analysis Using CNN and MLP
- Host: GitHub
- URL: https://github.com/reshalfahsi/movie-review-sentiment-analysis
- Owner: reshalfahsi
- Created: 2023-06-26T08:33:52.000Z (over 1 year ago)
- Default Branch: master
- Last Pushed: 2023-09-30T23:02:43.000Z (about 1 year ago)
- Last Synced: 2023-10-01T00:18:35.858Z (about 1 year ago)
- Topics: cnn-model, imdb-dataset, sentiment-analysis, text-classification
- Language: Jupyter Notebook
- Homepage:
- Size: 17.3 MB
- Stars: 0
- Watchers: 3
- Forks: 0
- Open Issues: 0
Metadata Files:
- Readme: README.md
README
# Movie Review Sentiment Analysis Using CNN and MLP
A diagram of the CNN-MLP model.
Audience reactions to a movie are often written down as reviews. These reviews can be grouped into two classes: positive and negative. A CNN-MLP model can perform sentiment analysis on movie reviews to automatically recognize a viewer's attitude toward a particular movie. The CNN extracts latent features from the text, and the MLP uses those features to carry out the classification. The CNN-MLP model is evaluated on Stanford's IMDb Movie Review dataset. On the test set, the model achieves ``85.6%`` accuracy.
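As a rough illustration of the division of labor described above, the following NumPy sketch runs one forward pass of a tiny CNN-MLP text classifier. Everything in it is an assumption made for illustration: the layer sizes, the random untrained weights, the global max pooling, and the function names `conv1d_relu` and `cnn_mlp_forward`. The actual architecture is the one defined in the notebook.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical sizes, chosen only for illustration.
VOCAB, EMBED, FILTERS, KERNEL, HIDDEN = 1000, 16, 8, 3, 4

# Random, untrained parameters (assumption: one conv layer, one hidden layer).
params = {
    "embedding": rng.normal(size=(VOCAB, EMBED)),
    "conv_w": rng.normal(size=(KERNEL, EMBED, FILTERS)) * 0.1,
    "conv_b": np.zeros(FILTERS),
    "mlp_w1": rng.normal(size=(FILTERS, HIDDEN)) * 0.1,
    "mlp_b1": np.zeros(HIDDEN),
    "mlp_w2": rng.normal(size=(HIDDEN, 1)) * 0.1,
    "mlp_b2": np.zeros(1),
}


def conv1d_relu(x, w, b):
    """Valid 1D convolution over the token axis, followed by ReLU.

    x: (tokens, embed), w: (kernel, embed, filters)
    returns: (tokens - kernel + 1, filters)
    """
    kernel = w.shape[0]
    steps = x.shape[0] - kernel + 1
    out = np.stack([
        np.tensordot(x[t:t + kernel], w, axes=([0, 1], [0, 1])) + b
        for t in range(steps)
    ])
    return np.maximum(out, 0.0)


def cnn_mlp_forward(token_ids, p):
    """Return P(positive) for one review given as an array of token ids."""
    x = p["embedding"][token_ids]                          # (tokens, embed)
    features = conv1d_relu(x, p["conv_w"], p["conv_b"])    # CNN: local n-gram features
    pooled = features.max(axis=0)                          # global max pooling
    hidden = np.maximum(pooled @ p["mlp_w1"] + p["mlp_b1"], 0.0)  # MLP hidden layer
    logit = hidden @ p["mlp_w2"] + p["mlp_b2"]             # single output unit
    return float(1.0 / (1.0 + np.exp(-logit[0])))          # sigmoid


review = np.array([12, 7, 256, 31, 4, 99])  # made-up token ids
prob = cnn_mlp_forward(review, params)
print(f"P(positive) = {prob:.3f}")
```

With untrained weights the printed probability is meaningless; the point is only the data flow from token ids through convolutional features to a single sentiment score.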
## Experiment
To run the experiment, [click here](https://github.com/reshalfahsi/movie-review-sentiment-analysis/blob/master/Movie_Review_Sentiment_Analysis_Using_CNN_and_MLP.ipynb).
## Result
### Quantitative Result
The model's performance is measured by accuracy and loss, both computed from the model's sentiment predictions on each dataset split. The loss is the cross-entropy loss, the same loss function used during training.

Dataset Split | Accuracy | Loss
------------ | ------------- | -------------
Train | 94.1% | 0.347
Validation | **96.5%** | **0.331**
Test | 85.6% | 0.454

### Accuracy and Loss Curve
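The two quantities reported per split and tracked in these curves, accuracy and cross-entropy loss, can be sketched in plain Python as follows. This assumes the binary form of cross-entropy with a single predicted probability per review and a 0.5 decision threshold; the labels and probabilities are made up, and the function names are hypothetical rather than taken from the notebook.

```python
import math


def binary_cross_entropy(y_true, y_prob, eps=1e-7):
    """Mean binary cross-entropy between 0/1 labels and predicted P(positive)."""
    total = 0.0
    for y, p in zip(y_true, y_prob):
        p = min(max(p, eps), 1.0 - eps)  # clip to avoid log(0)
        total += -(y * math.log(p) + (1 - y) * math.log(1.0 - p))
    return total / len(y_true)


def accuracy(y_true, y_prob, threshold=0.5):
    """Fraction of reviews whose thresholded prediction matches the label."""
    hits = sum(int(p >= threshold) == y for y, p in zip(y_true, y_prob))
    return hits / len(y_true)


labels = [1, 0, 1, 0]           # made-up labels: 1 = positive, 0 = negative
probs = [0.9, 0.2, 0.4, 0.6]    # made-up model outputs

print(f"accuracy = {accuracy(labels, probs):.2f}")
print(f"loss     = {binary_cross_entropy(labels, probs):.3f}")
```

Accuracy only counts whether each prediction lands on the right side of the threshold, while the loss also penalizes confident mistakes, which is why the two curves need not move in lockstep.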
Accuracy curve on the train set and the validation set.
Loss curve on the train set and the validation set.

### CNN Visualization
To understand what the CNN looks at when deciding whether a review is positive or negative, the Grad-CAM and TAHV (Text Attention Heatmap Visualization) techniques are employed.
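The idea behind Grad-CAM can be sketched for text in a few lines: weight each channel of a convolutional feature map by the average gradient of the class score with respect to that channel, sum over channels, and keep the positive part. The sketch below is a simplified assumption, not the notebook's code: it supposes a head made of global average pooling followed by one linear layer, so the gradient can be written by hand instead of using autograd. The function name `grad_cam_1d` and all numbers are made up.

```python
import numpy as np


def grad_cam_1d(feature_map, class_weights):
    """Grad-CAM over token positions for a global-average-pool + linear head.

    feature_map:   (positions, channels) activations A of a conv layer.
    class_weights: (channels,) weights of the linear layer producing the
                   class score  s = sum_k w_k * mean_t A[t, k].
    """
    positions = feature_map.shape[0]
    # For this head, ds/dA[t, k] = w_k / positions (derived by hand).
    grads = np.tile(class_weights / positions, (positions, 1))
    alpha = grads.mean(axis=0)                    # channel importance
    cam = np.maximum(feature_map @ alpha, 0.0)    # ReLU of the weighted sum
    peak = cam.max()
    return cam / peak if peak > 0 else cam        # scale to [0, 1] for display


# Made-up activations: 5 positions, 3 channels.
A = np.array([
    [0.0, 0.1, 0.0],
    [0.9, 0.0, 0.2],
    [0.3, 0.0, 0.8],
    [0.0, 0.5, 0.0],
    [0.1, 0.0, 0.0],
])
w = np.array([1.0, -1.0, 0.5])  # made-up weights for the "positive" score

heatmap = grad_cam_1d(A, w)
for position, score in enumerate(heatmap):
    print(position, round(float(score), 2))
```

Each heatmap entry corresponds to a position in the feature map, that is, a window of neighboring tokens. A tool such as TAHV can then render scores like these as colored highlights over the review text.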
#### Positive Review
Visualization of the first layer of CNN on the positive review.
Visualization of the second layer of CNN on the positive review.
Visualization of the third layer of CNN on the positive review.
#### Negative Review
Visualization of the first layer of CNN on the negative review.
Visualization of the second layer of CNN on the negative review.
Visualization of the third layer of CNN on the negative review.
## Credit
- [Text Classification from Scratch](https://keras.io/examples/nlp/text_classification_from_scratch)
- [Stanford's IMDb Movie Review](https://ai.stanford.edu/~amaas/data/sentiment/)
- [VLC University of Southampton GradCAM](https://colab.research.google.com/github/ecs-vlc/fmix/blob/master/notebooks/grad_cam.ipynb)
- [University of Toronto GradCAM](https://colab.research.google.com/github/csc413-uoft/2021/blob/master/assets/tutorials/tut04_cnn.ipynb)
- [TAHV: Text Attention Heatmap Visualization](https://github.com/jiesutd/Text-Attention-Heatmap-Visualization)