https://github.com/laudebugs/sentiment-analysis-on-movie-reviews

For the final project of the NLP class, this project will consider performing sentiment analysis on movie reviews

machine-learning natural-language-processing python sentiment-analysis stanford-corenlp

Host: GitHub
URL: https://github.com/laudebugs/sentiment-analysis-on-movie-reviews
Owner: laudebugs
Created: 2019-12-01T20:02:38.000Z (almost 5 years ago)
Default Branch: master
Last Pushed: 2023-12-15T14:29:25.000Z (11 months ago)
Last Synced: 2024-04-17T04:58:48.173Z (7 months ago)
Topics: machine-learning, natural-language-processing, python, sentiment-analysis, stanford-corenlp
Language: Java
Homepage:
Size: 2.95 MB
Stars: 0
Watchers: 2
Forks: 0
Open Issues: 1
Metadata Files:
- Readme: README.md

README

# 🧮Sentiment Analysis on Movie Reviews

**[a challenge on Kaggle](https://www.kaggle.com/c/sentiment-analysis-on-movie-reviews)**

## Project Brief

We are developing a sentiment analysis classifier that labels each sentence on a five-point scale: `negative, slightly negative, neutral, slightly positive, positive`.

The current Naive Bayes classifier, found in the `baselines` directory, performs a simple classification based on whether or not words with a particular sentiment appear in the sentence.
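To illustrate the idea of classifying by word presence, here is a minimal sketch of a presence-based Naive Bayes classifier. It is not the code in `baselines/baselineNB.py`; the function names `train` and `predict`, the whitespace tokenization, and the add-one smoothing are all assumptions made for this example. For simplicity it scores only the words that appear in the sentence and ignores absent ones.

```python
import math
from collections import Counter, defaultdict


def train(examples):
    """Count, per label, how many sentences contain each word.

    `examples` is a list of (sentence, label) pairs (hypothetical input format).
    """
    label_counts = Counter()
    word_counts = defaultdict(Counter)  # label -> word -> sentences containing it
    vocab = set()
    for sentence, label in examples:
        words = set(sentence.lower().split())  # presence only, duplicates dropped
        label_counts[label] += 1
        word_counts[label].update(words)
        vocab |= words
    return label_counts, word_counts, vocab


def predict(sentence, model):
    """Return the label with the highest log-probability for the sentence."""
    label_counts, word_counts, vocab = model
    total = sum(label_counts.values())
    words = set(sentence.lower().split()) & vocab  # skip words never seen in training
    best_label, best_score = None, -math.inf
    for label, count in label_counts.items():
        score = math.log(count / total)  # log prior
        for word in words:
            # Add-one smoothed estimate of P(word present | label)
            score += math.log((word_counts[label][word] + 1) / (count + 2))
        if score > best_score:
            best_label, best_score = label, score
    return best_label
```

A toy usage example, with made-up training sentences:

```python
model = train([
    ("a wonderful moving film", "positive"),
    ("a dull boring film", "negative"),
])
print(predict("wonderful film", model))
```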

To run the Naive Bayes Classifier on the development set:

```bash
# this will output a file in the folder outputs called NB_file_output.csv
python3 baselines/baselineNB.py dataset/development.tsv
```

To evaluate the algorithm against the development set:

```bash
python3 evaluation/evaluate_f_score.py outputs/NB_file_output.tsv dataset/devanskey.tsv
```
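For reference, the two metrics reported in this README can be computed as in the sketch below. This is not the code in `evaluation/evaluate_f_score.py`; in particular, macro-averaging over labels is an assumption, since the averaging used by that script is not stated here. The function names `accuracy` and `macro_f_score` are hypothetical, and both take two equal-length lists of labels.

```python
def accuracy(predicted, gold):
    """Fraction of predictions that match the gold labels."""
    correct = sum(p == g for p, g in zip(predicted, gold))
    return correct / len(gold)


def macro_f_score(predicted, gold):
    """Unweighted mean of the per-label F1 scores (macro-averaging is assumed)."""
    pairs = list(zip(predicted, gold))
    f_scores = []
    for label in set(gold) | set(predicted):
        tp = sum(p == label and g == label for p, g in pairs)
        fp = sum(p == label and g != label for p, g in pairs)
        fn = sum(p != label and g == label for p, g in pairs)
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        if precision + recall:
            f_scores.append(2 * precision * recall / (precision + recall))
        else:
            f_scores.append(0.0)
    return sum(f_scores) / len(f_scores)
```

Because macro-averaging weights every label equally, a classifier that mostly predicts the most frequent label can score a reasonable accuracy while its F-score stays low.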

## Results

The baseline algorithm currently achieves an F-score of 28.41% on the development set and an accuracy of 51.24% on the test set, as evaluated on Kaggle.

![Naive Baseline results](evaluation/NaiveBayesBaselineResults.jpg)

## TODO

- [ ] Add more features to the Naive Bayes classifier and test them
- [ ] Combine the baseline algorithm with others to improve the score