https://github.com/laudebugs/sentiment-analysis-on-movie-reviews
For the final project of the NLP class, this project will consider performing sentiment analysis on movie reviews
- Host: GitHub
- URL: https://github.com/laudebugs/sentiment-analysis-on-movie-reviews
- Owner: laudebugs
- Created: 2019-12-01T20:02:38.000Z (almost 5 years ago)
- Default Branch: master
- Last Pushed: 2023-12-15T14:29:25.000Z (11 months ago)
- Last Synced: 2024-04-17T04:58:48.173Z (7 months ago)
- Topics: machine-learning, natural-language-processing, python, sentiment-analysis, stanford-corenlp
- Language: Java
- Homepage:
- Size: 2.95 MB
- Stars: 0
- Watchers: 2
- Forks: 0
- Open Issues: 1
Metadata Files:
- Readme: README.md
# 🧮Sentiment Analysis on Movie Reviews
**[a challenge on Kaggle](https://www.kaggle.com/c/sentiment-analysis-on-movie-reviews)**
## Project Brief
We are developing a sentiment analysis classifier that labels each sentence on a five-point scale: `negative, slightly negative, neutral, slightly positive, positive`.
The current Naive Bayes classifier, found in the `baselines` directory, performs a simple classification based on whether or not words with a particular sentiment appear in the sentence.
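To illustrate the general idea of a presence-based Naive Bayes classifier, here is a minimal sketch. It is not the code in `baselines/baselineNB.py`: the function names `train_nb` and `predict_nb`, the whitespace tokenization, and the integer labels (0 for negative through 4 for positive) are all assumptions made for this example.

```python
import math
from collections import Counter, defaultdict


def train_nb(examples):
    """Train on (text, label) pairs. Hypothetical helper, not the repo's API."""
    label_counts = Counter()            # label -> number of sentences
    word_counts = defaultdict(Counter)  # label -> word -> sentences containing it
    vocab = set()
    for text, label in examples:
        label_counts[label] += 1
        words = set(text.lower().split())  # presence only, not frequency
        vocab.update(words)
        for word in words:
            word_counts[label][word] += 1
    return {"labels": label_counts, "words": word_counts, "vocab": vocab}


def predict_nb(model, text):
    """Return the label with the highest log-probability for the text."""
    total = sum(model["labels"].values())
    # Words never seen in training are ignored.
    words = set(text.lower().split()) & model["vocab"]
    best_label, best_score = None, -math.inf
    for label, n in model["labels"].items():
        score = math.log(n / total)  # log prior
        for word in words:
            # Add-one smoothing over the two outcomes (present / absent).
            # Simplification: only words present in the sentence contribute.
            score += math.log((model["words"][label][word] + 1) / (n + 2))
        if score > best_score:
            best_label, best_score = label, score
    return best_label


if __name__ == "__main__":
    toy = [
        ("a great wonderful film", 4),
        ("great fun", 4),
        ("a terrible boring film", 0),
        ("boring mess", 0),
    ]
    model = train_nb(toy)
    print(predict_nb(model, "great film"))
```

Scores are summed as log-probabilities so that the product of many small probabilities does not underflow to zero.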
To run the Naive Bayes Classifier on the development set:
```bash
# this will output a file in the folder outputs called NB_file_output.csv
python3 baselines/baselineNB.py dataset/development.tsv
```
To evaluate the algorithm against the development set:
```bash
python3 evaluation/evaluate_f_score.py outputs/NB_file_output.tsv dataset/devanskey.tsv
```
## Results
Current results show that the baseline algorithm achieves an F-score of 28.41% on the development set and an accuracy of 51.24% on the test set, as evaluated on Kaggle.
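For readers unfamiliar with the two metrics, the sketch below shows one common way to compute them. It assumes a macro-averaged F-score over the sentiment classes. The averaging used by `evaluation/evaluate_f_score.py` is not stated here, and the function names `accuracy` and `macro_f_score` are hypothetical.

```python
def accuracy(gold, pred):
    """Fraction of predictions that match the gold labels."""
    return sum(g == p for g, p in zip(gold, pred)) / len(gold)


def macro_f_score(gold, pred):
    """Unweighted mean of per-class F1 scores (macro averaging, an assumption)."""
    labels = sorted(set(gold) | set(pred))
    f_scores = []
    for label in labels:
        tp = sum(g == label and p == label for g, p in zip(gold, pred))
        fp = sum(g != label and p == label for g, p in zip(gold, pred))
        fn = sum(g == label and p != label for g, p in zip(gold, pred))
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        if precision + recall:
            f_scores.append(2 * precision * recall / (precision + recall))
        else:
            f_scores.append(0.0)
    return sum(f_scores) / len(f_scores)


if __name__ == "__main__":
    gold = [0, 0, 1, 1]
    pred = [0, 1, 1, 1]
    print(accuracy(gold, pred), macro_f_score(gold, pred))
```

Macro averaging weights every class equally, so a classifier that mostly predicts a frequent class can score well on accuracy while scoring much lower on F-score.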
![Naive Baseline results](evaluation/NaiveBayesBaselineResults.jpg)
## TODO
- [ ] Add and test more features to the Naive Bayes Classifier
- [ ] Combine the baseline algorithm with others to improve score