https://github.com/amber-abuah/amazon-rating-predictor

MultinomialNB classifier for predicting Amazon review ratings.

beautifulsoup gradio imblearn machine-learning ml naive-bayes-classifier nlp nltk pandas scikit-learn sentiment-analysis sentiment-classification tf-idf

Host: GitHub
URL: https://github.com/amber-abuah/amazon-rating-predictor
Owner: Amber-Abuah
Created: 2024-08-11T20:51:49.000Z (11 months ago)
Default Branch: main
Last Pushed: 2024-08-11T21:43:12.000Z (11 months ago)
Last Synced: 2025-04-07T14:47:20.548Z (3 months ago)
Topics: beautifulsoup, gradio, imblearn, machine-learning, ml, naive-bayes-classifier, nlp, nltk, pandas, scikit-learn, sentiment-analysis, sentiment-classification, tf-idf
Language: Python
Homepage:
Size: 578 KB
Stars: 0
Watchers: 1
Forks: 0
Open Issues: 0
Metadata Files:
- Readme: README.md

README

# Amazon Rating Predictor
An application that predicts Amazon ratings from live-scraped text reviews using a Multinomial Naive Bayes classifier. Each review is preprocessed, converted into a TF-IDF vector, and then assigned one of five rating classes: {1, 2, 3, 4, 5}.
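
The vectorize-then-classify step described above can be sketched with scikit-learn as follows. This is a minimal illustration, not the repository's actual code: the toy reviews, the `stop_words="english"` setting, and the variable names are assumptions made for the example, and the project's own NLTK preprocessing is not reproduced here.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.naive_bayes import MultinomialNB
from sklearn.pipeline import make_pipeline

# Toy reviews and star ratings, invented for illustration only.
reviews = [
    "terrible product, stopped working after a week",
    "poor quality, would not buy again",
    "average item, works but feels slow",
    "good value and fast delivery",
    "excellent product, works perfectly, highly recommend",
]
ratings = [1, 2, 3, 4, 5]

# TF-IDF turns each review into a sparse, non-negative feature vector,
# which is the kind of input MultinomialNB expects.
model = make_pipeline(
    TfidfVectorizer(lowercase=True, stop_words="english"),
    MultinomialNB(),
)
model.fit(reviews, ratings)

new_review = ["excellent, highly recommend"]
predicted = model.predict(new_review)            # one rating in {1, ..., 5}
probabilities = model.predict_proba(new_review)  # one probability per class
print(predicted, probabilities.round(3))
```

Wrapping both steps in one pipeline means the same vocabulary and IDF weights learned during training are reused when a newly scraped review is scored.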

### Handling Unbalanced Data
![](https://github.com/Amber-Abuah/Amazon-Rating-Predictor/blob/main/RatingDistribution.jpg)
The dataset is heavily imbalanced: the large majority of reviews are 5-star. Because of this, the model initially predicted 5 stars for every review, regardless of its text. To fix this, SMOTEENN (Synthetic Minority Over-sampling Technique combined with Edited Nearest Neighbours) was applied. It generates synthetic samples for underrepresented classes, then removes samples whose label disagrees with the labels of their nearest neighbours. After applying this technique, the model's accuracy increased to 91.46%.

Gradio Deployment: https://huggingface.co/spaces/sweetfelinity/AmazonRatingPredictor

### Libraries Used
BeautifulSoup, NLTK, Gradio, Scikit-Learn, Pandas, Imblearn

Dataset: amazon_reviews.csv from https://www.kaggle.com/datasets/tarkkaanko/amazon
