https://github.com/manulthanura/reviewclassifier
ReviewClassifier is an end-to-end machine learning project that predicts whether an input product review is positive or negative.
- Host: GitHub
- URL: https://github.com/manulthanura/reviewclassifier
- Owner: manulthanura
- License: mit
- Created: 2024-06-10T10:06:12.000Z (about 2 years ago)
- Default Branch: main
- Last Pushed: 2025-04-14T13:22:26.000Z (about 1 year ago)
- Last Synced: 2025-04-15T14:13:36.585Z (about 1 year ago)
- Topics: bootstrap5, classification, end-to-end, flask, logistic-regression, machine-learning, python, review, semantic-segmentation
- Language: Jupyter Notebook
- Homepage: https://reviewclassifier.azurewebsites.net
- Size: 1.89 MB
- Stars: 2
- Watchers: 1
- Forks: 0
- Open Issues: 1
Metadata Files:
- Readme: README.md
- License: LICENSE
README
# ReviewClassifier
ReviewClassifier is a web application that classifies reviews into positive and negative categories. The application uses a machine learning model to classify the reviews. The model is trained using a dataset of reviews that are labeled as positive and negative. The model is then used to predict the sentiment of new reviews. The application is built using Flask, a Python web framework, and deployed on Azure, a cloud platform as a service. The application is designed to be simple and easy to use. Users can enter a review in a text box and click a button to classify the review. The application will then display the predicted sentiment of the review.

   
## Problem statement
The aim of this project is to build an end-to-end machine learning project that predicts whether an input product review is positive or negative.
## Installation
Create a virtual environment and install the required packages using the following commands:
- Create a virtual environment: `python -m venv env`
- Activate the virtual environment: `source env/bin/activate` (on Windows: `env\Scripts\activate`)
- Install the required packages: `pip install -r requirements.txt`
## Download dataset
- View the data: https://www.kaggle.com/datasets/dineshpiyasamara/sentiment-analysis-dataset
- Dataset identifier: `dineshpiyasamara/sentiment-analysis-dataset`
- Download with the Kaggle CLI: `kaggle datasets download -d dineshpiyasamara/sentiment-analysis-dataset`
## Data Preprocessing
Text data can be preprocessed with several techniques, such as removing stopwords, removing punctuation, and lemmatization.
In this project I use the following techniques:
- Remove stopwords
- Remove punctuation
- Stemming
- Remove special characters
- Remove numbers
- Remove links
- Convert to lowercase
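
The steps listed above could be sketched roughly as follows. This is a minimal, standard-library-only illustration, not the project's actual code: the `STOPWORDS` set is a tiny hypothetical subset, and `simple_stem` is a crude suffix stripper standing in for a real stemmer (the README does not say which stemmer or stopword list is used).

```python
import re
import string

# Hypothetical, deliberately small stopword subset for illustration only.
STOPWORDS = {"a", "an", "the", "is", "it", "this", "and", "i", "to", "of", "was"}


def simple_stem(word):
    """Crude suffix stripping; a stand-in for a real stemming algorithm."""
    for suffix in ("ing", "ed", "ly", "s"):
        # Only strip when at least three characters remain.
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word


def preprocess(text):
    """Return a list of cleaned, stemmed tokens for one review."""
    text = text.lower()                                 # convert to lowercase
    text = re.sub(r"https?://\S+|www\.\S+", " ", text)  # remove links first
    text = re.sub(r"\d+", " ", text)                    # remove numbers
    # Replace every ASCII punctuation character with a space.
    table = str.maketrans(string.punctuation, " " * len(string.punctuation))
    text = text.translate(table)
    text = re.sub(r"[^a-z\s]", " ", text)               # remove remaining special characters
    tokens = [t for t in text.split() if t not in STOPWORDS]  # remove stopwords
    return [simple_stem(t) for t in tokens]             # stemming


print(preprocess("This product is AMAZING!!! I loved it 100% https://example.com"))
```

Links are removed before punctuation so that the URL pattern still matches; stripping punctuation first would break `https://` apart.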
## Vocabulary Building
In this step we will build a vocabulary from the dataset and vectorize the text data.
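
As a rough illustration of this step, the sketch below builds a token-to-index vocabulary and turns a tokenized review into a count vector. It assumes a bag-of-words representation, which the README does not confirm; the function names and the `min_count` parameter are hypothetical.

```python
from collections import Counter


def build_vocabulary(tokenized_reviews, min_count=1):
    """Map each token seen at least min_count times to a column index."""
    counts = Counter(token for review in tokenized_reviews for token in review)
    kept = sorted(token for token, n in counts.items() if n >= min_count)
    return {token: index for index, token in enumerate(kept)}


def vectorize(tokens, vocabulary):
    """Count how often each vocabulary token occurs; unknown tokens are ignored."""
    vector = [0] * len(vocabulary)
    for token in tokens:
        index = vocabulary.get(token)
        if index is not None:
            vector[index] += 1
    return vector


reviews = [["good", "phone"], ["bad", "phone"]]
vocabulary = build_vocabulary(reviews)
print(vocabulary)
print(vectorize(["good", "good", "unknown"], vocabulary))
```

Raising `min_count` drops rare tokens, which keeps the feature vectors smaller at the cost of ignoring uncommon words.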
## Model Building
Before building the model, we first need to check whether the data is balanced. If it is not, we can balance it using the `imbalanced-learn` library.
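
To make the balance check concrete, here is a small sketch that counts labels and, if needed, duplicates minority-class samples at random. It does not call `imbalanced-learn`; plain random oversampling is used here as a stand-in, and the README does not say which resampling method the project applies. The function name and the toy data are hypothetical.

```python
import random
from collections import Counter


def random_oversample(samples, labels, seed=0):
    """Duplicate random minority-class samples until every class matches the largest one."""
    rng = random.Random(seed)
    counts = Counter(labels)
    target = max(counts.values())
    out_samples, out_labels = list(samples), list(labels)
    for label, count in counts.items():
        pool = [s for s, l in zip(samples, labels) if l == label]
        for _ in range(target - count):
            out_samples.append(rng.choice(pool))
            out_labels.append(label)
    return out_samples, out_labels


samples = ["great", "love it", "works well", "broke fast"]
labels = [1, 1, 1, 0]
print(Counter(labels))                 # class counts before balancing
balanced_samples, balanced_labels = random_oversample(samples, labels)
print(Counter(balanced_labels))        # class counts after balancing
```

Resampling should normally be applied to the training split only, so that duplicated samples do not leak into the evaluation data.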
Since this is a binary classification problem we can use several algorithms like:
- Logistic Regression
- Random Forest
- Decision Tree
- Support Vector Machine
- Naive Bayes
### Model Evaluation
To evaluate the model we can use several metrics like:
- Accuracy
- Precision
- Recall
- F1 Score
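
For reference, the four metrics above can be computed from the confusion-matrix counts as in the sketch below. The function name and the toy labels are illustrative; treating label `1` as the positive class is an assumption.

```python
def binary_metrics(y_true, y_pred, positive=1):
    """Compute accuracy, precision, recall, and F1 for a binary classifier."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    tn = sum(1 for t, p in pairs if t != positive and p != positive)

    accuracy = (tp + tn) / len(pairs) if pairs else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (2 * precision * recall / (precision + recall)) if (precision + recall) else 0.0
    return {"accuracy": accuracy, "precision": precision, "recall": recall, "f1": f1}


print(binary_metrics([1, 1, 0, 0, 1], [1, 0, 0, 1, 1]))
```

On imbalanced data, accuracy alone can look good while the minority class is mostly misclassified, which is why precision, recall, and F1 are worth reporting alongside it.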
### Model Deployment
I tried all of the above algorithms and compared the accuracy of each model. I chose logistic regression to move forward with because it gave the best accuracy.
### Prediction Pipeline
After building the model, we need to create a prediction pipeline that predicts whether an input review is positive or negative.
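
A prediction pipeline of this kind might look like the sketch below: tokenize the review, turn it into a count vector, and apply the logistic function to a weighted sum. Everything here is hypothetical (the toy `VOCABULARY`, `WEIGHTS`, and `BIAS`, the simplified `tokenize`, and the 0.5 threshold); it is not the project's trained model or its `handler.py`.

```python
import math
import re

# Hypothetical toy parameters for illustration, not the project's trained model.
VOCABULARY = {"bad": 0, "good": 1, "great": 2, "terrible": 3}
WEIGHTS = [-1.5, 1.2, 2.0, -2.2]
BIAS = 0.0


def tokenize(text):
    """Simplified preprocessing: lowercase and keep alphabetic runs only."""
    return re.findall(r"[a-z]+", text.lower())


def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def predict_sentiment(text, vocabulary=VOCABULARY, weights=WEIGHTS, bias=BIAS, threshold=0.5):
    """Return (label, probability of the positive class) for one review."""
    features = [0] * len(vocabulary)
    for token in tokenize(text):
        if token in vocabulary:
            features[vocabulary[token]] += 1
    score = bias + sum(w * x for w, x in zip(weights, features))
    probability = sigmoid(score)
    label = "positive" if probability >= threshold else "negative"
    return label, probability


print(predict_sentiment("Great phone, really good"))
print(predict_sentiment("Terrible and bad"))
```

The important design point is that the pipeline must apply the same preprocessing and vocabulary at prediction time as were used during training; otherwise the feature columns no longer line up with the learned weights.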
## Build Website
To build the website we can use Flask or Django. In this project, I use Flask to build the website and Bootstrap for styling.
## Deployment
Before deploying the project, make sure to remove the following libraries from the `requirements.txt` file:
`pywin32==306` and `pywinpty==2.0.13`
These libraries are Windows-specific and are not required for deployment; on a Linux host they will typically fail to install. Most cloud platforms, such as Heroku, Azure, and AWS, are Linux-based. For the deployment, I use Azure.
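
If you prefer not to edit the file by hand, a small helper along these lines could filter the Windows-only entries out. This is a sketch under assumptions: the function name is hypothetical, and the `Flask==3.0.0` and `numpy` lines in the example are made-up placeholders rather than the project's real pins.

```python
import re

# Packages named in this README as Windows-only.
WINDOWS_ONLY = {"pywin32", "pywinpty"}


def strip_windows_only(requirements_text, blocked=WINDOWS_ONLY):
    """Return the requirements text without lines for blocked packages."""
    kept = []
    for line in requirements_text.splitlines():
        # The package name is everything before a version specifier, extra, or marker.
        name = re.split(r"[=<>!~\[; ]", line.strip(), maxsplit=1)[0].lower()
        if name in blocked:
            continue
        kept.append(line)
    return "\n".join(kept) + ("\n" if kept else "")


example = "Flask==3.0.0\npywin32==306\npywinpty==2.0.13\nnumpy\n"
print(strip_windows_only(example), end="")
```

To apply it to the real file, read `requirements.txt`, pass its contents through `strip_windows_only`, and write the result back. An alternative that keeps a single file for both platforms is a pip environment marker, for example `pywin32==306; sys_platform == "win32"`.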
## Useful Commands
- To install libraries: `pip install -r requirements.txt`
- View installed libraries: `pip list`
- Open jupyter notebook: `jupyter notebook` or `python -m notebook`
- To create a requirements file: `pip freeze > requirements.txt`
- To run the flask app: `python app.py`
- Check python version: `python --version`
## File Structure
- `app.py`: Main file to run the flask app
- `handler.py`: File to handle the prediction
- `logger.py`: File to log the information
- `templates/`: Folder contains the html files
- `static/`: Folder contains the css and js files
- `notebooks/`: Folder contains the jupyter notebooks
- `artifacts/`: Folder contains the dataset
## Conclusion
This is an end-to-end machine learning project that predicts whether an input review is positive or negative. The project is deployed on Azure and can be accessed using the following link: https://reviewclassifier.azurewebsites.net/
### Give a star if you like the project. Follow me for more updates!