https://github.com/mtgsoftworks/sentiment-analysis

Sentiment analysis project using NLP to classify Amazon reviews of products based on customer sentiment
https://github.com/mtgsoftworks/sentiment-analysis

matplotlib nltk pandas python3 sklearn textblob-sentiment-analysis wordcloud

Last synced: 3 months ago
JSON representation

Sentiment analysis project using NLP to classify Amazon reviews of products based on customer sentiment

Host: GitHub
URL: https://github.com/mtgsoftworks/sentiment-analysis
Owner: mtgsoftworks
License: mit
Created: 2024-06-03T22:20:25.000Z (about 1 year ago)
Default Branch: main
Last Pushed: 2024-11-04T21:27:18.000Z (8 months ago)
Last Synced: 2025-01-18T02:19:15.630Z (5 months ago)
Topics: matplotlib, nltk, pandas, python3, sklearn, textblob-sentiment-analysis, wordcloud
Language: Python
Homepage:
Size: 473 KB
Stars: 0
Watchers: 1
Forks: 0
Open Issues: 0
Metadata Files:
- Readme: README.md
- License: LICENSE

Awesome Lists containing this project

README

        # Sentiment Analysis for Amazon Reviews

## 1. Introduction

This project involves sentiment analysis of Amazon reviews, focusing on home textiles and casual clothing. The objective is to classify reviews by sentiment, allowing for enhanced product insights and sentiment prediction for future reviews.

---

## 2. Business Problem

By analyzing customer reviews, the company aims to:

- Improve product features and customer satisfaction.

- Increase sales by addressing customer feedback and identifying areas of improvement.

---

## 3. Dataset Description

The dataset, provided in an Excel file (`amazon.xlsx`), contains reviews for specific product groups with the following fields:

- **Review**: Content of the review.

- **Title**: Short title or comment for the review.

- **Helpful**: Number of users who found the review helpful.

- **Star**: Star rating given to the product.

---

## 4. Project Workflow

The project includes the following key steps:

1. **Text Preprocessing**: Prepare text data for analysis by cleaning and structuring.

2. **Text Visualization**: Visualize word frequency in reviews to identify common themes.

3. **Sentiment Modeling**: Label and classify reviews based on sentiment.

4. **Model Evaluation**: Evaluate the model's performance.

---

## 5. Detailed Steps

### 5.1. Import Necessary Libraries

The following libraries are used in this project:

- `pandas`

- `numpy`

- `re`

- `string`

- `sklearn`

- `matplotlib`

- `seaborn`

- `nltk`

### 5.2. Text Preprocessing

To prepare the `Review` text data for analysis, the following preprocessing steps are applied:

1. Convert text to lowercase.

2. Remove punctuation.

3. Remove numbers.

4. Remove stopwords.

5. Lemmatize words.

---

### 5.3. Text Visualization

Calculate the frequency of words in the processed reviews and create a bar plot to display the top 20 most common words.

---

### 5.4. Sentiment Modeling

1. **Label Reviews**: Label each review as positive, neutral, or negative based on its star rating.

2. **Data Splitting**: Divide the data into training and testing sets.

3. **Vectorization**: Vectorize the text data.

4. **Model Training**: Train a logistic regression model to classify sentiment.

---

### 5.5. Model Evaluation

Evaluate the model’s performance by:

- Calculating accuracy.

- Printing a classification report.

- Displaying a confusion matrix.

---

## 6. Conclusion

This project delivers insights into customer sentiment on Amazon, providing actionable feedback for product improvement. By preprocessing text data, visualizing key words, building a sentiment classification model, and evaluating its performance, the company gains a valuable tool for predicting sentiment in future reviews. The logistic regression model serves as a robust predictor based on historical data patterns.

---

ecosyste.ms

Data

Tools

Indexes

Applications

Experiments

Awesome

https://github.com/mtgsoftworks/sentiment-analysis

Awesome Lists containing this project

README