https://github.com/travelxml/amazon-product-reviews-sentiment-analysis-in-python
NLP: Amazon Product Reviews Sentiment Analysis in Python
ai matplotlib ml nlp nlp-machine-learning nltk numpy pandas pandas-python python3 sentiment-analysis sentiment-classification wordcloud wordcloud-generator wordcloud-visualization
- Host: GitHub
- URL: https://github.com/travelxml/amazon-product-reviews-sentiment-analysis-in-python
- Owner: TravelXML
- Created: 2024-08-20T09:14:50.000Z (almost 2 years ago)
- Default Branch: master
- Last Pushed: 2024-08-20T09:56:29.000Z (almost 2 years ago)
- Last Synced: 2025-01-21T01:11:11.398Z (over 1 year ago)
- Topics: ai, matplotlib, ml, nlp, nlp-machine-learning, nltk, numpy, pandas, pandas-python, python3, sentiment-analysis, sentiment-classification, wordcloud, wordcloud-generator, wordcloud-visualization
- Language: Jupyter Notebook
- Homepage: https://apige.medium.com/apache-spark-and-pyspark-on-databricks-a-comprehensive-guide-to-ipl-data-analysis-d0d5e02c861c
- Size: 4.27 MB
- Stars: 0
- Watchers: 1
- Forks: 1
- Open Issues: 0
Metadata Files:
- Readme: README.md
README
# NLP: Amazon Product Reviews Sentiment Analysis in Python

This project applies machine learning and natural language processing (NLP) to Amazon product reviews, classifying each review as positive or negative to show how customers feel about a product.
## 🚀 Project Overview
This repository provides a step-by-step guide to performing sentiment analysis on Amazon product reviews. The project uses a Logistic Regression model trained on pre-processed review text to predict whether a review is positive or negative.
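
As a rough illustration of that approach, the sketch below trains a Logistic Regression classifier with scikit-learn on a handful of made-up reviews. The TF-IDF vectorizer, the toy data, and every variable name here are assumptions for illustration; the notebook may turn text into features differently.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

# Toy, made-up reviews (not from the project's dataset); 1 = positive, 0 = negative.
reviews = [
    "great product works perfectly",
    "excellent quality love it",
    "really happy with this purchase",
    "terrible product broke quickly",
    "awful quality waste of money",
    "very disappointed would not recommend",
]
labels = [1, 1, 1, 0, 0, 0]

# Assumed feature extraction: TF-IDF weights fed into Logistic Regression.
model = make_pipeline(TfidfVectorizer(), LogisticRegression(max_iter=1000))
model.fit(reviews, labels)

# Predicting on the training texts only demonstrates the call shape.
predictions = model.predict(reviews)

# Probability that an unseen review belongs to the positive class.
positive_probability = model.predict_proba(["love the quality"])[0][1]
```

Wrapping the vectorizer and classifier in one pipeline keeps the same text transformation applied at training and prediction time.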
### Key Features:
- **Data Preprocessing**: Clean and prepare raw review text for analysis using Python libraries like `nltk` and `pandas`.
- **Model Training**: Train a Logistic Regression model to classify the sentiment of reviews.
- **Visualization**: Generate word clouds and confusion matrices to visualize the distribution of sentiments and model performance.
- **Evaluation**: Assess model performance with metrics such as accuracy score and a confusion matrix.
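
The preprocessing step listed above could look something like the following minimal sketch. The `clean_review` helper and the tiny `STOPWORDS` set are hypothetical stand-ins; in practice the English list from `nltk.corpus.stopwords` would be a fuller choice, and the notebook's actual cleaning rules may differ.

```python
import re

# Tiny stand-in stopword set (an assumption for this sketch).
STOPWORDS = {"a", "an", "and", "the", "is", "it", "this", "to", "of", "i", "was"}

def clean_review(text: str) -> str:
    """Lowercase, strip HTML tags and non-letters, and drop stopwords."""
    text = text.lower()
    text = re.sub(r"<[^>]+>", " ", text)   # remove HTML tags such as <br />
    text = re.sub(r"[^a-z\s]", " ", text)  # keep only letters and whitespace
    tokens = [tok for tok in text.split() if tok not in STOPWORDS]
    return " ".join(tokens)

cleaned = clean_review("This is GREAT!<br />I love it, 10/10.")
print(cleaned)  # great love
```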
## 📂 Repository Structure
- **`az_senti_analysis.ipynb`**: The Jupyter Notebook that contains the full workflow, from data preprocessing to model evaluation.
- **`data/`**: Directory to store the Amazon review dataset.
- **`requirements.txt`**: List of Python libraries required to run the project.
## 🛠️ Installation
### Prerequisites
Make sure you have Python 3.7+ installed. Clone this repository and navigate to its directory:
```bash
git clone https://github.com/TravelXML/Amazon-Product-Reviews-Sentiment-Analysis-in-Python.git
cd Amazon-Product-Reviews-Sentiment-Analysis-in-Python
```
### Install Dependencies
Use pip to install the necessary Python libraries:
```bash
pip install -r requirements.txt
```
## 📊 Usage
1. **Download the Dataset**: Ensure the Amazon product reviews dataset is placed in the `data/` directory. The dataset should be in CSV format.
2. **Run the Notebook**: Open and execute `az_senti_analysis.ipynb` in Jupyter Notebook or JupyterLab to perform sentiment analysis.
3. **Visualize Results**: Explore the generated visualizations to understand the sentiment distribution across the dataset.
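
Before running the notebook, it can help to check that the CSV loads and to see how the labels are distributed. The sketch below is only an assumption about the data layout: the column names `review_text` and `rating`, the inline sample rows, and the rule that ratings of 4 or 5 count as positive are all hypothetical, so adjust them to match your dataset.

```python
import io
import pandas as pd

# Inline stand-in for a CSV file in data/; the column names are hypothetical.
csv_text = io.StringIO(
    "review_text,rating\n"
    "Works great and arrived early,5\n"
    "Stopped working after a week,1\n"
    "Decent value for the price,4\n"
    ",2\n"
)

# With a real file, pass its path to pd.read_csv instead of csv_text.
df = pd.read_csv(csv_text)

# Drop rows that have no review text.
df = df.dropna(subset=["review_text"])

# Assumed labelling rule: ratings of 4-5 are positive (1), the rest negative (0).
df["sentiment"] = (df["rating"] >= 4).astype(int)

# Number of reviews per sentiment label.
label_counts = df["sentiment"].value_counts().to_dict()
```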
## 🎯 Example Outputs
### Word Cloud
Visualize the most frequent words in positive and negative reviews:
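
A word cloud is driven by word frequencies, so one way to sketch this step is to count tokens first and render second. The helper names, the sample reviews, and the image size below are assumptions for illustration; `save_wordcloud` needs the `wordcloud` package and is defined here but not called.

```python
from collections import Counter

def word_frequencies(texts):
    """Count how often each whitespace-separated token appears across texts."""
    counts = Counter()
    for text in texts:
        counts.update(text.lower().split())
    return counts

def save_wordcloud(frequencies, path):
    """Render frequencies as a word cloud image (requires the wordcloud package)."""
    from wordcloud import WordCloud  # imported lazily so counting works without it
    cloud = WordCloud(width=800, height=400, background_color="white")
    cloud.generate_from_frequencies(dict(frequencies))
    cloud.to_file(path)

# Made-up, already-cleaned positive reviews for illustration.
positive_reviews = ["great product great price", "love this product"]
positive_freqs = word_frequencies(positive_reviews)
top_words = positive_freqs.most_common(2)
```

Calling `save_wordcloud(positive_freqs, "positive_wordcloud.png")` (a hypothetical file name) would write the image; repeating the same steps on negative reviews gives the second cloud.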

### Confusion Matrix
Evaluate model performance with a confusion matrix:
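
To make the cells of the matrix concrete, here is a small pure-Python sketch that builds a 2x2 confusion matrix and the accuracy from made-up labels. The function name and the sample labels are hypothetical. The layout (rows are true labels, columns are predicted labels) follows the same convention as scikit-learn's `confusion_matrix`, which together with `accuracy_score` computes the same quantities.

```python
from collections import Counter

def confusion_matrix_2x2(y_true, y_pred):
    """Return [[TN, FP], [FN, TP]] for binary labels (0 = negative, 1 = positive)."""
    pairs = Counter(zip(y_true, y_pred))
    return [
        [pairs[(0, 0)], pairs[(0, 1)]],  # true negatives, false positives
        [pairs[(1, 0)], pairs[(1, 1)]],  # false negatives, true positives
    ]

# Made-up true and predicted labels for six reviews.
y_true = [1, 0, 1, 1, 0, 0]
y_pred = [1, 0, 0, 1, 1, 0]

matrix = confusion_matrix_2x2(y_true, y_pred)

# Accuracy is the share of predictions on the diagonal (TN + TP).
accuracy = (matrix[0][0] + matrix[1][1]) / len(y_true)
```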

## 🤝 Contributing
Contributions are welcome! Whether it's fixing bugs, improving the documentation, or adding new features, feel free to open a pull request or submit an issue.
## 📧 Contact
For questions or collaborations, reach out via [LinkedIn](https://www.linkedin.com/in/the-startup-cto/).

Happy Coding!