https://github.com/samehinttech/sentiment-analysis-customer-reviews


Host: GitHub
URL: https://github.com/samehinttech/sentiment-analysis-customer-reviews
Owner: samehinttech
License: apache-2.0
Created: 2025-12-05T09:14:39.000Z (7 months ago)
Default Branch: main
Last Pushed: 2026-02-20T06:46:58.000Z (4 months ago)
Last Synced: 2026-03-17T02:32:09.883Z (3 months ago)
Language: Jupyter Notebook
Homepage:
Size: 2.58 MB
Stars: 0
Watchers: 0
Forks: 0
Open Issues: 0
Metadata Files:
- Readme: README.md
- License: LICENSE

# sentiment-analysis-customer-reviews

[![License](https://img.shields.io/badge/License-Apache%202.0-blue.svg)](LICENSE)
![Python](https://img.shields.io/badge/python-3.12%2B-blue.svg)
[![developed with PyCharm](https://img.shields.io/badge/IDE-PyCharm-green?logo=pycharm&logoColor=white)](https://www.jetbrains.com/pycharm/)
![Jupyter](https://img.shields.io/badge/Notebook-Jupyter-orange.svg)
[![Last Commit](https://img.shields.io/github/last-commit/samehinttech/sentiment-analysis-customer-reviews?color=purple)](https://github.com/samehinttech/sentiment-analysis-customer-reviews/commits/main)

## Project Overview

This repository contains the deliverables for a group project completed by BIT students at the **FHNW University of Applied Sciences and Arts Northwestern Switzerland**.

The project builds a business intelligence (BI) and data analytics solution on a real-world customer feedback dataset. Its primary goal is to apply data science and Natural Language Processing (NLP) techniques to extract actionable business insights.

---

## Implementation

### Pipeline Overview

```
Raw Reviews → Text Preprocessing → Feature Extraction → Sentiment Classification → Feature Analysis → Export
```

### Notebook Structure

1. **Part 1** – Libraries Import
2. **Part 2** – Exploratory Data Analysis (EDA)
3. **Part 3** – Text Preprocessing (cleaning, tokenization, lemmatization)
4. **Part 4** – Feature Extraction (TF-IDF vectorization)
5. **Part 5** – Sentiment Classification Models (VADER, NB, LR, BERT)
6. **Part 6** – Topic Modeling (LDA) & Feature-Based Sentiment Analysis
7. **Part 7** – Export Processed Data
8. **Part 8** – Conclusion
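
Parts 4 and 5 can be sketched roughly as follows. This is a minimal illustration, not the notebook's actual code: the toy reviews, the labels, the 75/25 split, and the `max_iter` value are all assumptions made for the example.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline

# Toy data, invented for illustration only (not from the Kaggle dataset).
texts = [
    "great product fast delivery", "love it works perfectly",
    "excellent quality very happy", "amazing service highly recommend",
    "terrible quality broke quickly", "awful service never again",
    "poor packaging arrived damaged", "bad experience very disappointed",
]
labels = ["positive"] * 4 + ["negative"] * 4

# Hold out a stratified test set so accuracy is measured on unseen reviews.
X_train, X_test, y_train, y_test = train_test_split(
    texts, labels, test_size=0.25, stratify=labels, random_state=42
)

# TF-IDF feature extraction followed by a Logistic Regression classifier.
model = make_pipeline(TfidfVectorizer(), LogisticRegression(max_iter=1000))
model.fit(X_train, y_train)

accuracy = model.score(X_test, y_test)
predictions = model.predict(X_test)
print(f"held-out accuracy: {accuracy:.2f}")
```

Wrapping the vectorizer and classifier in one pipeline means the TF-IDF vocabulary is learned from the training split only, so the held-out score is not influenced by test-set text.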

---

## Technology Stack

### NLP & Text Processing

- **NLTK** – Tokenization, stopword removal, lemmatization
- **TF-IDF** – Feature extraction for ML models
- **WordCloud** – Vocabulary visualization
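
As a rough picture of what the preprocessing step does, here is a standard-library sketch. It is an assumption-laden stand-in: the notebook uses NLTK, whereas this example uses a tiny hand-written stopword set and whitespace tokenization, and it omits lemmatization. The function name `preprocess` is hypothetical.

```python
import re

# Tiny illustrative stopword set; NLTK's English list is much larger.
STOPWORDS = {"a", "an", "and", "the", "is", "it", "was", "to", "of", "this", "i"}


def preprocess(text: str) -> list[str]:
    """Lowercase, strip URLs and non-letters, tokenize, and drop stopwords."""
    text = text.lower()
    text = re.sub(r"https?://\S+", " ", text)  # remove URLs
    text = re.sub(r"[^a-z\s]", " ", text)      # keep letters and whitespace only
    tokens = text.split()                       # simple whitespace tokenization
    return [t for t in tokens if t not in STOPWORDS]


print(preprocess("The delivery was FAST, and I love it! https://example.com"))
# ['delivery', 'fast', 'love']
```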

### Sentiment Analysis Models

- **VADER** – Rule-based baseline (79.64% accuracy)
- **Naive Bayes** – Classical ML (100% accuracy)
- **Logistic Regression** – Classical ML (100% accuracy)
- **BERT** – Transformer model (91.76% accuracy)
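
To make the accuracy figures concrete, the sketch below shows how a rule-based score can be turned into a label and compared against gold labels. It assumes the conventional VADER cut-offs of ±0.05 on the compound score; the scores and gold labels are invented, and whether the notebook uses exactly these thresholds is not stated. In the real library the compound score would come from `SentimentIntensityAnalyzer().polarity_scores(text)["compound"]`.

```python
def label_from_compound(compound: float, threshold: float = 0.05) -> str:
    """Map a VADER-style compound score in [-1, 1] to a sentiment label."""
    if compound >= threshold:
        return "positive"
    if compound <= -threshold:
        return "negative"
    return "neutral"


def accuracy(y_true: list[str], y_pred: list[str]) -> float:
    """Fraction of predictions that match the gold labels."""
    if not y_true or len(y_true) != len(y_pred):
        raise ValueError("inputs must be non-empty and of equal length")
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


# Hypothetical compound scores and gold labels, invented for illustration.
scores = [0.82, -0.61, 0.01, 0.30, -0.02]
gold = ["positive", "negative", "neutral", "negative", "neutral"]

predicted = [label_from_compound(s) for s in scores]
print(predicted)
print(f"accuracy: {accuracy(gold, predicted):.2%}")  # accuracy: 80.00%
```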

### Topic Modeling

- **LDA (Latent Dirichlet Allocation)** – Discover topics in reviews

### Libraries

- **pandas, numpy** – Data manipulation
- **matplotlib, seaborn** – Visualization
- **scikit-learn** – ML models, TF-IDF, evaluation
- **transformers, torch** – BERT model
- **vaderSentiment** – VADER baseline
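
The References section links to the `nlptown/bert-base-multilingual-uncased-sentiment` model, which predicts star ratings labelled `1 star` through `5 stars`. Assuming that model is the one used, its output has to be collapsed into sentiment classes before it can be compared with the other models. The mapping below (1-2 negative, 3 neutral, 4-5 positive) and the function name `stars_to_sentiment` are hypothetical; the notebook may map ratings differently.

```python
def stars_to_sentiment(label: str) -> str:
    """Collapse a star-rating label such as '4 stars' into a sentiment class."""
    stars = int(label.split()[0])  # leading integer of the label
    if not 1 <= stars <= 5:
        raise ValueError(f"unexpected rating label: {label!r}")
    if stars <= 2:
        return "negative"
    if stars == 3:
        return "neutral"
    return "positive"


for raw in ["1 star", "3 stars", "5 stars"]:
    print(raw, "->", stars_to_sentiment(raw))
```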

---

## Quick Start

### Prerequisites

- Python 3.12+
- NVIDIA GPU (optional, for faster BERT inference)

### Installation

1. **Clone the repository**
```bash
git clone https://github.com/samehinttech/sentiment-analysis-customer-reviews.git
cd sentiment-analysis-customer-reviews
```

2. **Create virtual environment**
```bash
python -m venv .venv
```

3. **Activate virtual environment** (Windows PowerShell shown; on macOS/Linux, run `source .venv/bin/activate` instead)
```powershell
.\.venv\Scripts\Activate.ps1
```

4. **Install dependencies**
```bash
pip install -r requirements.txt
```

5. **GPU Support (Optional)**
```bash
# For NVIDIA RTX 30/40 series (CUDA 12.4)
pip install torch torchvision torchaudio --index-url https://download.pytorch.org/whl/cu124

# For NVIDIA RTX 50 series (CUDA 13.0)
pip install --pre torch torchvision --index-url https://download.pytorch.org/whl/nightly/cu130
```

6. **Run the notebook**
```bash
jupyter notebook notebooks/sentiment_analysis.ipynb
```

> **Note:** The BERT model downloads automatically on first run (~500 MB).
>
> **Important:** The notebook is designed to run from start to finish without interruptions.
> Please execute all cells in order so that it works properly.
> Some steps, such as BERT inference, can take a while depending on your hardware.
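
Before starting a long run, it can help to check whether the optional GPU setup from step 5 took effect. The helper below is a small sketch; the name `pick_device` is hypothetical, and it falls back to the CPU when PyTorch is not installed or no CUDA device is visible.

```python
def pick_device() -> str:
    """Return "cuda" when PyTorch can see a GPU, otherwise "cpu"."""
    try:
        import torch
    except ImportError:
        return "cpu"  # PyTorch is not installed in this environment
    return "cuda" if torch.cuda.is_available() else "cpu"


print(f"BERT inference would run on: {pick_device()}")
```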

---

## References

### Official Tutorials
- [TensorFlow: Basic Text Classification (Sentiment Analysis)](https://www.tensorflow.org/tutorials/keras/text_classification)
- [TensorFlow: Classify Text with BERT](https://www.tensorflow.org/text/tutorials/classify_text_with_bert)
- [TensorFlow Hub: Text Classification with Movie Reviews](https://www.tensorflow.org/hub/tutorials/tf2_text_classification)
- [Hugging Face: Getting Started with Sentiment Analysis](https://huggingface.co/blog/sentiment-analysis-python)

### Dataset

- [Customer Sentiment Dataset on Kaggle](https://www.kaggle.com/datasets/kundanbedmutha/customer-sentiment-dataset)

### Official Documentation

- [Python Documentation](https://docs.python.org/3.13/contents.html)
- [Pandas Documentation](https://pandas.pydata.org/docs/user_guide/index.html)
- [Seaborn Documentation](https://seaborn.pydata.org/tutorial.html)
- [Matplotlib Documentation](https://matplotlib.org/stable/users/index.html)
- [Scikit-learn Documentation](https://scikit-learn.org/stable/user_guide.html)
- [NLTK Documentation](https://www.nltk.org/)
- [spaCy Documentation](https://spacy.io/usage)
- [Transformers Documentation](https://huggingface.co/docs/transformers/index)
- [Hugging Face Models](https://huggingface.co/nlptown/bert-base-multilingual-uncased-sentiment)
- [TextBlob Documentation](https://textblob.readthedocs.io/en/dev/quickstart.html#sentiment-analysis)
- [VADER Sentiment Analysis](https://vadersentiment.readthedocs.io/en/latest/pages/features_and_updates.html)

---

## Acknowledgement

We would like to thank our teacher for his guidance and support throughout
this project. The teaching materials and tutorials he provided were instrumental
in completing this work successfully.
