{"id":49817598,"url":"https://github.com/samehinttech/sentiment-analysis-customer-reviews","last_synced_at":"2026-05-13T08:11:10.138Z","repository":{"id":328093066,"uuid":"1110488439","full_name":"samehinttech/sentiment-analysis-customer-reviews","owner":"samehinttech","description":null,"archived":false,"fork":false,"pushed_at":"2026-02-20T06:46:58.000Z","size":2710,"stargazers_count":0,"open_issues_count":0,"forks_count":0,"subscribers_count":0,"default_branch":"main","last_synced_at":"2026-03-17T02:32:09.883Z","etag":null,"topics":[],"latest_commit_sha":null,"homepage":"","language":"Jupyter Notebook","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"apache-2.0","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/samehinttech.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null,"zenodo":null,"notice":null,"maintainers":null,"copyright":null,"agents":null,"dco":null,"cla":null}},"created_at":"2025-12-05T09:14:39.000Z","updated_at":"2025-12-13T17:31:52.000Z","dependencies_parsed_at":null,"dependency_job_id":null,"html_url":"https://github.com/samehinttech/sentiment-analysis-customer-reviews","commit_stats":null,"previous_names":["samehinttech/sentiment-analysis-customer-reviews"],"tags_count":0,"template":false,"template_full_name":null,"purl":"pkg:github/samehinttech/sentiment-analysis-customer-reviews","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/samehinttech%2Fsentiment-analysis-customer-reviews","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/samehinttech%2Fsentiment-analysis-customer-reviews/tags","releases_url":"https://repos.ecosyste.ms/api/v1/host
s/GitHub/repositories/samehinttech%2Fsentiment-analysis-customer-reviews/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/samehinttech%2Fsentiment-analysis-customer-reviews/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/samehinttech","download_url":"https://codeload.github.com/samehinttech/sentiment-analysis-customer-reviews/tar.gz/refs/heads/main","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/samehinttech%2Fsentiment-analysis-customer-reviews/sbom","scorecard":null,"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":286080680,"owners_count":32973443,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2026-05-13T06:31:55.726Z","status":"ssl_error","status_checked_at":"2026-05-13T06:31:51.336Z","response_time":115,"last_error":"SSL_read: unexpected eof while reading","robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":false,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":[],"created_at":"2026-05-13T08:11:09.374Z","updated_at":"2026-05-13T08:11:10.130Z","avatar_url":"https://github.com/samehinttech.png","language":"Jupyter Notebook","funding_links":[],"categories":[],"sub_categories":[],"readme":"# sentiment-analysis-customer-reviews\n\n[![License](https://img.shields.io/badge/License-Apache%202.0-blue.svg)](LICENSE)\n![Python](https://img.shields.io/badge/python-3.12%2B-blue.svg)\n[![developed with 
PyCharm](https://img.shields.io/badge/IDE-PyCharm-green?logo=pycharm\u0026logoColor=white)](https://www.jetbrains.com/pycharm/)\n![Jupyter](https://img.shields.io/badge/Notebook-Jupyter-orange.svg)\n[![Last Commit](https://img.shields.io/github/last-commit/samehinttech/sentiment-analysis-customer-reviews?color=purple)](https://github.com/samehinttech/sentiment-analysis-customer-reviews/commits/main)\n\n## Project Overview\n\nThis repository contains the deliverables for a group project completed by BIT students at the **FHNW University of Applied Sciences and Arts Northwestern Switzerland**.\n\nThe project focuses on a BI and data analytics solution built on a real-world customer feedback dataset. The primary goal is to apply data science and Natural Language Processing (NLP) techniques to extract actionable business insights.\n\n---\n\n## Implementation\n\n### Pipeline Overview\n\n```\nRaw Reviews → Text Preprocessing → Feature Extraction → Sentiment Classification → Feature Analysis → Export\n```\n\n### Notebook Structure\n\n1. **Part 1** – Libraries Import\n2. **Part 2** – Exploratory Data Analysis (EDA)\n3. **Part 3** – Text Preprocessing (cleaning, tokenization, lemmatization)\n4. **Part 4** – Feature Extraction (TF-IDF vectorization)\n5. **Part 5** – Sentiment Classification Models (VADER, NB, LR, BERT)\n6. **Part 6** – Topic Modeling (LDA) \u0026 Feature-Based Sentiment Analysis\n7. **Part 7** – Export Processed Data\n8. 
**Part 8** – Conclusion\n\n---\n\n## Technology Stack\n\n### NLP \u0026 Text Processing\n\n- **NLTK** – Tokenization, stopword removal, lemmatization\n- **TF-IDF** – Feature extraction for ML models\n- **WordCloud** – Vocabulary visualization\n\n### Sentiment Analysis Models\n\n- **VADER** – Rule-based baseline (79.64% accuracy)\n- **Naive Bayes** – Classical ML (100% accuracy)\n- **Logistic Regression** – Classical ML (100% accuracy)\n- **BERT** – Transformer model (91.76% accuracy)\n\n### Topic Modeling\n\n- **LDA (Latent Dirichlet Allocation)** – Discover topics in reviews\n\n### Libraries\n\n- **pandas, numpy** – Data manipulation\n- **matplotlib, seaborn** – Visualization\n- **scikit-learn** – ML models, TF-IDF, evaluation\n- **transformers, torch** – BERT model\n- **vaderSentiment** – VADER baseline\n\n---\n\n## Quick Start\n\n### Prerequisites\n\n- Python 3.12+\n- NVIDIA GPU (optional, for faster BERT inference)\n\n### Installation\n\n1. **Clone the repository**\n   ```bash\n   git clone https://github.com/samehinttech/sentiment-analysis-customer-reviews.git\n   cd sentiment-analysis-customer-reviews\n   ```\n\n2. **Create virtual environment**\n   ```bash\n   python -m venv .venv\n   ```\n\n3. **Activate virtual environment**\n   ```bash\n   .\\.venv\\Scripts\\Activate.ps1\n   ```\n\n4. **Install dependencies**\n   ```bash\n   pip install -r requirements.txt\n   ```\n\n5. **GPU Support (Optional)**\n   ```bash\n   # For NVIDIA RTX 30/40 series (CUDA 12.4)\n   pip install torch torchvision torchaudio --index-url https://download.pytorch.org/whl/cu124\n   \n   # For NVIDIA RTX 50 series (CUDA 13.0)\n   pip install --pre torch torchvision --index-url https://download.pytorch.org/whl/nightly/cu130\n   ```\n\n6. 
**Run the notebook**\n   ```bash\n   jupyter notebook notebooks/sentiment_analysis.ipynb\n   ```\n\n\u003e **Note:** The BERT model downloads automatically on first run (~500 MB).\n\u003e \n\u003e **Important:** The notebook is designed to run from start to finish without interruption.\n\u003e Run all cells in order so that later steps have the data they depend on.\n\u003e Some steps (such as BERT inference) may take a while, depending on your hardware.\n\n---\n## References\n\n### Official Tutorials\n- [TensorFlow: Basic Text Classification (Sentiment Analysis)](https://www.tensorflow.org/tutorials/keras/text_classification) \n- [TensorFlow: Classify Text with BERT](https://www.tensorflow.org/text/tutorials/classify_text_with_bert)\n- [TensorFlow Hub: Text Classification with Movie Reviews](https://www.tensorflow.org/hub/tutorials/tf2_text_classification)\n- [Hugging Face: Getting Started with Sentiment Analysis](https://huggingface.co/blog/sentiment-analysis-python)\n\n### Dataset\n\n- [Customer Sentiment Dataset on Kaggle](https://www.kaggle.com/datasets/kundanbedmutha/customer-sentiment-dataset)\n\n### Official Documentation\n\n- [Python Documentation](https://docs.python.org/3.13/contents.html)\n- [Pandas Documentation](https://pandas.pydata.org/docs/user_guide/index.html)\n- [Seaborn Documentation](https://seaborn.pydata.org/tutorial.html)\n- [Matplotlib Documentation](https://matplotlib.org/stable/users/index.html)\n- [Scikit-learn Documentation](https://scikit-learn.org/stable/user_guide.html)\n- [NLTK Documentation](https://www.nltk.org/)\n- [spaCy Documentation](https://spacy.io/usage)\n- [Transformers Documentation](https://huggingface.co/docs/transformers/index)\n- [Hugging Face Models](https://huggingface.co/nlptown/bert-base-multilingual-uncased-sentiment)\n- [TextBlob Documentation](https://textblob.readthedocs.io/en/dev/quickstart.html#sentiment-analysis)\n- [VADER Sentiment 
Analysis](https://vadersentiment.readthedocs.io/en/latest/pages/features_and_updates.html)\n\n---\n## Acknowledgement\nWe would like to thank our teacher for his guidance and support throughout\nthis project. The teaching materials and tutorials provided were instrumental\nin completing this work successfully.","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fsamehinttech%2Fsentiment-analysis-customer-reviews","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fsamehinttech%2Fsentiment-analysis-customer-reviews","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fsamehinttech%2Fsentiment-analysis-customer-reviews/lists"}