https://github.com/yash22222/bmc-product-reviews-web-scrapping-sentiment-analysis


# BMC Product Review Scrapping & Sentiment Analysis

This open source project performs **web scraping** and **sentiment analysis** on verified reviews of popular **BMC Software products** published on public platforms such as Gartner Peer Insights. It uses **Python NLP techniques** and **data visualization** to extract actionable insights from those reviews.

The project suits beginner and intermediate contributors who want hands-on experience with web scraping, NLP, data visualization, and open source collaboration.

It includes:
- Web scraping from [Gartner Peer Insights](https://www.gartner.com/reviews)
- Preprocessing text with NLP
- VADER-based sentiment scoring
- Charts, word clouds, and Excel exports

## Products Covered

We scrape verified reviews from the following Gartner pages:

| Product Name | Review Page |
|--------------|-------------|
| 🧠 BMC Helix ITSM | [Link](https://www.gartner.com/reviews/market/software-asset-management-tools/vendor/bmc/product/bmc-helix-itsm/reviews) |
| 📈 BMC Helix Operations Management | [Link](https://www.gartner.com/reviews/market/aiops-platforms/vendor/bmc/product/bmc-helix-operations-management-with-aiops/reviews) |
| ⚙️ TrueSight Server Automation | [Link](https://www.gartner.com/reviews/market/integrated-systems/vendor/bmc/product/bmc-truesight-automation-for-servers/reviews) |
| 📊 Control-M | [Link](https://www.gartner.com/reviews/market/service-orchestration-and-automation-platforms/vendor/bmc/product/bmc-control-m/reviews) |
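
As a starting point for the scraper, the sketch below loads one of the pages listed above in a headless browser and returns the rendered HTML. It assumes Playwright for Python is installed together with a Chromium build. The names `PRODUCT_PAGES` and `fetch_rendered_html` are hypothetical and not taken from the repository. Extracting individual review fields is left out on purpose, because the page markup is not described here and may change.

```python
"""Sketch: load one review page in a headless browser and return its HTML."""

# URLs copied from the product table; the dictionary name is illustrative.
PRODUCT_PAGES = {
    "BMC Helix ITSM": "https://www.gartner.com/reviews/market/software-asset-management-tools/vendor/bmc/product/bmc-helix-itsm/reviews",
    "BMC Helix Operations Management": "https://www.gartner.com/reviews/market/aiops-platforms/vendor/bmc/product/bmc-helix-operations-management-with-aiops/reviews",
    "TrueSight Server Automation": "https://www.gartner.com/reviews/market/integrated-systems/vendor/bmc/product/bmc-truesight-automation-for-servers/reviews",
    "Control-M": "https://www.gartner.com/reviews/market/service-orchestration-and-automation-platforms/vendor/bmc/product/bmc-control-m/reviews",
}


def fetch_rendered_html(url: str, timeout_ms: int = 30_000) -> str:
    """Return the page HTML after the browser has rendered it."""
    # Imported lazily so this module loads even without Playwright installed.
    from playwright.sync_api import sync_playwright

    with sync_playwright() as p:
        browser = p.chromium.launch(headless=True)
        try:
            page = browser.new_page()
            page.goto(url, timeout=timeout_ms)
            return page.content()
        finally:
            # Always release the browser, even if navigation fails.
            browser.close()


# Example call (requires network access and an installed browser):
# html = fetch_rendered_html(PRODUCT_PAGES["Control-M"])
```

Keeping the URLs in one mapping means a new product can be added without touching the fetch logic.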

---

## Output Format

Your final analysis should look like this (in Excel or CSV):

| Product Name | Review Title | Overall Rating | Industry | Function | Date | Other Vendors | Country | Pros | Cons | Overall Comment | Sentiment |
|--------------|--------------|----------------|----------|----------|------|----------------|---------|------|------|------------------|-----------|
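
One way to produce this layout as CSV is sketched below using only the standard library. The column names come from the table above. The helper name `write_reviews_csv` and the sample row are illustrative placeholders, not real scraped data.

```python
import csv
import io

# Column names taken from the output table.
COLUMNS = [
    "Product Name", "Review Title", "Overall Rating", "Industry", "Function",
    "Date", "Other Vendors", "Country", "Pros", "Cons", "Overall Comment",
    "Sentiment",
]


def write_reviews_csv(rows, handle):
    """Write review dicts to an open text handle, one row per review.

    Missing fields become empty cells; unknown keys raise ValueError.
    """
    writer = csv.DictWriter(handle, fieldnames=COLUMNS, restval="")
    writer.writeheader()
    writer.writerows(rows)


# Placeholder row for illustration only.
example = {"Product Name": "Control-M", "Review Title": "<title>", "Sentiment": "Positive"}
buffer = io.StringIO()
write_reviews_csv([example], buffer)
print(buffer.getvalue())
```

To write to disk instead of a buffer, open the target with `open(path, "w", newline="", encoding="utf-8")` and pass that handle.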

Visuals like pie charts and word clouds should be stored in the `outputs/` folder.
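
A minimal sketch of saving such a chart, assuming Matplotlib is installed. The function names and the default file name `outputs/sentiment_pie.png` are assumptions made for illustration.

```python
from collections import Counter
from pathlib import Path


def sentiment_counts(labels):
    """Count how many reviews fall under each sentiment label."""
    return Counter(labels)


def save_sentiment_pie(labels, out_path="outputs/sentiment_pie.png"):
    """Draw a pie chart of the label distribution and save it as an image."""
    # Imported lazily; the Agg backend renders without a display.
    import matplotlib
    matplotlib.use("Agg")
    import matplotlib.pyplot as plt

    counts = sentiment_counts(labels)
    out_file = Path(out_path)
    # Create the outputs/ folder if it does not exist yet.
    out_file.parent.mkdir(parents=True, exist_ok=True)

    fig, ax = plt.subplots()
    ax.pie(list(counts.values()), labels=list(counts.keys()), autopct="%1.1f%%")
    ax.set_title("Sentiment distribution")
    fig.savefig(out_file, bbox_inches="tight")
    plt.close(fig)
    return out_file


# Example call: save_sentiment_pie(["Positive", "Negative", "Positive"])
```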

---

## 📦 Example Directory Structure
```text
BMC-Product-Review-Scrapping-and-Sentiment-Analysis/
│
├── 📂 data/                   # Sample scraped data files (Excel/CSV)
├── 📂 notebooks/              # Jupyter notebooks for quick experimentation
├── 📂 scripts/
│   ├── scraper.py             # Scraper module
│   ├── nlp_preprocessing.py   # Text cleaning + POS + lemmatization
│   ├── sentiment.py           # VADER-based sentiment scoring
│   └── visualize.py           # Word clouds, pie charts, bar graphs
│
├── 📂 outputs/                # Saved images, processed files
│
├── requirements.txt           # Install dependencies
├── README.md                  # Project overview
├── CONTRIBUTING.md            # Contribution guidelines
├── LICENSE                    # Open-source license
└── .gitignore
```

---

## 🧠 Key Features

1. Robust product review scraper for BMC products
2. Text cleaning with:
   - Tokenization
   - Lemmatization
   - POS tagging
   - Stopword removal
3. Sentiment classification using VADER
4. Sentiment reports and dashboards
5. Modular structure for easy expansion and contributions
6. Export of the analysis to Excel and visual graphs
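
The text-cleaning steps listed above could be combined roughly as follows. This is a sketch that assumes NLTK is installed and that its tokenizer, POS tagger, stopword and WordNet data have already been fetched with `nltk.download(...)` (the exact resource names depend on the NLTK version). The function names are hypothetical and do not describe the actual contents of `nlp_preprocessing.py`.

```python
def penn_to_wordnet(tag: str) -> str:
    """Map a Penn Treebank POS tag to the WordNet POS code used by the lemmatizer."""
    if tag.startswith("J"):
        return "a"  # adjective
    if tag.startswith("V"):
        return "v"  # verb
    if tag.startswith("R"):
        return "r"  # adverb
    return "n"      # noun, also used as the fallback


def preprocess(text: str) -> list[str]:
    """Tokenize, POS-tag, drop stopwords and lemmatize one review."""
    # Lazy imports: NLTK and its data packages must be installed first.
    from nltk import pos_tag, word_tokenize
    from nltk.corpus import stopwords
    from nltk.stem import WordNetLemmatizer

    stop_words = set(stopwords.words("english"))
    lemmatizer = WordNetLemmatizer()

    tokens = word_tokenize(text.lower())
    tagged = pos_tag(tokens)
    # Keep alphabetic, non-stopword tokens and lemmatize them with their POS.
    return [
        lemmatizer.lemmatize(word, penn_to_wordnet(tag))
        for word, tag in tagged
        if word.isalpha() and word not in stop_words
    ]


# Example call: preprocess("The dashboards were loading slowly")
```

Passing the POS tag to the lemmatizer matters because WordNet treats every word as a noun by default, so verb forms would otherwise stay unreduced.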

---

## 🚀 Tech Stack

- **Python 3.x**
- **Selenium / Playwright** (for scraping)
- **NLTK, VADER** (for sentiment)
- **Pandas, Matplotlib, WordCloud**
- **Excel output (xlsxwriter/openpyxl)**
- **Any**

---

## 🛠️ Getting Started

### 🔧 Installation

```bash
git clone https://github.com/Yash22222/BMC-Product-Review-Scrapping-and-Sentiment-Analysis.git
cd BMC-Product-Review-Scrapping-and-Sentiment-Analysis
pip install -r requirements.txt
```

### 📊 Run Sentiment Analysis

1. Scrape reviews using the `scraper.py` script.
2. Clean and preprocess with `nlp_preprocessing.py`.
3. Analyze sentiment using `sentiment.py`.
4. Visualize using `visualize.py`.
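
For the sentiment step, a minimal sketch using NLTK's VADER implementation is shown below. It assumes the `vader_lexicon` resource has been downloaded. The ±0.05 cut-offs on the compound score follow the commonly used VADER convention. The label names and function names are assumptions for illustration, not the confirmed behaviour of `sentiment.py`.

```python
def label_from_compound(compound: float) -> str:
    """Turn a VADER compound score in [-1, 1] into a sentiment label."""
    if compound >= 0.05:
        return "Positive"
    if compound <= -0.05:
        return "Negative"
    return "Neutral"


def score_review(text: str) -> str:
    """Score one review with VADER and return its label."""
    # Lazy import: requires NLTK plus the vader_lexicon data package.
    from nltk.sentiment.vader import SentimentIntensityAnalyzer

    analyzer = SentimentIntensityAnalyzer()
    scores = analyzer.polarity_scores(text)  # keys: neg, neu, pos, compound
    return label_from_compound(scores["compound"])


# Example call: score_review("Setup was painless and support responded quickly.")
```

When scoring many reviews, creating the analyzer once and reusing it avoids reloading the lexicon for every call.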

---

## 🤝 How to Contribute (for GSSoC'25)

We welcome contributions from **GSSoC contributors and all open source enthusiasts**!

### Steps to Contribute

1. **Fork** the repository
2. **Clone** your fork

```bash
git clone https://github.com/YOUR_USERNAME/BMC-Product-Review-Scrapping-and-Sentiment-Analysis.git
```
3. Create a feature branch, for example with `git checkout -b feature/your-feature-name`
4. Commit your changes

```bash
git commit -m "✨ Added sentiment model for XYZ"
```
5. Push to your fork

```bash
git push origin feature/your-feature-name
```
6. Open a **Pull Request** with a clear explanation.

## 🧠 Contribution Ideas

| Type | Ideas |
| ----------------------------- | --------------------------------------- |
| 🔄 Add new BMC products | Expand the scraper |
| 🎨 Streamlit UI | Upload reviews & analyze sentiment |
| 🧾 PDF/Excel report generator | Auto reports for each product |
| 🤖 Add BERT | Use HuggingFace transformer models |
| 🌐 Multi-language support | Translate & analyze non-English reviews |
| 🛠 Docker Support | Add Dockerfile for easy setup |

---

## 📜 License

This project is licensed under the **MIT License**.

---

## 🙌 Credits

* Proudly open for contributions under GSSoC 2025
