https://github.com/r-mahesh45/text-mining-assignment
This project performs sentiment analysis on Elon Musk's tweets and emotion mining on product reviews from an e-commerce website. It involves data preprocessing techniques such as stemming, lemmatization, and removing stop words. The goal is to extract meaningful insights and classify text based on sentiment and emotion.
extract-transform-load lemmatization nltk-python python3 text-mining
- Host: GitHub
- URL: https://github.com/r-mahesh45/text-mining-assignment
- Owner: R-Mahesh45
- Created: 2024-03-07T10:12:34.000Z
- Default Branch: main
- Last Pushed: 2025-01-04T14:26:10.000Z
- Last Synced: 2025-01-30T07:16:10.601Z
- Topics: extract-transform-load, lemmatization, nltk-python, python3, text-mining
- Language: Jupyter Notebook
- Homepage:
- Size: 810 KB
- Stars: 0
- Watchers: 1
- Forks: 0
- Open Issues: 0
Metadata Files:
- Readme: README.md
# Text Mining Assignment
This repository contains two text mining projects:
### **1. Sentiment Analysis on Elon Musk's Tweets**
- Perform sentiment analysis on a dataset of tweets by Elon Musk (`Elon-musk.csv`).
- The goal is to classify the tweets as positive, negative, or neutral based on the text content.
### **2. Emotion Mining on Product Reviews**
- Extract product reviews from an e-commerce website (e.g., Amazon).
- Perform emotion mining by processing the text to remove noise such as emojis, punctuation, and stop words.
- Use Natural Language Processing (NLP) techniques like stemming and lemmatization to extract meaningful insights from the reviews.
---
## **Project Structure**
```
├── data
│ ├── Elon-musk.csv
│ └── reviews.csv
├── scripts
│ ├── sentiment_analysis.py
│ └── emotion_mining.py
├── README.md
└── requirements.txt
```
---
## **Getting Started**
### **Prerequisites**
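This project depends on the following Python libraries:
- `nltk`
- `pandas`
- `scikit-learn` (imported as `sklearn`)

The scripts also use `re`, which is part of the Python standard library and needs no installation.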
You can install the necessary dependencies by running:
```bash
pip install -r requirements.txt
```
### **Sentiment Analysis (Elon Musk Tweets)**
1. Load the `Elon-musk.csv` file containing the tweets.
2. Perform data preprocessing (remove stop words, tokenize, etc.).
3. Apply a sentiment analysis model to classify each tweet as positive, negative, or neutral.
4. Evaluate the model performance using classification metrics.
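The four steps above can be sketched roughly as follows. This is a minimal illustration, not the code in `sentiment_analysis.py`: the stop-word list, the `LEXICON` scores, the helper names, and the sample tweets are all made up for the example, and it uses only the standard library. A real script would more likely rely on NLTK's stop-word corpus and a trained or lexicon-based model such as NLTK's `SentimentIntensityAnalyzer`.

```python
import re

# Tiny illustrative stop-word list and sentiment lexicon (hypothetical values).
STOP_WORDS = {"a", "an", "the", "is", "are", "to", "of", "and", "this", "it", "i"}
LEXICON = {"great": 1, "love": 1, "amazing": 1, "good": 1,
           "bad": -1, "terrible": -1, "hate": -1, "broken": -1}


def preprocess(text):
    """Lowercase, strip URLs and @mentions, tokenize, and drop stop words."""
    text = re.sub(r"https?://\S+|@\w+", " ", text.lower())
    tokens = re.findall(r"[a-z']+", text)
    return [t for t in tokens if t not in STOP_WORDS]


def classify(text):
    """Sum lexicon scores over the tokens and map the total to a label."""
    score = sum(LEXICON.get(token, 0) for token in preprocess(text))
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"


def accuracy(predicted, expected):
    """Fraction of predictions that match the reference labels."""
    matches = sum(p == e for p, e in zip(predicted, expected))
    return matches / len(expected)


# Made-up example tweets with hand-assigned labels, for illustration only.
samples = [
    ("The launch was amazing, love this rocket!", "positive"),
    ("Terrible delays, I hate waiting @someone", "negative"),
    ("Meeting at 10 am https://example.com", "neutral"),
]
predictions = [classify(text) for text, _ in samples]
print(predictions)
print(accuracy(predictions, [label for _, label in samples]))
```

Accuracy is only one possible metric; with labelled data, per-class precision and recall from scikit-learn would give a fuller picture.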
### **Emotion Mining (Product Reviews)**
1. Scrape product reviews, or load previously collected reviews from `reviews.csv`.
2. Preprocess the text by removing punctuation, emojis, and stop words.
3. Apply NLP techniques like stemming and lemmatization to the reviews.
4. Extract emotions and classify the reviews based on sentiment or emotion.
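As a rough sketch of these steps, the example below cleans a review, stems its tokens, and counts emotion words. Everything in it is an assumption made for illustration: `naive_stem` is a crude suffix stripper standing in for NLTK's `PorterStemmer` or `WordNetLemmatizer`, and the stop-word list and `RAW_LEXICON` are tiny hand-made samples, not the resources used by `emotion_mining.py`.

```python
import re
from collections import Counter

# Small illustrative stop-word list (hypothetical).
STOP_WORDS = {"a", "an", "the", "is", "was", "and", "i", "it",
              "this", "so", "with", "my"}

# Hypothetical word -> emotion mapping; a real project would use a full lexicon.
RAW_LEXICON = {"love": "joy", "delighted": "joy", "happy": "joy",
               "angry": "anger", "furious": "anger",
               "disappointed": "sadness", "sad": "sadness"}


def naive_stem(word):
    """Strip one common suffix, keeping at least three characters."""
    for suffix in ("ing", "ed", "es", "ly", "s", "e"):
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word


# Key the lexicon by stem so that "loved" and "loving" both match "love".
EMOTION_LEXICON = {naive_stem(w): e for w, e in RAW_LEXICON.items()}


def preprocess(review):
    """Lowercase, drop emojis/punctuation/digits, remove stop words, stem."""
    cleaned = re.sub(r"[^a-z\s]", " ", review.lower())
    return [naive_stem(t) for t in cleaned.split() if t not in STOP_WORDS]


def mine_emotions(review):
    """Count how often each lexicon emotion appears in one review."""
    return Counter(EMOTION_LEXICON[t] for t in preprocess(review)
                   if t in EMOTION_LEXICON)


def dominant_emotion(review):
    """Return the most frequent emotion, or "none" if nothing matched."""
    counts = mine_emotions(review)
    return counts.most_common(1)[0][0] if counts else "none"


# "\U0001F60D" is an emoji; it is removed together with the punctuation.
print(dominant_emotion("Loved it \U0001F60D, so delighted with the sound!!!"))
```

Stemming both the lexicon and the review tokens with the same function keeps the lookup consistent, even though the stems themselves (for example `lov`) are not real words.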
---
## **How to Use**
### **Sentiment Analysis**
Run the sentiment analysis script:
```bash
python scripts/sentiment_analysis.py
```
This script will process the `Elon-musk.csv` file and classify each tweet's sentiment.
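For orientation, loading the tweet texts might look something like the sketch below. The column name `Text`, the helper `load_tweets`, and the sample rows are assumptions, since the layout of `Elon-musk.csv` is not described here; the example reads from an in-memory string so it runs without the data file.

```python
import csv
import io

# Stand-in for data/Elon-musk.csv; the "Text" column name is an assumption.
SAMPLE_CSV = """id,Text
1,"Starship is ready, great work by the team"
2,"Traffic is terrible today"
"""


def load_tweets(handle, text_column="Text"):
    """Read tweet texts from an open CSV file, skipping empty rows."""
    reader = csv.DictReader(handle)
    texts = []
    for row in reader:
        text = (row.get(text_column) or "").strip()
        if text:
            texts.append(text)
    return texts


tweets = load_tweets(io.StringIO(SAMPLE_CSV))
print(tweets)
```

With the real file, the same function could be given a handle from `open("data/Elon-musk.csv", newline="")`, and `pandas.read_csv` would be an equally reasonable choice.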
### **Emotion Mining**
Run the emotion mining script:
```bash
python scripts/emotion_mining.py
```
This script will process the product reviews in `reviews.csv` and classify the emotions expressed in each review.
---
## **Results**
- **Sentiment Analysis**: Each tweet is printed with its predicted sentiment label (positive, negative, or neutral).
- **Emotion Mining**: The emotions detected in each review are extracted and displayed.
---
## **License**
This project is licensed under the MIT License - see the [LICENSE](LICENSE) file for details.
---
## **Acknowledgements**
- Natural Language Toolkit (NLTK) for text preprocessing and sentiment analysis.
- Scikit-learn for machine learning models and evaluation.
- Pandas for data manipulation.