https://github.com/venkat-0706/twalyze
Twitter sentiment analysis project using machine learning to classify tweets and understand audience mood, opinions, and behavior trends in real-time.
logistic-regression machine-learning model-evaluation naive-bayes-classifier pandas python scikitlearn-machine-learning tfidf-vectorizer tokenization
- Host: GitHub
- URL: https://github.com/venkat-0706/twalyze
- Owner: venkat-0706
- License: mit
- Created: 2025-04-05T04:09:59.000Z (about 1 year ago)
- Default Branch: main
- Last Pushed: 2025-05-15T10:59:38.000Z (12 months ago)
- Last Synced: 2025-10-26T08:10:50.480Z (7 months ago)
- Topics: logistic-regression, machine-learning, model-evaluation, naive-bayes-classifier, pandas, python, scikitlearn-machine-learning, tfidf-vectorizer, tokenization
- Language: Jupyter Notebook
- Homepage: https://colab.research.google.com/github/venkat-0706/Twalyze/blob/main/Twitter_Sentiment_Analysis_Using_Machine_Learning.ipynb
- Size: 24.4 KB
- Stars: 10
- Watchers: 1
- Forks: 1
- Open Issues: 0
Metadata Files:
- Readme: README.md
- License: LICENSE
README
# Twalyze
# Twitter Sentiment Analysis Using Machine Learning
This project aims to analyze and classify sentiments expressed in tweets related to various airlines. It applies machine learning techniques to categorize each tweet as **Positive**, **Negative**, or **Neutral** based on its textual content.
---
## Objective
To build a machine learning model that classifies the sentiment of tweets using Natural Language Processing (NLP) and vectorization techniques.
---
## Dataset
The dataset used is `train.csv`, containing the following key columns:
- `tweet_id`: Unique ID for each tweet
- `airline`: The airline company mentioned
- `airline_sentiment`: The sentiment label (positive, negative, neutral)
- `text`: The content of the tweet
---
## Libraries Used
- `pandas`, `numpy`: Data manipulation
- `matplotlib`, `seaborn`: Data visualization
- `nltk`: Text preprocessing
- `sklearn`: Machine learning models and evaluation
- `wordcloud`: Word cloud visualization
---
## Workflow
### 1. **Data Preprocessing**
- Lowercasing text
- Removing URLs, mentions, hashtags, and punctuation
- Removing stopwords
- Tokenization and Lemmatization (using NLTK)
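The cleaning steps listed above can be sketched with the standard library alone. This is a minimal illustration, not the notebook's code: the `clean_tweet` helper, the regular expressions, and the tiny `STOPWORDS` set are all assumptions made for the example. It also assumes that "removing hashtags" means dropping the whole tag, and it skips lemmatization, which the notebook performs with NLTK.

```python
import re
import string

# Tiny illustrative stopword set (assumption). The notebook relies on NLTK,
# e.g. nltk.corpus.stopwords.words("english"), which needs a one-time download.
STOPWORDS = {"a", "an", "the", "is", "are", "was", "to", "of", "and",
             "my", "for", "on", "in", "it", "i"}

URL_RE = re.compile(r"https?://\S+|www\.\S+")
MENTION_OR_HASHTAG_RE = re.compile(r"[@#]\w+")
PUNCT_TABLE = str.maketrans("", "", string.punctuation)


def clean_tweet(text: str) -> list[str]:
    """Return the cleaned tokens of a single tweet (hypothetical helper)."""
    text = text.lower()                          # 1. lowercase
    text = URL_RE.sub(" ", text)                 # 2. drop URLs
    text = MENTION_OR_HASHTAG_RE.sub(" ", text)  # 3. drop @mentions and #hashtags
    text = text.translate(PUNCT_TABLE)           # 4. drop remaining punctuation
    tokens = text.split()                        # 5. whitespace tokenization
    return [tok for tok in tokens if tok not in STOPWORDS]  # 6. drop stopwords


print(clean_tweet("@united My flight was delayed AGAIN! https://t.co/xyz #fail"))
# prints ['flight', 'delayed', 'again']
```

To add the lemmatization step, each surviving token could be passed through `nltk.stem.WordNetLemmatizer().lemmatize(tok)` once the WordNet corpus has been downloaded.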
### 2. **Exploratory Data Analysis**
- Visualizing sentiment distribution
- Airline-wise sentiment analysis
- Word clouds for each sentiment category
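The counts behind the sentiment-distribution and airline-wise charts can be computed with pandas before any plotting. The sketch below uses a handful of made-up rows in place of `train.csv`; only the column names come from the dataset description, everything else is an assumption for illustration.

```python
import io

import pandas as pd

# Hypothetical rows standing in for train.csv (assumption); only the
# column names are taken from the README.
SAMPLE_CSV = """tweet_id,airline,airline_sentiment,text
1,United,negative,flight delayed again
2,United,neutral,what time does boarding start
3,Delta,positive,great crew today
4,Delta,negative,lost my bag
5,United,negative,no refund offered
"""

df = pd.read_csv(io.StringIO(SAMPLE_CSV))

# Overall sentiment distribution: number of tweets per label.
sentiment_counts = df["airline_sentiment"].value_counts()

# Airline-wise breakdown: one row per airline, one column per sentiment.
by_airline = pd.crosstab(df["airline"], df["airline_sentiment"])

print(sentiment_counts)
print(by_airline)
```

Calling `by_airline.plot(kind="bar")` on the resulting table is one way to turn it into a grouped bar chart with matplotlib.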
### 3. **Feature Extraction**
- TF-IDF Vectorization of cleaned text
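As a rough illustration of this step, scikit-learn's `TfidfVectorizer` turns a list of cleaned strings into a sparse matrix with one row per tweet and one column per term. The three sample strings are invented, and the vectorizer is left at its default settings because the README does not state which parameters the notebook uses.

```python
from sklearn.feature_extraction.text import TfidfVectorizer

# Hypothetical cleaned tweets (assumption); the real input would be the
# preprocessed `text` column.
cleaned = [
    "flight delayed again",
    "great crew great flight",
    "lost bag",
]

vectorizer = TfidfVectorizer()         # defaults; the notebook's settings are not stated
X = vectorizer.fit_transform(cleaned)  # sparse matrix: tweets x vocabulary terms

vocab = vectorizer.vocabulary_         # maps each term to its column index
print(X.shape)
print(sorted(vocab))

# "flight" occurs in two tweets, "delayed" in one, so within the first tweet
# the rarer term receives the larger weight.
print(X[0, vocab["flight"]] < X[0, vocab["delayed"]])
```

The vectorizer should be fitted on the training split only and then applied to the test split with `transform`, so that no vocabulary or document-frequency information leaks from the test data.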
### 4. **Model Training**
Trained the following classifiers:
- Logistic Regression
- Naive Bayes
- Random Forest
- Support Vector Machine (SVM)
> **Best Model Training Accuracy: ~80%**
> **Best Model Testing Accuracy: ~79%**
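A training loop over the four classifier families might look like the sketch below. The toy tweets, the hyperparameters, the 75/25 split, and the choice of `SVC(kernel="linear")` for the SVM are all assumptions; the README does not specify them, and scores on twelve invented sentences say nothing about the accuracies reported above.

```python
from sklearn.ensemble import RandomForestClassifier
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import MultinomialNB
from sklearn.pipeline import make_pipeline
from sklearn.svm import SVC

# Hypothetical, already-cleaned tweets and labels (assumption).
texts = [
    "flight delayed again", "lost my bag",
    "terrible service rude staff", "cancelled flight no refund",
    "what time does boarding start", "is there wifi on this flight",
    "checking in for my flight", "where is the gate",
    "great crew today", "thanks for the smooth flight",
    "love the new seats", "amazing service thank you",
]
labels = ["negative"] * 4 + ["neutral"] * 4 + ["positive"] * 4

X_train, X_test, y_train, y_test = train_test_split(
    texts, labels, test_size=0.25, stratify=labels, random_state=42
)

# One estimator per classifier family named in the README.
models = {
    "Logistic Regression": LogisticRegression(max_iter=1000),
    "Naive Bayes": MultinomialNB(),
    "Random Forest": RandomForestClassifier(n_estimators=100, random_state=42),
    "SVM": SVC(kernel="linear"),
}

scores = {}
for name, clf in models.items():
    # TF-IDF is fitted on the training split only, inside the pipeline.
    pipe = make_pipeline(TfidfVectorizer(), clf)
    pipe.fit(X_train, y_train)
    scores[name] = (pipe.score(X_train, y_train), pipe.score(X_test, y_test))

for name, (train_acc, test_acc) in scores.items():
    print(f"{name}: train={train_acc:.2f} test={test_acc:.2f}")
```

Reporting training and testing accuracy side by side, as done here, is what makes a gap between the two (and therefore overfitting) visible.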
### 5. **Model Evaluation**
Used the following metrics:
- Accuracy
- Confusion Matrix
- Classification Report
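The three metrics above map directly onto functions in `sklearn.metrics`. In the sketch below, `y_true` and `y_pred` are invented label lists used only to show the calls; they are not results from the notebook.

```python
from sklearn.metrics import accuracy_score, classification_report, confusion_matrix

LABELS = ["negative", "neutral", "positive"]

# Hypothetical true and predicted labels (assumption), not notebook output.
y_true = ["negative", "negative", "negative", "neutral",
          "neutral", "positive", "positive", "positive"]
y_pred = ["negative", "negative", "neutral", "neutral",
          "negative", "positive", "positive", "neutral"]

acc = accuracy_score(y_true, y_pred)                   # fraction of exact matches
cm = confusion_matrix(y_true, y_pred, labels=LABELS)   # rows: true, columns: predicted

print(f"accuracy = {acc:.3f}")   # prints accuracy = 0.625
print(cm)
print(classification_report(y_true, y_pred, labels=LABELS, zero_division=0))
```

Accuracy alone can hide weak performance on a minority class; the per-class precision and recall in the classification report and the off-diagonal cells of the confusion matrix show which sentiments are being confused with each other.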
---
## Results
The best-performing model achieved:
- **Training Accuracy:** ~80%
- **Testing Accuracy:** ~79%
The gap of roughly one percentage point between training and testing accuracy suggests that the model generalizes to unseen tweets with little overfitting.
---
## Possible Improvements
- Integrate deep learning models like LSTM for better results
- Add real-time tweet scraping using Tweepy (Twitter API)
- Use Word2Vec or transformer-based embeddings (like BERT)
- Perform cross-validation for better model reliability
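The cross-validation idea from the list above could be sketched as follows, assuming a TF-IDF plus Logistic Regression pipeline and stratified folds. The nine toy tweets and the choice of three folds are assumptions for illustration, not part of the project.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import StratifiedKFold, cross_val_score
from sklearn.pipeline import make_pipeline

# Hypothetical cleaned tweets and labels (assumption): three per class.
texts = [
    "flight delayed again", "lost my bag", "terrible service",
    "what time is boarding", "where is the gate", "is there wifi",
    "great crew today", "love the new seats", "thanks for the smooth flight",
]
labels = ["negative"] * 3 + ["neutral"] * 3 + ["positive"] * 3

# Keeping the vectorizer inside the pipeline means it is refitted on the
# training part of every fold, so the held-out fold never influences it.
pipe = make_pipeline(TfidfVectorizer(), LogisticRegression(max_iter=1000))
cv = StratifiedKFold(n_splits=3, shuffle=True, random_state=42)

fold_scores = cross_val_score(pipe, texts, labels, cv=cv, scoring="accuracy")
print(fold_scores, fold_scores.mean())
```

The mean and spread of the per-fold scores give a steadier estimate than a single train/test split.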
---
## Visualizations
- Word clouds for Positive, Negative, Neutral tweets
- Bar charts for sentiment distribution across airlines
- Confusion matrices for each ML model
---
## Getting Started
1. Clone the repo:
```bash
git clone https://github.com/venkat-0706/Twalyze.git
````
2. Install dependencies:
```bash
pip install -r requirements.txt
```
3. Run the notebook:
Open the Colab notebook [here](https://colab.research.google.com/github/venkat-0706/Twalyze/blob/main/Twitter_Sentiment_Analysis_Using_Machine_Learning.ipynb)
---
## Contact
Created by [@venkat-0706](https://github.com/venkat-0706)
Feel free to reach out for suggestions or collaborations!