https://github.com/supriya811106/twitter-sentiment-analysis

Analyzing the mood of tweets! We sort tweets on popular topics into positive, negative, or neutral categories to gauge public opinion. See what Twitter really thinks!

bernoulli-naive-bayes jupyter-notebook matplotlib nlp-machine-learning nltk numpy pandas python scikit-learn seaborn sentiment-analysis text-classification tfidf-vectorizer wordcloud

Host: GitHub
URL: https://github.com/supriya811106/twitter-sentiment-analysis
Owner: supriya811106
Created: 2023-08-08T19:14:29.000Z (almost 3 years ago)
Default Branch: main
Last Pushed: 2023-09-25T12:34:51.000Z (almost 3 years ago)
Last Synced: 2025-05-21T17:11:36.037Z (about 1 year ago)
Topics: bernoulli-naive-bayes, jupyter-notebook, matplotlib, nlp-machine-learning, nltk, numpy, pandas, python, scikit-learn, seaborn, sentiment-analysis, text-classification, tfidf-vectorizer, wordcloud
Language: Jupyter Notebook
Homepage:
Size: 2.13 MB
Stars: 0
Watchers: 1
Forks: 0
Open Issues: 0
Metadata Files:
- Readme: README.md

# Twitter Sentiment Analysis

This project performs sentiment analysis on Twitter data using machine learning techniques. It includes data preprocessing, feature extraction, and classification to predict whether a tweet has a positive or negative sentiment.

## Table of Contents

- [Introduction](#introduction)
- [Data](#data)
- [Preprocessing](#preprocessing)
- [Feature Extraction](#feature-extraction)
- [Model Training and Evaluation](#model-training-and-evaluation)
- [Usage](#usage)
- [Dependencies](#dependencies)

## Introduction

Sentiment analysis, also known as opinion mining, is the process of determining the sentiment or emotional tone expressed in a piece of text, such as a tweet. In this project, we analyze tweets from Twitter and classify each one as expressing either positive or negative sentiment.

## Data

We used a Twitter dataset containing 1.6 million tweets, each labeled with its sentiment (0 for negative and 1 for positive). Each record includes the tweet text, a timestamp, and user information.

## Preprocessing

- Removed URLs from tweet text.
- Removed punctuation marks from tweet text.
- Removed stopwords (common words like "the," "and," "is") from tweet text.
- Cleaned repeated characters (e.g., "loooove" becomes "love").
- Removed numbers from tweet text.
- Removed rare words that do not contribute significantly to sentiment analysis.
- Removed special characters and extra white spaces.
- Tokenized the text and performed stemming and lemmatization.
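
The cleaning steps above can be sketched with the standard library alone. This is a minimal illustration, not the notebook's actual code: the function name `clean_tweet`, the regular expressions, and the tiny `STOPWORDS` set are assumptions made for the example. Stemming, lemmatization, and rare-word removal are left out to keep it short.

```python
import re
import string
from typing import List

# Tiny illustrative stopword set (assumption; the real list is much longer).
STOPWORDS = {"the", "and", "is", "a", "an", "to", "of", "in", "it", "this"}

URL_RE = re.compile(r"(https?://\S+|www\.\S+)")
NUMBER_RE = re.compile(r"\d+")
REPEAT_RE = re.compile(r"(.)\1{2,}")  # three or more of the same character
PUNCT_TABLE = str.maketrans("", "", string.punctuation)


def clean_tweet(text: str) -> List[str]:
    """Return the cleaned tokens of a single tweet."""
    text = text.lower()
    # Strip URLs first; otherwise removing punctuation would glue them
    # into one long pseudo-word.
    text = URL_RE.sub(" ", text)
    text = NUMBER_RE.sub(" ", text)
    text = text.translate(PUNCT_TABLE)
    # Collapse runs of 3+ identical characters: "loooove" -> "love".
    text = REPEAT_RE.sub(r"\1", text)
    # split() with no arguments also discards extra whitespace.
    return [token for token in text.split() if token not in STOPWORDS]


print(clean_tweet("I loooove this!!! http://t.co/abc 123"))  # ['i', 'love']
```

The order matters here: URLs and numbers are removed before punctuation so that their fragments do not survive as tokens.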

## Feature Extraction

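We used the TF-IDF (Term Frequency-Inverse Document Frequency) vectorization technique to convert the preprocessed text into numerical features. TF-IDF weights each term by how often it appears in a tweet and down-weights terms that appear across many tweets, producing one sparse feature vector per tweet that a machine learning model can consume.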
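
As a rough sketch of this step, scikit-learn's `TfidfVectorizer` can be fitted on a handful of already-cleaned tweets. The three sample strings and the default vectorizer settings are assumptions for illustration; the notebook may use different parameters.

```python
from sklearn.feature_extraction.text import TfidfVectorizer

# Toy, already-preprocessed tweets (made up for this example).
docs = ["love sunny day", "hate rainy day", "love love weekend"]

vectorizer = TfidfVectorizer()       # default settings, an assumption
X = vectorizer.fit_transform(docs)   # sparse matrix: one row per tweet

vocabulary = sorted(vectorizer.vocabulary_)  # learned terms, alphabetical
print(X.shape)      # (number of tweets, number of distinct terms)
print(vocabulary)
```

With the default settings, each row is L2-normalized, so tweets of different lengths end up on a comparable scale.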

## Model Training and Evaluation

We trained a Bernoulli Naive Bayes (BNB) classifier using the TF-IDF vectors as features. The model was evaluated using classification metrics and a ROC curve. The confusion matrix and ROC curve are visualized in the README for performance assessment.
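
The training and evaluation flow might look roughly like the sketch below. The toy tweets, the split size, and the default hyperparameters are all assumptions chosen to keep the example self-contained; the scores it prints say nothing about the results on the real dataset.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics import classification_report, confusion_matrix, roc_auc_score
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import BernoulliNB

# Toy data (made up): 1 = positive, 0 = negative.
texts = [
    "love this phone", "great day today", "happy with the service",
    "awesome movie tonight", "best coffee ever", "really enjoyed the show",
    "hate this traffic", "terrible day today", "sad about the news",
    "awful movie tonight", "worst coffee ever", "really bored at work",
]
labels = [1] * 6 + [0] * 6

# Stratify so both classes appear in the held-out set.
train_texts, test_texts, y_train, y_test = train_test_split(
    texts, labels, test_size=4, stratify=labels, random_state=42
)

# Fit the vectorizer on training text only to avoid leaking test vocabulary.
vectorizer = TfidfVectorizer()
X_train = vectorizer.fit_transform(train_texts)
X_test = vectorizer.transform(test_texts)

clf = BernoulliNB()  # default binarize=0.0 turns weights into presence/absence
clf.fit(X_train, y_train)

y_pred = clf.predict(X_test)
y_score = clf.predict_proba(X_test)[:, 1]  # probability of the positive class

cm = confusion_matrix(y_test, y_pred, labels=[0, 1])
auc = roc_auc_score(y_test, y_score)

print(cm)
print(classification_report(y_test, y_pred, zero_division=0))
print("ROC AUC:", auc)
```

One design point worth noting: `BernoulliNB` binarizes its input by default, so the model only sees whether a term occurs in a tweet, not the magnitude of its TF-IDF weight.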

## Usage

1. Clone this repository:
```bash
git clone https://github.com/supriya811106/Twitter-Sentiment-Analysis.git
```

2. Install the required dependencies (see the Dependencies section).

3. Run the Jupyter Notebook or Python script to preprocess the data, extract features, train the model, and perform sentiment analysis on new tweets.

4. Modify the code as needed for your specific use case.

## Dependencies

- Python 3.x
- Jupyter Notebook (optional, for running the project interactively)
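- pandas
- NumPy
- scikit-learn
- Matplotlib
- Seaborn
- NLTK (Natural Language Toolkit)
- WordCloud
- RegexpTokenizer (part of NLTK, no separate install)
- TfidfVectorizer (part of scikit-learn, no separate install)
- BernoulliNB (part of scikit-learn, no separate install)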

You can install these dependencies using pip:
```bash
pip install pandas numpy scikit-learn matplotlib seaborn nltk wordcloud
```
