Ecosyste.ms: Awesome

An open API service indexing awesome lists of open source software.


https://github.com/cack195/twitter-sentiment-analysis


Host: GitHub
URL: https://github.com/cack195/twitter-sentiment-analysis
Owner: cack195
Created: 2024-09-09T16:36:26.000Z (5 months ago)
Default Branch: main
Last Pushed: 2024-09-09T19:50:36.000Z (5 months ago)
Last Synced: 2024-11-15T14:19:23.291Z (2 months ago)
Language: Jupyter Notebook
Size: 4.08 MB
Stars: 0
Watchers: 1
Forks: 0
Open Issues: 0
Metadata Files:
- Readme: README.md

README

# Twitter-Sentiment-Analysis
This project implements a machine learning model for sentiment analysis of Twitter data. It uses Python and NLTK to preprocess tweets, extract features, and classify each tweet as positive, negative, or neutral. It also visualizes word frequencies and sentiment distributions to help gauge public opinion on a topic.

# Libraries used
- **Python**: For scripting and data manipulation.
- **Pandas**: For data processing and CSV file operations.
- **NLTK**: For natural language processing, including stopword handling and feature extraction.
- **Scikit-Learn**: For creating training and testing splits and building the Naive Bayes classifier.
- **Matplotlib** and **Seaborn**: For data visualization, including bar plots and heatmaps.
- **WordCloud**: To generate visual representations of word frequency.
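
As a rough illustration of the stopword handling and tweet cleaning this README mentions, here is a minimal sketch. It is not the notebook's actual code: the `clean_tweet` function, the regular expressions, and the small `STOPWORDS` set are all assumptions made for this example. The hand-written set stands in for NLTK's English stopword list so that the snippet runs without downloading any corpus.

```python
import re

# Assumption: a tiny hand-written stopword set replaces NLTK's English list
# (nltk.corpus.stopwords) so this sketch needs no corpus download.
STOPWORDS = {"a", "an", "the", "is", "are", "to", "of", "and", "in", "on",
             "for", "this", "it", "at", "so", "i"}

# Hypothetical patterns for the noise typically stripped from tweets.
URL_RE = re.compile(r"https?://\S+|www\.\S+")
MENTION_RE = re.compile(r"@\w+")
HASHTAG_RE = re.compile(r"#\w+")
NON_ALPHA_RE = re.compile(r"[^a-z\s]")


def clean_tweet(text):
    """Lowercase a tweet, strip URLs, mentions, hashtags, and punctuation,
    then return the remaining tokens with stopwords removed."""
    text = text.lower()
    text = URL_RE.sub(" ", text)
    text = MENTION_RE.sub(" ", text)
    text = HASHTAG_RE.sub(" ", text)
    text = NON_ALPHA_RE.sub(" ", text)
    return [tok for tok in text.split() if tok not in STOPWORDS]


tokens = clean_tweet("@user I love this phone! https://t.co/abc #happy")
print(tokens)
```

This version drops each hashtag entirely; a variant that keeps the hashtag word and removes only the `#` symbol may preserve more sentiment signal.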

# Key Features
- **Data Preprocessing**: Cleaning tweets by removing URLs, mentions, hashtags, and stopwords.
- **Sentiment Analysis Model**: Utilizing Naive Bayes for sentiment classification.
- **Visualization**: Generating word clouds, confusion matrices, and frequency bar plots to illustrate the results.
- **Performance Metrics**: Evaluating the model with precision, recall, and F1-score to measure how well it classifies each sentiment class.
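
The classification and evaluation steps listed above could be sketched roughly as follows with scikit-learn. This is an illustration under stated assumptions, not the notebook's implementation: the toy `texts` and `labels` are invented for the example, `MultinomialNB` is one plausible Naive Bayes variant (the README does not say which one is used), and `CountVectorizer` stands in for the NLTK-based feature extraction.

```python
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.metrics import classification_report
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import MultinomialNB
from sklearn.pipeline import make_pipeline

# Hypothetical toy data; the real project works on a tweet dataset.
texts = [
    "love this phone", "great service today", "what a wonderful day",
    "really happy with it", "hate this phone", "terrible service today",
    "what an awful day", "really angry about it", "phone arrives monday",
    "service opens at nine", "the day is tuesday", "it is a phone",
]
labels = ["positive"] * 4 + ["negative"] * 4 + ["neutral"] * 4

# Hold out 25% of the data, keeping the class balance in both splits.
X_train, X_test, y_train, y_test = train_test_split(
    texts, labels, test_size=0.25, random_state=42, stratify=labels
)

# Bag-of-words counts fed into a multinomial Naive Bayes classifier
# (assumed variant; swap in the notebook's own features if they differ).
model = make_pipeline(CountVectorizer(), MultinomialNB())
model.fit(X_train, y_train)
y_pred = model.predict(X_test)

# Per-class precision, recall, and F1-score; zero_division=0 avoids
# warnings when a class receives no predictions on such a tiny test set.
report = classification_report(y_test, y_pred, zero_division=0)
print(report)
```

With only a dozen examples the scores are not meaningful; the point is the shape of the workflow, from split to report.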