https://github.com/zvdy/sentiment_analysis
Sentiment Analysis Notebook based on tweets
- Host: GitHub
- URL: https://github.com/zvdy/sentiment_analysis
- Owner: zvdy
- License: mit
- Created: 2023-01-14T15:15:38.000Z (over 3 years ago)
- Default Branch: main
- Last Pushed: 2023-01-14T15:28:39.000Z (over 3 years ago)
- Last Synced: 2025-03-30T06:33:27.315Z (about 1 year ago)
- Language: Jupyter Notebook
- Size: 15.6 KB
- Stars: 0
- Watchers: 1
- Forks: 0
- Open Issues: 0
Metadata Files:
- Readme: README.md
- License: LICENSE
README
# Sentiment Analysis on Covid-19 Tweets
This project analyzes the sentiment of tweets about Covid-19. It is written in Python and uses a dataset published by [gabrielpreda](https://github.com/gabrielpreda), which contains 14,000 tweets about Covid-19. The dataset is available on Kaggle [here](https://www.kaggle.com/gpreda/all-covid19-tweets) and on [GitHub](https://raw.githubusercontent.com/gabrielpreda/covid-19-tweets/master/covid19_tweets.csv). It contains the following columns:
* user_name: The name of the user
* user_location: The location of the user
* user_description: The description of the user
* user_created: The date the user created their account
* user_followers: The number of followers the user has
* user_friends: The number of friends the user has
* user_favourites: The number of tweets the user has liked
* user_verified: Whether the user is verified or not
* date: The date the tweet was created
* text: The text of the tweet
* hashtags: The hashtags in the tweet
* source: The app used to post the tweet
* is_retweet: Whether the tweet is a retweet or not
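
As a minimal sketch of how the column list above could be checked when loading the data, the snippet below reads a CSV with pandas and verifies that every listed column is present. The function name `load_tweets` and the one-row sample are illustrative assumptions, not code taken from the notebook.

```python
import io

import pandas as pd

# Column names as listed in this README.
EXPECTED_COLUMNS = [
    "user_name", "user_location", "user_description", "user_created",
    "user_followers", "user_friends", "user_favourites", "user_verified",
    "date", "text", "hashtags", "source", "is_retweet",
]


def load_tweets(source):
    """Read the tweets CSV (path, URL, or file-like object) and check its columns.

    Hypothetical helper for illustration; the notebook may load the data differently.
    """
    df = pd.read_csv(source)
    missing = [col for col in EXPECTED_COLUMNS if col not in df.columns]
    if missing:
        raise ValueError(f"missing expected columns: {missing}")
    return df


# Tiny in-memory sample (made-up row) so the sketch runs without downloading anything.
sample_csv = io.StringIO(
    ",".join(EXPECTED_COLUMNS) + "\n"
    "alice,Madrid,nurse,2015-01-01 10:00:00,10,5,3,False,"
    "2020-07-25 12:00:00,Stay safe #covid19,['covid19'],Twitter Web App,False\n"
)
sample = load_tweets(sample_csv)
print(sample.shape)
```

Passing a local copy of the Kaggle file or the raw GitHub URL listed above to `load_tweets` should behave the same way, provided the file is reachable.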
## Libraries Used
* Pandas
* NumPy
* Matplotlib
* Seaborn
* Plotly
* string (Python standard library)
* collections (Python standard library)
* NLTK
* scikit-learn
* re (Python standard library)
## Data Cleaning
* Removing rows with missing values
* Removing duplicate rows
* Removing retweets
* Removing tweets that are not in English
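
The four cleaning steps above can be sketched with pandas roughly as follows. This is an illustration under stated assumptions, not the notebook's actual code: `clean_tweets` and `looks_english` are hypothetical names, `is_retweet` is assumed to be a boolean column, and the English check is a crude ASCII-letter heuristic. The README does not say how language was detected, and this heuristic cannot tell English apart from other Latin-script languages.

```python
import pandas as pd


def looks_english(text, threshold=0.9):
    """Crude stand-in for language detection (an assumption, not the notebook's method).

    Treats a tweet as English when at least `threshold` of its letters are ASCII.
    """
    letters = [ch for ch in text if ch.isalpha()]
    if not letters:
        return False
    ascii_letters = sum(ch.isascii() for ch in letters)
    return ascii_letters / len(letters) >= threshold


def clean_tweets(df):
    """Apply the four cleaning steps in order and return a new DataFrame."""
    df = df.dropna()                                  # rows with missing values
    df = df.drop_duplicates()                         # fully duplicated rows
    df = df[~df["is_retweet"].astype(bool)]           # retweets (assumes a boolean column)
    df = df[df["text"].map(looks_english).astype(bool)]  # non-English tweets (heuristic)
    return df.reset_index(drop=True)


# Made-up rows covering each case: a clean tweet, its duplicate,
# a missing text, a retweet, and a non-English tweet.
raw = pd.DataFrame(
    {
        "user_name": ["a", "a", "b", "c", "d"],
        "text": [
            "Wear a mask and stay safe",
            "Wear a mask and stay safe",
            None,
            "Stay home, everyone",
            "マスクを着用してください",
        ],
        "is_retweet": [False, False, False, True, False],
    }
)
cleaned = clean_tweets(raw)
print(cleaned)
```

Note that `dropna()` with no arguments drops a row if any column is empty; on a dataset where optional fields such as `user_location` are often blank, restricting the check with `subset=[...]` may be preferable.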
## LICENSE
[MIT License](LICENSE)