Social media monitoring for Twitter, written in Python
- Host: GitHub
- URL: https://github.com/nemat-al/social-media-monitoring
- Owner: nemat-al
- Created: 2024-03-18T20:00:42.000Z (over 1 year ago)
- Default Branch: main
- Last Pushed: 2024-04-19T17:10:20.000Z (over 1 year ago)
- Last Synced: 2025-01-23T08:44:33.235Z (9 months ago)
- Topics: d2l, jupyter-notebook, natural-language-processing, nlp, python, rnn, scrapping, sentiment-analysis, twitter
- Language: Jupyter Notebook
- Homepage:
- Size: 8.47 MB
- Stars: 0
- Watchers: 1
- Forks: 0
- Open Issues: 0
Metadata Files:
- Readme: README.md
README
# Social-Media-Monitoring
Social media monitoring for Twitter, written in Python.
---
## Index
1. [About the project](#about-the-project)
2. [Requirements](#requirements)
3. [Social Media Monitoring Steps](#social-media-monitoring-steps)
4. [Data Processing](#data-processing)
5. [Sentiment Analysis with RNN](#sentiment-analysis-with-rnn)
---
## About the project
The main goal of this project is to scrape Twitter data for a specific brand or product and classify the tweets by sentiment as positive or negative.
---
## Requirements
Here is a list of the main libraries used:
1. d2l: for building the sentiment analysis model.
2. demoji: for replacing emojis in tweets with their textual descriptions.
3. snscrape: for scraping tweets.
4. NLTK: for natural language processing.
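As a small illustration of the emoji step, here is a minimal sketch using demoji's replace_with_desc helper; it is not the exact code from the notebook:

```python
import demoji

# Replace each emoji in a tweet with its textual description,
# e.g. the fire emoji becomes ":fire:" with the default separator.
tweet = "I love this phone 🔥🔥"
print(demoji.replace_with_desc(tweet))
```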
---
## Social Media Monitoring Steps
The following figure shows the main steps of the project:
first, we select a string to search for, then we scrape Twitter for posts containing that string.
After that, the data is cleaned, analyzed, and presented in a way that shows which sentiments the tweets express.

---
## Data Processing
After scraping the data from Twitter, the following preprocessing pipeline is applied (a code sketch is given after the figure below):
1. Cleaning signs:
- Replacing emojis with their descriptions.
- Data collected from the web, and especially from social media platforms, contains links and symbols such as (#) and (@); we remove those symbols.
- It is important to delete the links, since no sentiment can be extracted from the link text.
- Deleting punctuation marks and any non-ASCII characters.
2. NLP cleaning:
- We keep only English texts.
- Removing all occurrences of the search term itself.
- Names are kept as-is, while other tokens are transformed into their lemmas.
The Data processing pipeline is shown in the following figure:

---
## Sentiment Analysis with RNN
The idea is to represent each token using pretrained GloVe embeddings and feed these token representations into a multilayer bidirectional RNN to obtain a representation of the whole text sequence, which is then transformed into the sentiment analysis output.
The sentiment analysis model is inspired by: https://d2l.ai/chapter_natural-language-processing-applications/sentiment-analysis-rnn.html
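For reference, a minimal PyTorch sketch of a bidirectional RNN classifier in the spirit of that d2l chapter is given below; the hyperparameters, the vocabulary size, and the two-class output are assumptions, and the GloVe vectors would be loaded separately (e.g. via d2l's TokenEmbedding utility):

```python
import torch
from torch import nn

class BiRNNSentiment(nn.Module):
    """Bidirectional LSTM sentiment classifier, loosely following the d2l recipe."""

    def __init__(self, vocab_size, embed_size=100, num_hiddens=100, num_layers=2):
        super().__init__()
        # Embedding layer; in the project its weights would be initialized
        # from pretrained GloVe vectors (and typically frozen).
        self.embedding = nn.Embedding(vocab_size, embed_size)
        self.encoder = nn.LSTM(embed_size, num_hiddens, num_layers=num_layers,
                               bidirectional=True)
        # Concatenation of the first and last time-step outputs -> 2 classes.
        self.decoder = nn.Linear(4 * num_hiddens, 2)

    def forward(self, inputs):
        # inputs: (batch, seq_len) token indices
        embeddings = self.embedding(inputs.T)      # (seq_len, batch, embed_size)
        outputs, _ = self.encoder(embeddings)      # (seq_len, batch, 2 * num_hiddens)
        encoding = torch.cat((outputs[0], outputs[-1]), dim=1)
        return self.decoder(encoding)

model = BiRNNSentiment(vocab_size=50_000)            # vocabulary size is a placeholder
logits = model(torch.randint(0, 50_000, (8, 60)))    # batch of 8 sequences of length 60
print(logits.shape)                                  # torch.Size([8, 2])
```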