{"id":15103622,"url":"https://github.com/atharvapathak/twitter_sentiment_analysis_project","last_synced_at":"2026-01-28T12:33:50.914Z","repository":{"id":232404720,"uuid":"784270818","full_name":"atharvapathak/Twitter_Sentiment_Analysis_Project","owner":"atharvapathak","description":" Twitter sentiment analysis is the process of analyzing tweets posted on the Twitter platform to determine the overall sentiment expressed within them. It involves using natural language processing (NLP) and machine learning techniques to classify tweets.","archived":false,"fork":false,"pushed_at":"2024-04-09T14:34:35.000Z","size":21847,"stargazers_count":0,"open_issues_count":0,"forks_count":0,"subscribers_count":1,"default_branch":"main","last_synced_at":"2025-04-05T12:09:40.292Z","etag":null,"topics":["api","bag-of-words","bert","cnn","data","gbm","nltk","rnn","spacy","twitter"],"latest_commit_sha":null,"homepage":"","language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/atharvapathak.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2024-04-09T14:20:14.000Z","updated_at":"2024-04-09T14:38:31.000Z","dependencies_parsed_at":"2024-05-01T00:34:43.690Z","dependency_job_id":null,"html_url":"https://github.com/atharvapathak/Twitter_Sentiment_Analysis_Project","commit_stats":{"total_commits":6,"total_committers":1,"mean_commits":6.0,"dds":0.0,"last_synced_commit":"931400b845b8e1bb490877ec31983f146973a79b"},"previous_names":["atharvapathak/twitter_sentiment_analysis_project"],"tags_count":0,"template":false,"template_full_name":null,"repository_u
rl":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/atharvapathak%2FTwitter_Sentiment_Analysis_Project","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/atharvapathak%2FTwitter_Sentiment_Analysis_Project/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/atharvapathak%2FTwitter_Sentiment_Analysis_Project/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/atharvapathak%2FTwitter_Sentiment_Analysis_Project/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/atharvapathak","download_url":"https://codeload.github.com/atharvapathak/Twitter_Sentiment_Analysis_Project/tar.gz/refs/heads/main","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":247332611,"owners_count":20921853,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["api","bag-of-words","bert","cnn","data","gbm","nltk","rnn","spacy","twitter"],"created_at":"2024-09-25T19:40:54.022Z","updated_at":"2026-01-28T12:33:49.162Z","avatar_url":"https://github.com/atharvapathak.png","language":"Python","funding_links":[],"categories":[],"sub_categories":[],"readme":"\u003cp align = \"center\"\u003e\n\u003cimg src=\"https://github.com/Arsh2k01/UTrack/blob/main/UTrack.jpg\" width=\"650\" height=\"620\"\u003e \n \u003c/p\u003e \n\u003cbr /\u003e\n\n\n## 1. Technologies Used\n\n1. Tweepy API\n2. NLTK\n3. BERT Model\n4. Tensorflow\n6. Seaborn\n5. Streamlit\n\n## 2. 
Project Description\n### 2.1 Data Extraction and Preprocessing\nWe scraped data for each illness using the Tweepy API, based on keywords and phrases for each category.\nAdditionally, we scraped tweets that didn't contain these keywords. This data acted as the ‘neutral’ data.\nThe data was cleaned using libraries such as regex and NLTK. Links, emojis, emoticons, and symbols were removed. \n\n### 2.2 DL Model\nWe explored Transformer models and found that BERT (Bidirectional Encoder Representations from Transformers) was better suited for sentiment analysis. We used a pretrained BERT model and fine-tuned it on our training data. We trained a model for each class. \u003cbr /\u003e\nThe output given by the final layer was not fed to any activation function; it was instead given as input to a custom function to normalize and standardize the data. The function is given below: \u003cbr /\u003e\n\u003cbr /\u003e\n\u003cp align = \"center\"\u003e\n\u003cimg src=\"https://github.com/Arsh2k01/UTrack/blob/main/function.jpeg\" width=\"600\" height=\"150\"\u003e \n \u003c/p\u003e \n \u003cbr /\u003e\n\n### 2.3 Visualisation and Deployment\nWe used Seaborn to display the calculated levels of Loneliness, Stress, and Anxiety for each user across time, thus enabling us to see how the user's mental state varied over time. Moreover, we estimated the weighted average for each category over previous tweets **`[0:LOW,1:HIGH]`**.\nAdditionally, you can view each specific tweet and its scores.\nDeployment was done using Streamlit. \n\n## 3. Files\n* **`Cleaning Tweets.py`** - Script to clean scraped tweets\n* **`Extracting Targeted Tweets.py`** - Script to scrape a user's Twitter information\n* **`Streamlit Deployment.py`** - Script to deploy the project\n* **`Streamlit Deployment.ipynb`** - Jupyter Notebook to deploy the project\n* **Extracted Tweets** - Training Data\n* **Training Models:**\n   * **`Anxiety Model.py`**\n   * **`Lonely Model.py`**\n   * **`Stress Model.py`**\n\n\n## 4. 
References\n* [Bidirectional Encoder Representations from Transformers (BERT): A sentiment analysis odyssey](https://arxiv.org/abs/2007.01127)\n* [Studying expressions of loneliness in individuals using twitter: an observational study](https://bmjopen.bmj.com/content/bmjopen/9/11/e030355.full.pdf)\n* [Understanding and Measuring Psychological Stress Using Social Media](https://static1.squarespace.com/static/53d29678e4b04e06965e9423/t/5ea0bea583b33b7bb006e140/1587592872890/2019UnderstandingStress.pdf)\n\n## 5. License\n[MIT](https://choosealicense.com/licenses/mit/)\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fatharvapathak%2Ftwitter_sentiment_analysis_project","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fatharvapathak%2Ftwitter_sentiment_analysis_project","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fatharvapathak%2Ftwitter_sentiment_analysis_project/lists"}