{"id":18322795,"url":"https://github.com/justsecret123/twitter-sentiment-analysis","last_synced_at":"2026-01-03T18:49:10.121Z","repository":{"id":174787400,"uuid":"408943872","full_name":"Justsecret123/Twitter-sentiment-analysis","owner":"Justsecret123","description":"A sentiment analysis model trained with Kaggle GPU on 1.6M examples, used to make inferences on 220k tweets about Messi and draw insights from their results. ","archived":false,"fork":false,"pushed_at":"2022-05-22T10:21:17.000Z","size":25416,"stargazers_count":2,"open_issues_count":0,"forks_count":2,"subscribers_count":1,"default_branch":"main","last_synced_at":"2025-03-21T13:44:16.147Z","etag":null,"topics":["classification","data-analysis","data-science","deep-learning","deep-neural-networks","docker","glove-embeddings","kaggle","lstm","lstm-neural-networks","machine-learning","natural-language-processing","nlp","python","rnn","scikit-learn","sentiment-analysis","sentiment-classification","tensorflow","word-embeddings"],"latest_commit_sha":null,"homepage":"","language":"Jupyter 
Notebook","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"gpl-3.0","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/Justsecret123.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":"CODE_OF_CONDUCT.md","threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2021-09-21T19:08:46.000Z","updated_at":"2023-10-27T15:09:29.000Z","dependencies_parsed_at":"2023-08-31T03:15:14.572Z","dependency_job_id":null,"html_url":"https://github.com/Justsecret123/Twitter-sentiment-analysis","commit_stats":null,"previous_names":["justsecret123/twitter-sentiment-analysis"],"tags_count":5,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/Justsecret123%2FTwitter-sentiment-analysis","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/Justsecret123%2FTwitter-sentiment-analysis/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/Justsecret123%2FTwitter-sentiment-analysis/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/Justsecret123%2FTwitter-sentiment-analysis/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/Justsecret123","download_url":"https://codeload.github.com/Justsecret123/Twitter-sentiment-analysis/tar.gz/refs/heads/main","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":247415783,"owners_count":20935383,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/
hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["classification","data-analysis","data-science","deep-learning","deep-neural-networks","docker","glove-embeddings","kaggle","lstm","lstm-neural-networks","machine-learning","natural-language-processing","nlp","python","rnn","scikit-learn","sentiment-analysis","sentiment-classification","tensorflow","word-embeddings"],"created_at":"2024-11-05T18:26:00.115Z","updated_at":"2026-01-03T18:49:10.113Z","avatar_url":"https://github.com/Justsecret123.png","language":"Jupyter Notebook","funding_links":[],"categories":[],"sub_categories":[],"readme":"# Twitter-sentiment-analysis ![Language_support](https://img.shields.io/pypi/pyversions/Tensorflow) ![Last_commit](https://img.shields.io/github/last-commit/JustSecret123/Human-pose-estimation) ![Workflow](https://img.shields.io/github/workflow/status/JustSecret123/Human-pose-estimation/Pylint/main) ![Tensorflow_version](https://img.shields.io/badge/Tensorflow%20version-2.6.2-orange)\n\nA sentiment analysis model trained on a Kaggle GPU with the Sentiment140 dataset (1.6 million tweets).
\n\n\u003e **Deployed on my personal Docker Hub repository:** [*Click here*](https://hub.docker.com/repository/docker/ibrahimserouis/my-tensorflow-models)\n\n\u003e **Kaggle Notebook link:** [Kaggle notebook](https://www.kaggle.com/ibrahimserouis99/twitter-sentiment-analysis)\n\n\u003ca href=\"https://www.linkedin.com/in/ibrahim-serouis-b05378181/\"\u003e\n  \u003cimg src=\"https://img.shields.io/badge/LinkedIn-Ibrahim%20Serouis-blue?link=http://left\u0026link=http://right)\"/\u003e\n\u003c/a\u003e\n\n# Dataset (Sentiment140+GloVe)\n\n- Train/test split : 90% / 10% \n- Size : 1.6M samples \n- Link : [Dataset](https://www.kaggle.com/ibrahimserouis99/twitter-sentiment-analysis-and-word-embeddings)\n\n\n# Model\n\n- Model type : Sequential, RNN, Binary classification\n- Optimizer : Adam\n- Loss function : Binary cross entropy \n- Output : Sentiment score in [0, 1]\n- Threshold (fine-tuned) : score \u003e= 0.625 ---\u003e \"Positive\", score \u003c 0.625 ---\u003e \"Negative\"\n- Best validation accuracy : 83%\n- F1-score : 0.8340\n- Version : 4\n\n\n| Metric | Score |\n|--------|-------|\n| Precision | **Negative:** 0.84; **Positive:** 0.82 |\n| Recall | **Negative:** 0.82; **Positive:** 0.84 |\n| F1-score | **Negative:** 0.83; **Positive:** 0.83 |\n\n\n# Training \n\n- Training epochs : 50 planned; training stopped after 22 epochs with early stopping (patience = 10)\n- Training environment : Kaggle GPU\n\n\n## Architecture\n\n![Model_architecture](Screenshots/Model%20architecture.png)\n\n# Inferences (with Tensorflow Serving REST API)\n\n![Inference example](Screenshots/Inference%20example.PNG)\n\n# Some results using Power BI + Python\n\n## Positive tweets\n\n![Positives](Results/positive_messi.gif)\n\n## Negative tweets \n\n![Negatives](Results/negative_messi.gif)\n\n## Data by country (when available)\n\n![Country](Results/country_messi.gif)\n\n# Useful scripts and notebooks\n\n## Notebooks \n\n\u003e [Training notebook](Notebook/twitter-sentiment-analysis.ipynb)\n\n\u003e [How inferences were 
made on our dataset](Notebook/custom-nlp-classifier-on-football-tweets.ipynb)\n\n\u003e [Data cleaning notebook](Notebook/data-cleaning-messi-and-ronaldo-tweets.ipynb)\n\n\u003e [Data exploration notebook](Notebook/explore-tweets-about-messi-and-ronaldo.ipynb)\n\n## Scripts\n\n\u003e [Link to the Tensorflow Serving script](Scripts/test_the_model.py)\n\n\u003e There is also a command-line script that converts .h5 models to the TF SavedModel format: [here](Scripts/h5_to_savedmodel.py)\n\u003e ![Args](Screenshots/clr_args.PNG)\n\n# Data collection (tweets about Messi and Ronaldo)\n\n- Collected using the Twitter API \n- Scripts for searching and saving 100*n tweets containing a keyword : [Tweets about Messi](Scripts/search_n_times_100_messi_tweets.py) \u0026 [Tweets about Ronaldo](Scripts/search_n_times_100_ronaldo_tweets.py)\n\n\u003e **NOTE: Running these scripts requires a Twitter developer account and a bearer_token, either stored in a text file whose path is set manually in the code or exported as an environment variable.**\n\n# Libraries\n\n- **Deep Learning Framework :** Tensorflow 2.6 or higher \n- **Data visualization :** Pandas, Seaborn, Matplotlib\n- **Regular expressions :** re \n- **NLP library :** NLTK\n- **Train/test splitting, classification_report :** Scikit-learn\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fjustsecret123%2Ftwitter-sentiment-analysis","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fjustsecret123%2Ftwitter-sentiment-analysis","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fjustsecret123%2Ftwitter-sentiment-analysis/lists"}