{"id":21917752,"url":"https://github.com/dylanjcastillo/twitter-sentiment-tracker","last_synced_at":"2025-04-19T10:55:02.317Z","repository":{"id":37636224,"uuid":"279977162","full_name":"dylanjcastillo/twitter-sentiment-tracker","owner":"dylanjcastillo","description":"Dash app for classifying tweets in real-time","archived":false,"fork":false,"pushed_at":"2023-06-12T21:28:54.000Z","size":1764,"stargazers_count":66,"open_issues_count":2,"forks_count":12,"subscribers_count":3,"default_branch":"master","last_synced_at":"2025-03-29T07:11:29.146Z","etag":null,"topics":["dash","nlp","python3","pytorch"],"latest_commit_sha":null,"homepage":"","language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/dylanjcastillo.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2020-07-15T20:58:06.000Z","updated_at":"2025-01-19T09:12:44.000Z","dependencies_parsed_at":"2024-11-28T19:41:47.386Z","dependency_job_id":"2dd356c1-30ac-4f78-bd86-d6a11620edca","html_url":"https://github.com/dylanjcastillo/twitter-sentiment-tracker","commit_stats":null,"previous_names":[],"tags_count":0,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/dylanjcastillo%2Ftwitter-sentiment-tracker","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/dylanjcastillo%2Ftwitter-sentiment-tracker/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/dylanjcastillo%2Ftwitter-sentiment-tracker/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/re
positories/dylanjcastillo%2Ftwitter-sentiment-tracker/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/dylanjcastillo","download_url":"https://codeload.github.com/dylanjcastillo/twitter-sentiment-tracker/tar.gz/refs/heads/master","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":249678401,"owners_count":21309728,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["dash","nlp","python3","pytorch"],"created_at":"2024-11-28T19:41:39.686Z","updated_at":"2025-04-19T10:55:02.285Z","avatar_url":"https://github.com/dylanjcastillo.png","language":"Python","funding_links":[],"categories":[],"sub_categories":[],"readme":"# Twitter Sentiment Tracker\n\n![Python](https://img.shields.io/badge/Python-v3.8.3-brightgreen) ![License](https://img.shields.io/badge/license-MIT-blue) ![contributions welcome](https://img.shields.io/badge/contributions-welcome-brightgreen.svg?style=flat) [![Twitter](https://img.shields.io/twitter/url/https/twitter.com/_dylancastillo.svg?style=social\u0026label=Follow%20%40_dylancastillo)](https://twitter.com/_dylancastillo)\n\nThis repository contains the source code and resources for building a web app that tracks the sentiment on twitter towards a set of pre-specified accounts. 
Essentially, it is a leaner version of [polituits.com](http://polituits.com).\n\nThe end-product looks as follows:\n\n\u003cimg src=\"tweets_scorer.gif\" alt=\"architecture\" width=\"600\"\u003e\n\n# How It Works\n\nThe application provides you with (close to) real-time **tracking of the sentiment on Twitter towards a set of accounts**. For predicting the sentiment of comments it uses a combination of fixed rules and a classifier built using the language model [BERT](\u003chttps://en.wikipedia.org/wiki/BERT_(language_model)\u003e).\n\nTo make it clearer, the app doesn't track the sentiment of the comments made by the accounts you defined. **It tracks the sentiment of the responses and mentions received by those accounts.**\n\nHere's the architecture of the app:\n\n\u003cimg src=\"architecture.png\" alt=\"architecture\" width=\"600\"\u003e\n\nIt comprises the following elements:\n\n- **NGINX** as a reverse proxy server\n- **Gunicorn** as a WSGI server\n- A **Dash** application for visualizing results\n- An **SQLite3** database to store processed tweets\n- Two additional **services** for getting, processing, and assigning sentiment to tweets\n\n# How to Add Accounts to Track\n\nIf you want to build your own app to track sentiment toward a specific set of accounts, there are three things you need to do: set the required environment variables, define which accounts you want to track, and configure your sentiment classifier model.\n\n## Set Environment Variables\n\nStart by setting up a [Twitter Developer account](https://developer.twitter.com/en), creating an App, and generating consumer keys for your App. In case you're wondering, this is entirely free. You just need to answer some questions.\n\nThen, create an `.env` file in the `fetcher/` directory. 
It should contain these variables:\n\n```\nTWITTER_KEY=COPY_YOUR_API_KEY_HERE\nTWITTER_SECRET=COPY_YOUR_API_SECRET_KEY_HERE\nSENTIMENT_APP_HOST=sentiment_app\nFETCH_INTERVAL=30\nLANGUAGE=es\n```\n\nYou can get the values for `TWITTER_KEY` and `TWITTER_SECRET` from your App's details in your developer account:\n\n\u003cimg src=\"twitter_api_keys.png\" alt=\"architecture\" width=\"600\"\u003e\n\nFor `SENTIMENT_APP_HOST` use `sentiment_app` if you are testing or deploying the app using _docker-compose_. If you are running tests in a Python virtual environment on your _local machine_, put `localhost` there.\n\n`FETCH_INTERVAL` defines how frequently, in seconds, you make requests to the Twitter API to get the latest tweets. Make sure to read the [rate limits](https://developer.twitter.com/en/docs/tweets/search/api-reference/get-search-tweets) you should respect. The general recommendation is not to have many accounts and not to update too frequently.\n\n## Define Accounts to Track\n\nTo define which accounts you want to track you need to update the `data/accounts.csv` file. This will feed a query to the Twitter API that gets the mentions and responses that those accounts get. There are some _smart filters_ to avoid getting mentions or responses that are not relevant.\n\nThe `accounts.csv` file has the following fields:\n\n- **id:** Identifier of the account (can be anything, just needs to be unique)\n- **account:** Twitter handle you are interested in tracking\n- **name:** Name that is displayed in the summary cards\n- **image:** Image displayed in the summary cards and the latest tweets section\n- **color:** Color associated with that account; it is shown at the top of the summary card\n- **party:** Political party associated with the account you want to track. Leave it empty if it isn't relevant.\n\n## Bring Your Own Model\n\nYou'll probably want to use a different model than the one I used. It shouldn't be that hard to add one. 
You need to provide the following things:\n\n- A vocabulary file (`vocab.txt`) for the Tokenizer\n- A pre-trained BERT model from [Hugging Face's repository of models](https://huggingface.co/models)\n- The model's learned parameters to load using `load_state_dict()`\n- An updated `emojis_dict.csv` file in the `data/` directory, if you are planning on keeping that in the tweet processing pipeline\n\nFor training the model, I suggest the following [repository](https://github.com/abhishekkrthakur/bert-sentiment) and [tutorial](https://www.youtube.com/watch?v=hinZO--TEk4) by Abhishek Thakur.\n\nIf you plan on building a dataset for training your model, then use the same pre-processing steps as in the `process_text()` function in `fetch_tweets.py`. Adjust them if necessary.\n\nSave the `vocab.txt` and the model's learned parameters files in the `sentiment_app/input/` directory. Then, update the `config.py` file in `sentiment_app/`:\n\n```\nimport transformers\n\nMAX_LEN = 256\nPREDICT_BATCH_SIZE = 32\nNUM_WORKERS = 4\nMODEL_PATH = \"./input/model.bin\" # Replace with your file with the learned parameters\nBERT_MODEL = \"dccuchile/bert-base-spanish-wwm-uncased\" # Replace with a pre-trained BERT from Hugging Face's models\nTOKENIZER = transformers.BertTokenizerFast.from_pretrained(\n    \"./input/\", do_lower_case=True, truncation=True\n)\n```\n\nRemember to replace the `emojis_dict.csv` in the `data/` directory with the version you are planning to use.\n\nYou can use the versions of the [vocabulary](https://drive.google.com/file/d/1soU3JKDnmAeJdBEP-JGqb1DCM-s0roqW/view?usp=sharing) and the [learned parameters](https://drive.google.com/file/d/1b9U903Sky6Rl81X0reIgnBrmJYeJQvDV/view?usp=sharing) I used. Save them as `vocab.txt` and `model.bin` in `sentiment_app/input/`. 
Keep the `config.py` file as is.\n\n# How to Deploy\n\nFirst, make sure you've set everything specified in the [previous section](#how-to-add-accounts-to-track).\n\nIn addition, there are a few things you need to have in place on your VPS:\n\n- [Python 3.8](https://gist.github.com/plembo/6bc141a150cff0369574ce0b0a92f5e7#file-deadsnakes-python38-ubuntu-bionic-md)\n- [Docker](https://www.digitalocean.com/community/tutorials/how-to-install-and-use-docker-on-ubuntu-18-04) and [docker-compose](https://www.digitalocean.com/community/tutorials/how-to-install-docker-compose-on-ubuntu-18-04)\n\nLog in to your VPS, and install Python 3.8, Docker, and docker-compose. Then, continue as follows:\n\n1. Clone the repository: `git clone https://github.com/dylanjcastillo/twitter-sentiment-tracker.git`\n\n2. Open a terminal at the root directory of your project and create the Tweets database as follows:\n\n   ```shell\n   $ cd utils\n   $ python3.8 create_database.py\n   ```\n\n3. Create an `.env` file with the [required variables](#set-environment-variables)\n\n4. Update the `accounts.csv` file with the [accounts you want to track](#define-accounts-to-track).\n\n5. Set up your [model](#bring-your-own-model)\n\n6. Add the server name to the NGINX configuration file in `nginx/project.conf`\n\n   ```conf\n\n   server {\n\n       listen 80;\n       server_name REPLACE_BY_DOMAIN_OR_IP;\n\n       charset utf-8;\n\n       location / {\n           proxy_pass http://dash_app:8050;\n\n           # Do not change this\n           proxy_set_header Host $host;\n           proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;\n           proxy_set_header X-Forwarded-Proto $scheme;\n           proxy_set_header X-Real-IP $remote_addr;\n       }\n\n   }\n\n   ```\n\n7. Run `sh run-docker.sh` in the root directory of your project\n\n8. 
(Optional) You can add a [cronjob](https://medium.com/@gavinwiener/how-to-schedule-a-python-script-cron-job-dea6cbf69f4e) that executes `clean_database.py` in the `utils/` folder every day. This script removes old data from the database.\n\n9. That's all! It's ALIVE!\n\nYou can run the `stop-docker.sh` script to stop the Docker containers.\n\n# Limitations\n\n- The model I trained is not great and will only work for tweets in Spanish. If you want high-quality results, make sure to dedicate some time to building your model.\n- If you choose to track a very popular account, the application might not be able to do it at a proper pace. I do not paginate results from the API, so the fetcher will only get the last 100 results from whatever interval you decide to use.\n- The fetcher tries to get tweets in the interval you defined. However, if it takes too much time to process the tweets, there is no guarantee that it will keep that frequency.\n- If you are planning on having an SSL certificate, then you'll need to make some changes to the NGINX service.\n- For local development, I was using Python virtual environments. I only used Docker for deploying the application.\n- As usual, parts of the code come from tutorials, Stack Overflow questions, and blog posts. I did not keep track of these, but if you feel any attribution is due, shoot me a message.\n- Finally, this was just an experimental project I did for fun. I just added a couple of tests for the fetcher. So expect bugs.\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fdylanjcastillo%2Ftwitter-sentiment-tracker","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fdylanjcastillo%2Ftwitter-sentiment-tracker","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fdylanjcastillo%2Ftwitter-sentiment-tracker/lists"}