{"id":20535584,"url":"https://github.com/coskundeniz/twitter-data-extractor","last_synced_at":"2025-03-06T03:25:49.921Z","repository":{"id":171274053,"uuid":"484058670","full_name":"coskundeniz/twitter-data-extractor","owner":"coskundeniz","description":"Twitter Data Extractor","archived":false,"fork":false,"pushed_at":"2023-06-06T09:15:22.000Z","size":18409,"stargazers_count":4,"open_issues_count":0,"forks_count":0,"subscribers_count":1,"default_branch":"main","last_synced_at":"2025-02-28T19:17:45.505Z","etag":null,"topics":["data-extraction","excel","google-sheets","mongodb","python","sqlite","tweepy","tweets-extraction","twitter","twitter-api"],"latest_commit_sha":null,"homepage":"","language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"apache-2.0","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/coskundeniz.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":".github/FUNDING.yml","license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null},"funding":{"github":null,"patreon":"pythondoctor","open_collective":null,"ko_fi":"coskundeniz","tidelift":null,"community_bridge":null,"liberapay":"coskundeniz","issuehunt":null,"otechie":null,"custom":["https://www.buymeacoffee.com/coskundeniz"]}},"created_at":"2022-04-21T13:17:30.000Z","updated_at":"2025-02-22T23:48:53.000Z","dependencies_parsed_at":null,"dependency_job_id":"53e2f67c-0f13-4234-aa96-91ce4c9ae80a","html_url":"https://github.com/coskundeniz/twitter-data-extractor","commit_stats":null,"previous_names":["coskundeniz/twitter-data-extractor"],"tags_count":0,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/coskundeniz%2Ftwitter-data-ex
tractor","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/coskundeniz%2Ftwitter-data-extractor/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/coskundeniz%2Ftwitter-data-extractor/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/coskundeniz%2Ftwitter-data-extractor/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/coskundeniz","download_url":"https://codeload.github.com/coskundeniz/twitter-data-extractor/tar.gz/refs/heads/main","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":242140605,"owners_count":20078363,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["data-extraction","excel","google-sheets","mongodb","python","sqlite","tweepy","tweets-extraction","twitter","twitter-api"],"created_at":"2024-11-16T00:32:48.968Z","updated_at":"2025-03-06T03:25:49.895Z","avatar_url":"https://github.com/coskundeniz.png","language":"Python","funding_links":["https://patreon.com/pythondoctor","https://ko-fi.com/coskundeniz","https://liberapay.com/coskundeniz","https://www.buymeacoffee.com/coskundeniz"],"categories":[],"sub_categories":[],"readme":"Twitter Data Extractor\n======================\n\nThis command-line tool extracts user and tweet data from Twitter and reports the results to CSV, Excel, Google Sheets documents or MongoDB, SQLite databases.\n\n[Related post on Medium](https://medium.com/@codenineeight/designing-a-twitter-data-extractor-tool-using-python-part-1-intro-50cd1c6fcb2e)\n\n### Supported Features\n\n* Extract 
single/multiple user data.\n* Extract user’s friends/followers data.\n* Extract tweets data for a user.\n* Extract tweets data for a search keyword.\n* Report results to CSV, Excel or Google Sheets documents.\n* Report results to MongoDB or SQLite databases.\n\n**Fields to extract for user data**\n* User ID\n* Username\n* Name\n* Account creation date\n* Bio\n* URLs, Hashtags, Mentions\n* Location\n* Pinned Tweet ID\n* Pinned Tweet\n* Profile image URL\n* Account protected flag\n* Public metrics (followers/following/tweet/listed counts)\n* External URL\n* Verified flag\n\n**Fields to extract for tweet data**\n* Tweet ID\n* Tweet text\n* Tweet creation date\n* Source\n* Language\n* Public metrics (retweet/reply/like/quote count)\n* URLs, Hashtags, Mentions\n* Media (key, type, url, duration_ms(for video), width, height, public_metrics)\n* Place (ID, full name, country, country code, place type, geo coordinates)\n* Author data (for search tweets)\n\nYou can see the user manual [here](https://github.com/coskundeniz/twitter-data-extractor/blob/main/docs/user_manual.pdf).\n\n## How to setup\n\n* Run the following commands to install required packages in the project directory.\n\n    * `python -m venv env`\n    * `source env/bin/activate`\n    * `python -m pip install -r requirements.txt`\n\n\n### Setting Environment Variables\n\nFor using the Twitter API service, set the `TWITTER_BEARER_TOKEN_CODE` environment variable with your bearer token value. 
Set the `TWITTER_CONSUMER_KEY_CODE` and `TWITTER_CONSUMER_SECRET_CODE` environment variables for your consumer key and consumer secret tokens to use the tool on behalf of another user account.\n\nYou can see the instructions to set environment variables [here for Linux](https://phoenixnap.com/kb/linux-set-environment-variable), [here for Windows](https://phoenixnap.com/kb/windows-set-environment-variable), and [here for Mac](https://phoenixnap.com/kb/set-environment-variable-mac).\n\n### MongoDB Installation\n\nIf you will use MongoDB to save users/tweets data, install it from [here](https://docs.mongodb.com/manual/administration/install-community/).\n\nYou can check the running status after installation and start the database server with the following commands on Linux.\n\n* `sudo service mongod status`\n* `sudo service mongod start`\n\n\n## How to use\n\n```sh\nusage: python twitter_data_extractor.py [-h] [-c] [-cf CONFIGFILE] [--forme] [-u USER] [-ul USERS] [-fr] [-fl] [-ut] [-s SEARCH]\n                                        [-tc TWEET_COUNT] [-e EXCLUDES] [-ot OUTPUT_TYPE] [-of OUTPUT_FILE] [-sm SHARE_MAIL]\n\noptional arguments:\n  -h, --help                                  show this help message and exit\n  -c, --useconfig                             Read configuration from config.json file\n  -cf CONFIGFILE, --configfile CONFIGFILE     Read configuration from given file\n  --forme                                     Determine API user(account owner or on behalf of a user)\n  -u USER, --user USER                        Extract user data for the given username\n  -ul USERS, --users USERS                    Extract user data for the given comma separated usernames\n  -fr, --friends                              Extract friends data for the given username\n  -fl, --followers                            Extract followers data for the given username\n  -ut, --user_tweets                          Extract tweets of user with the given username\n  -s SEARCH, 
--search SEARCH                  Extract latest tweets for the given search keyword\n  -tc TWEET_COUNT, --tweet_count TWEET_COUNT  Limit the number of tweets gathered\n  -e EXCLUDES, --excludes EXCLUDES            Fields to exclude from tweets queried as comma separated values (replies,retweets)\n  -ot OUTPUT_TYPE, --output_type OUTPUT_TYPE  Output file type (csv, xlsx, gsheets, mongodb or sqlite)\n  -of OUTPUT_FILE, --output_file OUTPUT_FILE  Output file name\n  -sm SHARE_MAIL, --share_mail SHARE_MAIL     Mail address to share Google Sheets document\n```\n\n* If config will be used for getting parameters, boolean parameters like --forme, --friends, --followers, --user_tweets still must be passed as command-line option.\n* \"user\" and \"users\" field should be empty for \"search\" keyword to be used.\n\nThe following is an example of config.json content.\n\n```json\n{\n    \"user\": \"gvanrossum\",\n    \"users\": \"\",\n    \"search\": \"\",\n    \"excludes\": \"retweets\",\n    \"tweet_count\": 20,\n    \"output_type\": \"xlsx\",\n    \"output_file\": \"results.xlsx\",\n    \"share_mail\": \"codenineeight@gmail.com\"\n}\n```\n\n### Basic Usage\n\nThe following commands are a few examples of getting user data, user’s friends, tweets or tweets of a given keyword.\n\n* `python twitter_data_extractor.py -u gvanrossum`\n* `python twitter_data_extractor.py --forme -ul \"gvanrossum,nedbat\"`\n* `python twitter_data_extractor.py -u gvanrossum -fr`\n* `python twitter_data_extractor.py --forme -u gvanrossum -ut`\n* `python twitter_data_extractor.py -s python`\n\nResults are written to *results.xlsx* file by default.\nLogs can be seen in the *tw_data_extractor.log* file in the project directory.\n\n\n### Example Commands\n\n* Get user data for username *gvanrossum* and save results to *results.xlsx* file on behalf of another account.\n    * `python twitter_data_extractor.py -u gvanrossum`\n\n* Get user data for username *gvanrossum* and save results to *results.xlsx* file 
for your own account.\n    * `python twitter_data_extractor.py --forme -u gvanrossum`\n\n* Get user data for *gvanrossum* and write the results to *results.xlsx* by getting parameters from the default config file(*config.json*).\n    * `python twitter_data_extractor.py --forme -c`\n\n* Get user data for *gvanrossum* and write the results to *results.xlsx* by getting parameters from the given config file.\n    * `python twitter_data_extractor.py --forme -c -cf /home/coskun/custom_config.json`\n\n* Get user data for usernames *gvanrossum* and *nedbat*.\n    * `python twitter_data_extractor.py -ul \"gvanrossum,nedbat\"`\n\n* Get friends data for username *gvanrossum* and save results to *results.csv* file.\n    * `python twitter_data_extractor.py -u gvanrossum -fr -ot csv -of results.csv`\n\n* Get followers data for username *gvanrossum*.\n    * `python twitter_data_extractor.py -u gvanrossum -fl`\n\n* Get the last tweets data for username *gvanrossum*.\n    * `python twitter_data_extractor.py --forme -u gvanrossum -ut`\n\n* Get the last 50 tweets data for username *gvanrossum* and exclude both *replies* and *retweets*.\n    * `python twitter_data_extractor.py --forme -u gvanrossum -ut -tc 50 -e \"replies,retweets\"`\n\n* Get the last 50 tweets data for keyword *python*.\n    * `python twitter_data_extractor.py -s python -tc 50`\n\n* Get the last tweets data for keyword *python* and write results to Google Sheets document with name *last_tweets* and share with the given email.\n    * `python twitter_data_extractor.py -s python -ot gsheets -of last_tweets -sm codenineeight@gmail.com`\n\n### Example Runs \u0026 Outputs\n\n* Search the last 20 tweet data for the keyword \"python\" and save it to the \"results.xlsx\" file.\n    * `python twitter_data_extractor.py -s python -tc 20`\n\n![Tweets Search](assets/python_tweet_search.gif)\n\n* Get the last 5 tweets data excluding replies and retweets for the username \"gvanrossum\" and write results to Google Sheets document 
named \"gvanrossum_last_tweets\" and share it with the given email.\n    * `python twitter_data_extractor.py --forme -u gvanrossum -ut -tc 5 -e \"replies,retweets\" -ot gsheets -of gvanrossum_last_tweets -sm codenineeight@gmail.com`\n\n![User Tweets Run](assets/gvanrossum_tweets.gif)\n\n![User Tweets Output](assets/gvanrossum_tweets.png)\n\n---\n\n## Support\n\nIf you need support, you can contact me by emailing codenineeight@gmail.com with the “twitter_data_extractor” prefix in the subject. You can also see my Upwork profile [here](https://www.upwork.com/freelancers/~011e3fe44e575092f0).\n\nIf you benefit from this tool, please consider donating using the sponsor links.","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fcoskundeniz%2Ftwitter-data-extractor","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fcoskundeniz%2Ftwitter-data-extractor","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fcoskundeniz%2Ftwitter-data-extractor/lists"}