{"id":20840950,"url":"https://github.com/souvic/mtweepy","last_synced_at":"2025-03-12T10:16:17.411Z","repository":{"id":57443827,"uuid":"382792218","full_name":"Souvic/mtweepy","owner":"Souvic","description":"Fastest scraping using multiple apps and user tokens for Twitter API!","archived":false,"fork":false,"pushed_at":"2021-07-15T06:55:23.000Z","size":83,"stargazers_count":1,"open_issues_count":0,"forks_count":0,"subscribers_count":2,"default_branch":"main","last_synced_at":"2025-02-20T17:19:05.536Z","etag":null,"topics":["scraping","scraping-python","scraping-web","tweepy","twitter","twitter-api"],"latest_commit_sha":null,"homepage":"https://github.com/Souvic/mtweepy","language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/Souvic.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null}},"created_at":"2021-07-04T07:37:59.000Z","updated_at":"2021-07-15T06:55:26.000Z","dependencies_parsed_at":"2022-09-10T20:22:55.057Z","dependency_job_id":null,"html_url":"https://github.com/Souvic/mtweepy","commit_stats":null,"previous_names":[],"tags_count":4,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/Souvic%2Fmtweepy","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/Souvic%2Fmtweepy/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/Souvic%2Fmtweepy/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/Souvic%2Fmtweepy/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/Souvic","download_url":"https://codeload.github.com/Souvic/mtweepy/tar.gz/refs/heads/main","host":
{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":243196657,"owners_count":20251861,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["scraping","scraping-python","scraping-web","tweepy","twitter","twitter-api"],"created_at":"2024-11-18T01:18:19.545Z","updated_at":"2025-03-12T10:16:17.381Z","avatar_url":"https://github.com/Souvic.png","language":"Python","funding_links":["https://www.buymeacoffee.com/Souvic"],"categories":[],"sub_categories":[],"readme":"# Makes Twitter scraping with multiple Twitter apps easy again!\n[![License: MIT](https://img.shields.io/github/license/Souvic/mtweepy)](https://opensource.org/licenses/MIT)\n[![stars](https://img.shields.io/github/stars/Souvic/mtweepy)]()\n[![Github All Releases](https://img.shields.io/github/downloads/huggingface/transformers/total.svg)]()\n[![PyPI](https://img.shields.io/pypi/v/mtweepy)](https://pypi.org/project/mtweepy/)\n[![python](https://img.shields.io/github/languages/top/Souvic/mtweepy)]()\n\n[![Build Status](https://scrutinizer-ci.com/g/Souvic/mtweepy/badges/build.png?b=main)](https://scrutinizer-ci.com/g/Souvic/mtweepy/build-status/main)\n[![Scrutinizer Code Quality](https://scrutinizer-ci.com/g/Souvic/mtweepy/badges/quality-score.png?b=main)](https://scrutinizer-ci.com/g/Souvic/mtweepy/?branch=main)\n[![Release date](https://img.shields.io/github/release-date/Souvic/mtweepy)]()\n[![Latest Stable 
Version](https://img.shields.io/github/v/release/Souvic/mtweepy)]()\n\n[![tweet](https://img.shields.io/twitter/url?style=social\u0026url=https%3A%2F%2Fgithub.com%2FSouvic%2Fmtweepy)](https://twitter.com/intent/tweet?text=I%20found%20this%20awesome%20repo%20on%20github%20%26%20PyPI%20that%20makes%20twitter%20scraping%20fastest%20with%20multpl%20token%20support%2C%20oauth1%262!!\u0026url=https%3A%2F%2Fgithub.com%2FSouvic%2Fmtweepy)\n\n### Support me\n\n\n[![Buy Me A Coffee](https://cdn.buymeacoffee.com/buttons/v2/default-yellow.png)](https://www.buymeacoffee.com/Souvic)\n\n\n\n## Install from PyPI\n```\npip3 install mtweepy\n```\n\n## Or install from main branch\n```\npip3 install git+https://github.com/Souvic/mtweepy.git\n```\n\n# Example usage\nThere are three functions in the repo: get_followers, get_timelines, get_users.\n\nAll the functions spread their requests across every supplied auth token for the fastest scraping.\n\nApart from the self-explanatory inputs:\n\n1. As auths, a list of tweepy bearer tokens is expected if you want to use the OAuth 2 limits of the Twitter API.\n2. As auths, a list of \\[_oauth_consumer_key, oauth_consumer_secret, client_secret, oauth_token, oauth_token_secret_] is expected if you want to use the OAuth 1 limits of the Twitter API.\n3. The use_userid parameter is _False_ by default. If it is passed as _True_ to get_followers, get_followers treats the screen_name_or_userid parameter as the user ID whose followers are to be scraped.\n4. 
output_folder should be an empty folder in which the get_timelines and get_users functions save their output.\n\nAn example usage is provided here.\n### Gets 5000*ceil(max_num/5000) followers' user IDs as a list for screen_name INCIndia\n\n```\nfrom mtweepy import get_followers, get_users, get_timelines\n\n# auths is the list of credentials described above (bearer tokens or OAuth 1 credentials).\n# Followers are fetched in chunks of 5000; if max_num \u003c 5000, the last 5000 followers are returned.\nlist_followers = get_followers(auths, \"INCIndia\", max_num=500)\n```\n\n### Gets all the maximally extended user objects for list_followers (a list of user IDs)\nThe output is saved in the output_folder as multiple jsonl files (one file per access token).\nEach line of the jsonl files contains the maximally extended user object for one user.\n```\nget_users(auths, list_followers, output_folder=\"./testfolder1\")\n\n```\n\n### Gets all the tweets in the timelines of list_followers (a list of user IDs)\nThe output is saved in the output_folder as multiple jsonl files (one file per access token).\nEach line of the jsonl files contains the last 3200 tweets of one user.\n\n```\nget_timelines(auths, list_followers, output_folder=\"./testfolder2\")\n```\n### To get the total number of lines written to the files in the directory ./testfolder1\nRun this on the command line at any point during data collection:\n```\nfind ./testfolder1 -name '*.jsonl' | xargs wc -l\n```\nFor the _get_users function_: Each line contains approximately 100 users.\nFor the _get_timelines function_: Each line contains 1 user timeline.\n\nSo you can use this command to calculate an approximate rate and estimate when data collection will finish.\n\n\n\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fsouvic%2Fmtweepy","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fsouvic%2Fmtweepy","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fsouvic%2Fmtweepy/lists"}