Ecosyste.ms: Awesome
An open API service indexing awesome lists of open source software.
https://github.com/altimis/scweet
A simple and unlimited Twitter scraper: scrape tweets, likes, retweets, following, followers, user info, images...
- Host: GitHub
- URL: https://github.com/altimis/scweet
- Owner: Altimis
- License: mit
- Created: 2020-12-14T14:36:34.000Z (about 4 years ago)
- Default Branch: master
- Last Pushed: 2025-01-14T16:57:24.000Z (19 days ago)
- Last Synced: 2025-01-30T01:04:26.479Z (4 days ago)
- Topics: dowload-images, followers, following, python, save-image, scrape, scrape-followers, scrape-following, scrape-images, scrape-likes, scrape-tweets, scraper, scraping, selenium-webdriver, tweets, twitter, twitter-scraper
- Language: Python
- Homepage:
- Size: 27.6 MB
- Stars: 1,104
- Watchers: 17
- Forks: 227
- Open Issues: 93
- Metadata Files:
- Readme: README.md
- Funding: .github/FUNDING.yml
- License: LICENSE.txt
Awesome Lists containing this project
README
# Recent X Platform Changes & Scweet Updates
**Note:** Scweet has recently encountered issues due to changes on X (formerly Twitter). We’re committed to updating the library so it continues working for smaller or personal scraping tasks. However, keeping Scweet fully operational at large scale now requires near-daily maintenance, given X’s frequent policy and technical shifts.
For those needing robust, continuous, or high-volume scraping, consider using [Scweet on Apify](https://apify.com/altimis/scweet). It automatically handles the scaling and infrastructure behind the scenes—fetching up to 1000 tweets per minute and storing results in a neat dataset. It’s still the Scweet experience you know, just supercharged in the cloud.
**(Responsible Use Reminder: Whether local or cloud-based, please scrape tweets ethically, lawfully, and respectfully.)**
# A simple and unlimited Twitter scraper with Python.
Recently, Twitter has banned almost every Twitter scraper. This repository presents an alternative tool to scrape Twitter based on 3 functions:
- [scrape](https://github.com/Altimis/Scweet/blob/master/Scweet/scweet.py): Scrapes all the information regarding tweets between two given dates, for a given language and list of words or account name, in the form of a CSV file containing the retrieved data (more storage methods will be added).
- [get_user_information](https://github.com/Altimis/Scweet/blob/master/Scweet/user.py): Scrapes user information, including the number of following and followers, location and description.
- [get_users_followers and get_users_following](https://github.com/Altimis/Scweet/blob/master/Scweet/user.py): Scrapes the followers and following accounts for a given list of users.

It is also possible to download the images shown in tweets by passing the argument `save_images = True`. If you only want to scrape images, it is recommended to set the argument `display_type = image` so that only tweets containing images are shown.
Authentication is required for scraping followers/following. It is recommended to log in with a new account, otherwise your account could be banned if the list of followers is very long. To log in to your account, you need to enter your username `SCWEET_USERNAME` and password `SCWEET_PASSWORD` in the [.env](https://github.com/Altimis/Scweet/blob/master/.env) file. You can adjust the `wait` parameter in the `get_users_followers` and `get_users_following` functions according to your internet speed.
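For reference, a minimal `.env` file might look like the following (placeholder values; `SCWEET_EMAIL` is also used in the library examples further below):
```
SCWEET_EMAIL=your_account_email@example.com
SCWEET_USERNAME=your_account_username
SCWEET_PASSWORD=your_account_password
```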
## Requirements :
`pip install -r requirements.txt`
Note : You must have Chrome installed on your system.
## Results :
### Tweets :
The CSV file contains the following features (for each tweet) :
- 'UserScreenName' : user screen name
- 'UserName' : user name
- 'Timestamp' : timestamp of the tweet
- 'Text' : tweet text
- 'Embedded_text' : embedded text written above the tweet. This can be an image, a video or even another tweet if the tweet in question is a reply
- 'Emojis' : emojis in the tweet
- 'Comments' : number of comments
- 'Likes' : number of likes
- 'Retweets' : number of retweets
- 'Image link' : link of the image in the tweet
- 'Tweet URL' : tweet URL

### Following / Followers :
The `get_users_following` and `get_users_followers` functions in the [user](https://github.com/Altimis/Scweet/blob/master/Scweet/user.py) file return the list of following and followers for a given list of users.
## Usage :
### Library :
The library is now available. To install the library, run :
`pip install Scweet==1.8`
After the installation, you can import and use the functions as follows:
```
from Scweet.scweet import scrape
from Scweet.user import get_user_information, get_users_following, get_users_followers
```

**Scrape top tweets with the words 'bitcoin' and 'ethereum', geolocated less than 200 km from Alicante (Spain), Lat=38.3452, Long=-0.481006, and without replies:**
**The process is slower as the interval is smaller (choose an interval that divides the period of time between the start and end dates).**

```
data = scrape(words=['bitcoin', 'ethereum'], since="2021-10-01", until="2021-10-05", from_account=None, interval=1,
              headless=False, display_type="Top", save_images=False, lang="en",
              resume=False, filter_replies=False, proximity=False, geocode="38.3452,-0.481006,200km")
```
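As a hedged follow-up (assuming `scrape` returns the collected tweets as a pandas DataFrame, as the assignment above suggests), you can inspect and persist the result yourself in addition to the CSV file Scweet writes:
```
# Hypothetical post-processing; assumes `data` is a pandas DataFrame.
print(data.shape)                 # number of scraped tweets and columns
print(data.columns.tolist())      # available features
data.to_csv("bitcoin_ethereum_tweets.csv", index=False)  # keep a local copy
```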
**Scrape top tweets with the hashtag #bitcoin, in proximity and without replies:**
**The process is slower as the interval is smaller (choose an interval that divides the period of time between the start and end dates).**

```
data = scrape(hashtag="bitcoin", since="2021-08-05", until=None, from_account = None, interval=1,
headless=True, display_type="Top", save_images=False,
resume=False, filter_replies=True, proximity=True)
```

**Get the main information of a given list of users:**
**These users follow me on Twitter.**

```
users = ['nagouzil', '@yassineaitjeddi', 'TahaAlamIdrissi',
'@Nabila_Gl', 'geceeekusuu', '@pabu232', '@av_ahmet', '@x_born_to_die_x']
```

**This function will return a list that contains:**
**["no. of following","no. of followers", "join date", "date of birth", "location", "website", "description"]**```
users_info = get_user_information(users, headless=True)
```

**Get followers and following of a given list of users.**
**Enter your username and password in the .env file. I recommend that you do not use your main account.**

**Increase the `wait` argument to avoid getting your account banned and to let the crawling finish if your internet connection is slow. I used 1 and it was safe.**

**Set your .env file with the `SCWEET_EMAIL`, `SCWEET_USERNAME` and `SCWEET_PASSWORD` variables and provide its path:**
```
env_path = ".env"following = get_users_following(users=users, env=env_path, verbose=0, headless=True, wait=2, limit=50, file_path=None)
followers = get_users_followers(users=users, env=env_path, verbose=0, headless=True, wait=2, limit=50, file_path=None)
```
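If you want to keep these results, here is a minimal sketch of some post-processing, assuming the return values are JSON-serializable (for example, a dictionary mapping each input handle to a list of account names); adjust to whatever structure your Scweet version actually returns:
```
import json

# Hypothetical post-processing: persist the scraped relations to disk.
with open("following.json", "w") as f:
    json.dump(following, f, indent=2)
with open("followers.json", "w") as f:
    json.dump(followers, f, indent=2)
```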

### Terminal :
```
Scrape tweets.

optional arguments:
-h, --help show this help message and exit
--words WORDS Words to search for. they should be separated by "//" : Cat//Dog.
--from_account FROM_ACCOUNT
Tweets posted by "from_account" account.
--to_account TO_ACCOUNT
Tweets posted in response to "to_account" account.
--mention_account MENTION_ACCOUNT
Tweets that mention "mention_account" account.
--hashtag HASHTAG
Tweets containing #hashtag
--until UNTIL End date for search query. example : %Y-%m-%d.
--since SINCE
Start date for search query. example : %Y-%m-%d.
--interval INTERVAL Interval days between each start date and end date for
search queries. example : 5.
--lang LANG Tweets language. Example : "en" for english and "fr"
for french.
--headless HEADLESS Headless webdrives or not. True or False
--limit LIMIT Limit tweets to be scraped.
--display_type DISPLAY_TYPE
Display type of Twitter page : Latest or Top tweets
--resume RESUME Resume the last scraping. specify the csv file path.
--proxy PROXY Proxy server
--proximity PROXIMITY Proximity
--geocode GEOCODE Geographical location coordinates to center the
search (), radius. No compatible with proximity
--minreplies MINREPLIES
Min. number of replies to the tweet
--minlikes MINLIKES Min. number of likes to the tweet
--minretweets MINRETWEETS
Min. number of retweets to the tweet
```

### To run the script :
`python scweet.py --words "excellente//car" --to_account "tesla" --until 2020-01-05 --since 2020-01-01 --limit 10 --interval 1 --display_type Latest --lang="en" --headless True`
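As another illustrative invocation (hypothetical values, using only flags listed in the help output above):

`python scweet.py --hashtag "bitcoin" --since 2021-08-01 --until 2021-08-05 --interval 1 --limit 20 --display_type Top --lang="en" --headless True --minlikes 10`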