https://github.com/mhmdrazn/twitter-data-crawl
This repository contains a script to scrape tweets from Twitter search results based on keywords, date range, and geolocation using Playwright.
Last synced: 6 months ago
- Host: GitHub
- URL: https://github.com/mhmdrazn/twitter-data-crawl
- Owner: mhmdrazn
- Created: 2024-06-30T17:49:32.000Z (12 months ago)
- Default Branch: main
- Last Pushed: 2024-06-30T18:04:37.000Z (12 months ago)
- Last Synced: 2024-07-04T02:48:53.064Z (12 months ago)
- Language: Jupyter Notebook
- Size: 15.6 KB
- Stars: 0
- Watchers: 1
- Forks: 0
- Open Issues: 0
Metadata Files:
- Readme: README.md
README
# Twitter Scraper using Playwright
This repository contains a script to scrape tweets from Twitter search results based on keywords, date range, and geolocation using Playwright.

## Features
- Scrape tweets based on specific keywords.
- Filter tweets by date range.
- Specify geolocation for more targeted results.

## Prerequisites
Before you begin, ensure you have met the following requirements:
- Node.js installed on your machine.
- Playwright installed. You can install it using the following command:
`npm install playwright`

## Usage
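To run the scraper from a Jupyter notebook cell, use the following command. The leading `!` passes the line to the shell; omit it when running in a terminal.
`!npx -y [email protected] -o "{filename}" -s "{search_keyword}" --tab "LATEST" -l {limit} --token {twitter_auth_token}`

## Arguments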
- `filename` - Name of the CSV file in which to store the output.
- `search_keyword` - A comma-separated list of keywords to search for.
- `LATEST` - Value passed to `--tab`; selects the latest tweets.
- `limit` - Maximum number of tweets to crawl.
- `twitter_auth_token` - Your personal Twitter auth token.

## Credits
This project was inspired by and based on the work of [Helmi Satria](https://github.com/helmisatria).
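
## Example: building the command in Python
Since the command under Usage is run from a notebook, it can be convenient to assemble it from variables rather than editing the string by hand. The following is a minimal sketch, not part of the repository: the helper names `build_crawl_command` and `as_shell_string` are hypothetical, the flags are copied from the command under Usage, and `package_spec` stands for whatever package specifier follows `npx -y` there.

```python
import shlex


def build_crawl_command(package_spec, filename, search_keyword, limit,
                        auth_token, tab="LATEST"):
    """Return the npx argument list for one crawl (hypothetical helper).

    The flags (-o, -s, --tab, -l, --token) mirror the command shown
    under Usage; nothing else about the CLI is assumed.
    """
    if int(limit) <= 0:
        raise ValueError("limit must be a positive integer")
    return [
        "npx", "-y", package_spec,
        "-o", filename,
        "-s", search_keyword,
        "--tab", tab,
        "-l", str(limit),
        "--token", auth_token,
    ]


def as_shell_string(args):
    """Quote each argument so the result is safe to paste into a shell."""
    return shlex.join(args)


if __name__ == "__main__":
    # Placeholder values only; substitute your own before running.
    args = build_crawl_command(
        package_spec="<package>@<version>",
        filename="tweets.csv",
        search_keyword="<search_keyword>",
        limit=100,
        auth_token="<twitter_auth_token>",
    )
    print(as_shell_string(args))
```

The argument list can be passed directly to `subprocess.run(args, check=True)`, which avoids shell quoting problems when a keyword contains spaces or quotes. Reading the auth token from an environment variable instead of hard-coding it keeps it out of a saved notebook.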