https://github.com/burnzz/scrapy-twitter
Web scraper based on Scrapy to fetch tweets from a list of user accounts
- Host: GitHub
- URL: https://github.com/burnzz/scrapy-twitter
- Owner: BurnzZ
- License: mit
- Created: 2017-05-04T18:12:52.000Z (about 9 years ago)
- Default Branch: master
- Last Pushed: 2017-05-11T17:55:10.000Z (about 9 years ago)
- Last Synced: 2025-06-06T09:03:56.622Z (about 1 year ago)
- Topics: bot, crawler, scraping, scrapy, twitter
- Language: Python
- Homepage:
- Size: 19.5 KB
- Stars: 14
- Watchers: 0
- Forks: 3
- Open Issues: 0
Metadata Files:
- Readme: README.md
- License: LICENSE.md
README
This is a web scraper that fetches tweets from a list of user accounts
without using Twitter's API, which avoids the API's rate limiting.
## USAGE
`scrapy crawl twitter -a urls_file=url.txt -a urls_link=https://pastebin.com/raw/XXX123 -a combine_urls=True`
**Parameter**|**Description**
:-----:|:-----:
urls_file|Local path to a file containing the links
urls_link|Link to an online resource containing the links
combine_urls|*Optional*. If set to `True`, links from both *urls_file* and *urls_link* are combined. *Default: False*
The contents of both `urls_file` and `urls_link` must consist only of links, one per line.
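
To illustrate how the three parameters could fit together, here is a minimal sketch in plain Python. It is not the project's actual spider code: the helper names `parse_url_list`, `as_bool`, and `load_urls`, the `fetch` hook, the de-duplication step, and the choice to prefer the local file when `combine_urls` is false are all assumptions made for the example. The one detail taken from Scrapy itself is that values passed with `-a` reach the spider as strings, so `combine_urls=True` arrives as the text `"True"`.

```python
from urllib.request import urlopen


def parse_url_list(text):
    """Split newline-separated text into a list of non-empty, stripped links."""
    return [line.strip() for line in text.splitlines() if line.strip()]


def as_bool(value):
    """Interpret a command-line string such as "True" or "false" as a boolean."""
    return str(value).strip().lower() in {"1", "true", "yes"}


def load_urls(urls_file=None, urls_link=None, combine_urls=False, fetch=None):
    """Collect account links from a local file and/or an online resource.

    `fetch` is a hypothetical hook (a callable taking a URL and returning
    text) so the download step can be replaced, for example in tests.
    """
    combine = as_bool(combine_urls)

    file_urls = []
    if urls_file:
        with open(urls_file, encoding="utf-8") as handle:
            file_urls = parse_url_list(handle.read())

    link_urls = []
    # Only download when the result will actually be used.
    if urls_link and (combine or not file_urls):
        if fetch is None:
            fetch = lambda url: urlopen(url).read().decode("utf-8")
        link_urls = parse_url_list(fetch(urls_link))

    if combine:
        # Assumption: drop duplicates while keeping first-seen order.
        return list(dict.fromkeys(file_urls + link_urls))

    # Assumption: without combine_urls, the local file takes precedence.
    return file_urls or link_urls
```

Injecting `fetch` keeps the sketch testable without network access; a real spider would more likely issue the download as a Scrapy request.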
## MOTIVATION
I use this personally to keep track of Twitter users who consistently tweet stock trading
speculations for the **Philippine Stock Exchange** (*PSE*). The spiders in this project are
deployed on my personal Scrapinghub account.