Ecosyste.ms: Awesome
An open API service indexing awesome lists of open source software.
https://github.com/tcurvelo/scrapy-tor
Docker settings for running Scrapy spiders over the Tor network.
- Host: GitHub
- URL: https://github.com/tcurvelo/scrapy-tor
- Owner: tcurvelo
- License: mit
- Created: 2022-01-28T23:16:05.000Z (almost 3 years ago)
- Default Branch: main
- Last Pushed: 2022-05-05T19:00:06.000Z (over 2 years ago)
- Last Synced: 2023-08-02T20:25:55.119Z (over 1 year ago)
- Topics: hacktoberfest, scrapy, tor
- Language: Shell
- Homepage:
- Size: 7.81 KB
- Stars: 0
- Watchers: 2
- Forks: 0
- Open Issues: 0
Metadata Files:
- Readme: README.md
- License: LICENSE
README
# scrapy-tor
Docker settings for running Scrapy spiders over the Tor network.
## Usage
### Quick demo
    ❯ docker run --rm tcurvelo/scrapy-tor

The container waits for a Tor circuit to be established, then requests the [Tor Project](https://check.torproject.org) check page. About a minute later, the log shows the page title, confirming that requests are going through Tor:

    ...
    2022-04-30 23:46:21 [torcheck] INFO: ✨ Congratulations. This browser is configured to use Tor. ✨
    ...

### Launching a Tor-enabled Scrapy shell

    ❯ docker run -it --rm tcurvelo/scrapy-tor scrapy shell
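Inside the shell, one way to confirm that requests really go through Tor is to fetch the check page and inspect its title, as the quick demo does. The helper below is a minimal, stdlib-only sketch of that check. The names `page_title` and `uses_tor` are hypothetical and not part of the image, and the "Congratulations" prefix is an assumption taken from the demo log line above.

```python
from html.parser import HTMLParser


class _TitleParser(HTMLParser):
    """Collects the text inside the first <title> element."""

    def __init__(self):
        super().__init__()
        self._in_title = False
        self._done = False
        self.parts = []

    def handle_starttag(self, tag, attrs):
        if tag == "title" and not self._done:
            self._in_title = True

    def handle_endtag(self, tag):
        if tag == "title" and self._in_title:
            self._in_title = False
            self._done = True

    def handle_data(self, data):
        if self._in_title:
            self.parts.append(data)


def page_title(html):
    """Return the page title with whitespace collapsed, or "" if there is none."""
    parser = _TitleParser()
    parser.feed(html)
    parser.close()
    return " ".join("".join(parser.parts).split())


def uses_tor(html):
    """Guess whether the check page was reached through Tor.

    Assumption: the title starts with "Congratulations" when the request
    arrived through Tor, matching the log line shown in the quick demo.
    """
    return page_title(html).startswith("Congratulations")


# Possible use inside the Scrapy shell (fetch and response are shell shortcuts):
#   fetch("https://check.torproject.org")
#   uses_tor(response.text)
```

Checking only the title keeps the sketch independent of the page layout, at the cost of relying on the exact wording of that title.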
### Bringing it into your Scrapy project
Simply extend the image in your `Dockerfile`:

    FROM tcurvelo/scrapy-tor
    COPY . .
    RUN pip install -r requirements.txt

Then build and run:

    ❯ docker build . -t my-scrapy-project
    ❯ docker run -it my-scrapy-project scrapy crawl my_spider

🕷️🧅