https://github.com/thenets/simplewebcrawler
Simple crawler example created with Scrapy (Python 3)
- Host: GitHub
- URL: https://github.com/thenets/simplewebcrawler
- Owner: thenets
- License: apache-2.0
- Created: 2017-10-25T12:56:38.000Z (over 8 years ago)
- Default Branch: master
- Last Pushed: 2017-10-25T18:24:57.000Z (over 8 years ago)
- Last Synced: 2025-02-14T21:46:56.968Z (over 1 year ago)
- Language: Python
- Size: 11.7 KB
- Stars: 0
- Watchers: 3
- Forks: 0
- Open Issues: 0
Metadata Files:
- Readme: README.md
- License: LICENSE
README
# Crawler
Simple crawler example created with Scrapy (Python 3)
## Requirements
Install Ubuntu dependencies:
```
$ sudo apt install -y virtualenv python-pip
```
Create a virtualenv and install the Python dependencies:
```
$ virtualenv env # Create virtualenv
$ . ./env/bin/activate # Enable virtualenv
$ pip install -r pip-requirements.txt # Install Python dependencies on virtualenv
```
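To confirm that the virtualenv is actually active before installing or running anything, a small check like the one below may help. It is only a sketch and not part of this repository: the function name `in_virtualenv` is hypothetical, and it relies on the standard `sys.prefix` / `sys.base_prefix` attributes (plus `sys.real_prefix`, which older virtualenv releases set).

```python
import sys


def in_virtualenv() -> bool:
    """Return True when the running interpreter belongs to a virtual environment.

    Hypothetical helper, not part of this project.
    """
    # Older virtualenv releases expose sys.real_prefix inside an environment.
    if hasattr(sys, "real_prefix"):
        return True
    # venv and newer virtualenv releases make sys.base_prefix differ from sys.prefix.
    return getattr(sys, "base_prefix", sys.prefix) != sys.prefix


if __name__ == "__main__":
    print("virtualenv active" if in_virtualenv() else "virtualenv NOT active")
```

Run it with the `python` from the activated shell; if it reports that no virtualenv is active, repeat the `. ./env/bin/activate` step.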
## How to run
Run spiders:
```
$ ./run-spiders.sh
```
Output will be added to `./out/`.
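
After a run, the generated files can be inspected with a few lines of Python. This is a minimal sketch, assuming only that the spiders write regular files directly into `./out/`; the file names and formats are not documented here, and the helper `list_outputs` is hypothetical.

```python
from pathlib import Path


def list_outputs(out_dir="out"):
    """Return sorted (file name, size in bytes) pairs for files in out_dir.

    Hypothetical helper; returns an empty list if the directory does not exist.
    """
    root = Path(out_dir)
    if not root.is_dir():
        return []
    return sorted((p.name, p.stat().st_size) for p in root.iterdir() if p.is_file())


if __name__ == "__main__":
    for name, size in list_outputs():
        print(f"{name}\t{size} bytes")
```

An empty listing after running the spiders suggests that they did not produce any output.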