https://github.com/addono/bookbe-spider

Spider for book titles at book.be.

Last synced: 8 months ago

Host: GitHub
URL: https://github.com/addono/bookbe-spider
Owner: Addono
Created: 2019-12-23T12:45:42.000Z (over 6 years ago)
Default Branch: master
Last Pushed: 2024-05-14T22:16:33.000Z (about 2 years ago)
Last Synced: 2024-12-27T00:27:29.489Z (over 1 year ago)
Language: Python
Homepage: https://gist.github.com/Addono/de7b0633d7faa1da3aeaf1f43985b163
Size: 4.88 KB
Stars: 1
Watchers: 3
Forks: 0
Open Issues: 1
Metadata Files:
- Readme: README.md

README

# Book.be Scraper

## About

Collects book titles from [book.be](https://book.be). See [this example](https://gist.github.com/Addono/de7b0633d7faa1da3aeaf1f43985b163) of a scrape that collects the titles of all books released in 2019.

## Installation

Create a Python 3 virtual environment.
```bash
virtualenv venv
```

Activate the environment.
```bash
source venv/bin/activate
```

Install all requirements.
```bash
pip install -r requirements.txt
```

## Usage

Run the spider.

```bash
scrapy runspider spider.py
```

To save the output gathered by the spider, for example as CSV:
```bash
scrapy runspider spider.py --output=res.csv -t csv
```
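
Once the spider has written `res.csv`, the file can be inspected with Python's standard `csv` module. The following is a minimal sketch, not part of this repository: the helper names `load_rows` and `column_values` are made up for illustration, and the `title` column used in the usage note is an assumption about the spider's output fields.

```python
import csv


def load_rows(path):
    """Read a CSV file into a list of dicts, one dict per data row.

    The keys come from the header row that Scrapy writes at the top of the file.
    """
    with open(path, newline="", encoding="utf-8") as handle:
        return list(csv.DictReader(handle))


def column_values(rows, column):
    """Return the non-empty values of one column, in file order.

    `column` is whatever field name the spider emits; "title" is only a guess.
    """
    return [row[column] for row in rows if row.get(column)]
```

For example, `column_values(load_rows("res.csv"), "title")` would return the scraped titles as a list, provided the output really has a `title` column. Check the header row of `res.csv` to confirm the actual field names.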

To filter the collected titles, change `URL` in `spider.py` to include the desired filters.
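
If the filters are expressed as query parameters, a filtered URL can be assembled with the standard library rather than by hand. This is only a sketch under assumptions: the `/search` path and the `year` and `page` parameters below are hypothetical placeholders, not book.be's actual query format, and `with_filters` is an illustrative helper that does not exist in `spider.py`. Copy the real parameters from the address bar after applying filters on the site.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def with_filters(base_url, **filters):
    """Return base_url with the given filters merged into its query string.

    Existing query parameters are kept; a filter with the same name replaces
    the existing value.
    """
    parts = urlsplit(base_url)
    query = dict(parse_qsl(parts.query))
    query.update({name: str(value) for name, value in filters.items()})
    return urlunsplit(parts._replace(query=urlencode(query)))


# Hypothetical example: the path and parameter names are placeholders.
URL = with_filters("https://book.be/search?page=1", year=2019)
```

The resulting string can then be assigned to `URL` in `spider.py` in place of the default value.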
