https://github.com/addono/worldcat-scraper
Spider for book titles from Worldcat.
- Host: GitHub
- URL: https://github.com/addono/worldcat-scraper
- Owner: Addono
- Created: 2019-12-22T15:31:37.000Z (over 6 years ago)
- Default Branch: master
- Last Pushed: 2023-11-28T23:00:53.000Z (over 2 years ago)
- Last Synced: 2025-03-25T04:22:28.135Z (about 1 year ago)
- Language: Python
- Homepage: https://gist.github.com/Addono/ad031c113138e0254112c3f5f96645b8
- Size: 14.6 KB
- Stars: 4
- Watchers: 2
- Forks: 0
- Open Issues: 2
Metadata Files:
- Readme: README.md
README
# Worldcat Scraper
Collects book titles from [Worldcat](https://worldcat.org/)'s search pages. See [this gist](https://gist.github.com/Addono/ad031c113138e0254112c3f5f96645b8) for an example scrape that collects the titles of all Dutch fiction books released in 2019.
*Note: Worldcat search only allows you to scroll through the first 5000 hits.*
## Usage
```bash
scrapy runspider spider.py
```
To save the output to a file, for example as CSV:
```bash
scrapy runspider spider.py --output=res.csv -t csv
```
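Once the scrape has been saved, the CSV can be post-processed with Python's standard library. The following is a minimal sketch, not part of this repository: it assumes the export has a header row (Scrapy's CSV exporter writes one by default), and the column name `title` used in the usage note is a guess, so check the header of your own `res.csv` first. The helper names `load_rows` and `unique_values` are hypothetical.

```python
import csv
from pathlib import Path


def load_rows(csv_path):
    """Read a CSV export into a list of dicts keyed by the header row."""
    with Path(csv_path).open(newline="", encoding="utf-8") as handle:
        return list(csv.DictReader(handle))


def unique_values(rows, column):
    """Return the distinct non-empty values of one column, in first-seen order."""
    seen = {}
    for row in rows:
        # Missing columns and empty cells are skipped rather than raising.
        value = (row.get(column) or "").strip()
        if value:
            seen.setdefault(value, None)
    return list(seen)
```

For example, `unique_values(load_rows("res.csv"), "title")` would return the de-duplicated titles, provided the export really has a `title` column.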