Ecosyste.ms: Awesome
An open API service indexing awesome lists of open source software.
https://github.com/haoict/searchable-book-crawler
Last synced: 11 days ago
- Host: GitHub
- URL: https://github.com/haoict/searchable-book-crawler
- Owner: haoict
- Created: 2018-09-02T08:03:22.000Z (over 6 years ago)
- Default Branch: master
- Last Pushed: 2018-09-04T18:53:33.000Z (over 6 years ago)
- Last Synced: 2024-10-28T14:24:46.269Z (about 2 months ago)
- Language: Python
- Size: 43 KB
- Stars: 1
- Watchers: 2
- Forks: 0
- Open Issues: 0
Metadata Files:
- Readme: README.md
README
Start the crawl with Scrapy:
$ cd scrapy_book
$ scrapy crawl timsach -o ../json/booksfull.json
Download book cover images:
$ python downloadimage.py ./json/books500.json
Get ebook download links:
$ python downloadebook.py ./json/books500.json
Migrate to MongoDB:
$ python json2mongo.py ./json/books500.json
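
The last step above loads the crawled JSON into MongoDB. The source of json2mongo.py is not shown here, so the following is only a minimal sketch of what such a step could look like, not the repository's actual script. It relies on the fact that `scrapy crawl ... -o file.json` writes a single JSON array of items. The function names, the connection string `mongodb://localhost:27017`, and the database and collection names (`books` / `books`) are all assumptions made for illustration.

```python
import json
from pathlib import Path


def load_books(json_path):
    """Read a Scrapy JSON feed export: one JSON array of item dicts."""
    with Path(json_path).open(encoding="utf-8") as fh:
        books = json.load(fh)
    if not isinstance(books, list):
        raise ValueError(f"expected a JSON array in {json_path}")
    return books


def insert_books(books, collection):
    """Insert the items into a pymongo-style collection and return the count."""
    if not books:
        # pymongo's insert_many rejects an empty list, so skip it.
        return 0
    result = collection.insert_many(books)
    return len(result.inserted_ids)


def main(json_path):
    # Assumptions: pymongo is installed, MongoDB listens on localhost:27017,
    # and the database/collection names below are placeholders.
    from pymongo import MongoClient

    client = MongoClient("mongodb://localhost:27017")
    count = insert_books(load_books(json_path), client["books"]["books"])
    print(f"inserted {count} documents")
```

Calling `main("./json/books500.json")` would run the whole import. Keeping the file loading and the insert in separate functions means either half can be checked without a running MongoDB server.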