https://github.com/haoict/searchable-book-crawler

Start the crawl with Scrapy:
$ cd scrapy_book
$ scrapy crawl timsach -o ../json/booksfull.json
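
Scrapy's JSON feed export writes all scraped items as a single JSON array, so the output can be checked with a few lines of Python before running the later steps. This is a minimal sketch; `load_books` is a hypothetical helper and not part of this repository.

```python
import json
from pathlib import Path


def load_books(path):
    """Load a Scrapy JSON feed export, which is one JSON array of items."""
    with Path(path).open(encoding="utf-8") as fh:
        books = json.load(fh)
    # A JSON feed export should always be a list of item dicts.
    if not isinstance(books, list):
        raise ValueError(f"expected a JSON array in {path}")
    return books
```

For example, `len(load_books("json/booksfull.json"))` gives the number of crawled items.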

Download the book cover images:
$ python downloadimage.py ./json/books500.json
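
The README does not show what `downloadimage.py` does internally. As a rough sketch of one way such a step could work, the helpers below derive a local filename from a cover URL and fetch it with the standard library. The function names, the `images` output directory, and the `cover.jpg` fallback are all assumptions made for illustration.

```python
import os
from urllib.parse import urlparse
from urllib.request import urlretrieve


def cover_filename(url):
    """Derive a local filename from the URL path (assumed naming scheme)."""
    name = os.path.basename(urlparse(url).path)
    return name or "cover.jpg"  # hypothetical fallback when the path has no filename


def download_cover(url, out_dir="images"):
    """Download one cover image into out_dir, skipping files already present."""
    os.makedirs(out_dir, exist_ok=True)
    target = os.path.join(out_dir, cover_filename(url))
    if not os.path.exists(target):
        urlretrieve(url, target)
    return target
```

Skipping files that already exist lets an interrupted run be restarted without downloading everything again.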

Get the ebook download links:
$ python downloadebook.py ./json/books500.json

Migrate the data to MongoDB:
$ python json2mongo.py ./json/books500.json
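
The contents of `json2mongo.py` are not shown here. The sketch below illustrates one way a JSON array of books could be loaded into a collection in batches. `migrate` and `batch_size` are hypothetical names, and the `collection` argument is assumed to be a PyMongo collection (anything that provides `insert_many`).

```python
import json


def migrate(json_path, collection, batch_size=500):
    """Insert every item from a JSON array file into collection, in batches.

    Returns the number of documents handed to insert_many.
    """
    with open(json_path, encoding="utf-8") as fh:
        books = json.load(fh)

    inserted = 0
    # The loop body never runs for an empty list, so insert_many is
    # never called with an empty batch (PyMongo rejects that).
    for start in range(0, len(books), batch_size):
        batch = books[start:start + batch_size]
        collection.insert_many(batch)
        inserted += len(batch)
    return inserted
```

With PyMongo installed and a local server running, the collection could be obtained as `pymongo.MongoClient()["books"]["books"]`; the database and collection names here are placeholders.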