https://github.com/chusiang/crawler-book-info
A crawler for quickly parsing book information
- Host: GitHub
- URL: https://github.com/chusiang/crawler-book-info
- Owner: chusiang
- License: mit
- Created: 2017-03-01T06:03:53.000Z (over 9 years ago)
- Default Branch: main
- Last Pushed: 2024-06-24T14:07:36.000Z (almost 2 years ago)
- Last Synced: 2025-04-12T02:24:33.545Z (about 1 year ago)
- Topics: book, crawler, python
- Language: Python
- Homepage:
- Size: 92.8 KB
- Stars: 5
- Watchers: 2
- Forks: 1
- Open Issues: 2
Metadata Files:
- Readme: README.md
- License: LICENSE
README
# Crawler Book Info
[Travis CI](https://travis-ci.org/chusiang/crawler-book-info) | [Docker Hub](https://hub.docker.com/r/chusiang/crawler-book-info/) | [MicroBadger](https://microbadger.com/images/chusiang/crawler-book-info "Get your own image badge on microbadger.com") | [License](LICENSE)
A sample crawler for quickly parsing book information.
## Initialization
1. Install [pyenv][pyenv] and [pyenv-virtualenv][py-venv].
1. Create a virtualenv named `py3`.
``` shell
[ jonny@xenial ~/vcs/crawler-book-outline ]
$ pyenv virtualenv 3.9.6 py3
```
1. Use the `py3` virtualenv in this directory.
``` shell
[ jonny@xenial ~/vcs/crawler-book-outline ]
$ pyenv local py3
```
1. Install packages with pip.
``` shell
(py3) [ jonny@xenial ~/vcs/crawler-book-outline ]
$ pip3 install -r requirements.txt
```
[pyenv]: https://github.com/pyenv/pyenv
[py-venv]: https://github.com/pyenv/pyenv-virtualenv
## Usage
### tenlong.com.tw
1. Run the crawler with an **ISBN-13**.
``` shell
(py3) [ jonny@xenial ~/vcs/crawler-book-outline ]
$ python3 tenlong.py 9781491915325
```
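Because `tenlong.py` is driven by an ISBN-13, a mistyped digit simply looks up the wrong (or no) book. The snippet below is a minimal sketch of the standard ISBN-13 check-digit test that could be run before calling the crawler. `is_valid_isbn13` is a hypothetical helper, not something taken from this repository, and whether `tenlong.py` validates its argument itself is not stated here.

```python
def is_valid_isbn13(isbn: str) -> bool:
    """Return True if `isbn` has 13 digits and a correct check digit.

    Hypothetical helper for illustration; not part of the repository.
    """
    # Allow the common hyphenated or spaced spelling.
    digits = isbn.replace("-", "").replace(" ", "")
    if len(digits) != 13 or not (digits.isascii() and digits.isdigit()):
        return False
    # Weights alternate 1, 3, 1, 3, ... across the first 12 digits.
    total = sum(int(d) * (3 if i % 2 else 1) for i, d in enumerate(digits[:12]))
    # The check digit brings the weighted sum up to a multiple of 10.
    check = (10 - total % 10) % 10
    return check == int(digits[12])


if __name__ == "__main__":
    print(is_valid_isbn13("9781491915325"))
```

The ISBN used in the example above passes this check; changing any single digit makes it fail.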
### books.com.tw
1. Run the crawler with a **URL**.
``` shell
(py3) [ jonny@xenial ~/vcs/crawler-book-outline ]
$ python3 books.py https://www.books.com.tw/products/0010810939
```
1. Run the crawler with a **product number**.
``` shell
(py3) [ jonny@xenial ~/vcs/crawler-book-outline ]
$ python3 books.py 0010810939
```
> The **ISBN-13** argument is not yet supported for books.com.tw.
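Since `books.py` accepts either a full URL or a bare product number, one way to handle both is to normalize the argument to a single product URL first. This is a sketch under assumptions, not the code in `books.py`: `to_product_url` and `PRODUCT_URL` are hypothetical names, the URL layout is taken from the example command above, and the rule that a product number is purely alphanumeric is a guess.

```python
import re
from urllib.parse import urlparse

# Assumed URL layout, copied from the example command in this README.
PRODUCT_URL = "https://www.books.com.tw/products/{}"


def to_product_url(arg: str) -> str:
    """Accept a full product URL or a bare product number; return the URL.

    Hypothetical helper for illustration; not part of the repository.
    """
    if arg.startswith(("http://", "https://")):
        # Use the last path segment as the product number; the query
        # string is not part of the parsed path, so it is dropped.
        path = urlparse(arg).path.rstrip("/")
        product_no = path.rsplit("/", 1)[-1]
    else:
        product_no = arg
    # Assumption: product numbers are alphanumeric. This check cannot
    # tell a product number apart from an ISBN-13.
    if not re.fullmatch(r"[0-9A-Za-z]+", product_no):
        raise ValueError(f"not a product number or product URL: {arg!r}")
    return PRODUCT_URL.format(product_no)
```

With this shape, both invocations shown above resolve to the same URL, so the rest of a crawler would only need to deal with one input form.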
### View Result
1. Open the HTML file in Firefox on GNU/Linux.
``` shell
(py3) [ jonny@xenial ~/vcs/crawler-book-outline ]
$ firefox index.html
```

1. We can now see the result, and it is clean.
### Run local Nginx for Evernote Web Clipper
The **Evernote Web Clipper** does not support local files, so we serve the page through Nginx and clip it from there.
1. Run Nginx container.
``` shell
docker run --name nginx -v "$(pwd)":/usr/share/nginx/html/ -p 80:80 -d nginx
```
1. Open the page in Firefox on GNU/Linux.
``` shell
(py3) [ jonny@xenial ~/vcs/crawler-book-outline ]
$ firefox http://localhost
```
1. Finally, clip the information to Evernote with the [Evernote Web Clipper](https://evernote.com/intl/zh-tw/webclipper/).
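The steps above only need the generated page to be reachable over HTTP rather than as a local file. If Docker is not available, Python's standard library `http.server` is one possible substitute for the Nginx container. This is a swapped-in technique and not what the repository documents: `make_server` is a hypothetical helper, port 8000 is an arbitrary choice, and it has not been confirmed here that the Web Clipper behaves the same on a non-default port.

```python
import functools
import http.server


def make_server(directory: str = ".", port: int = 8000) -> http.server.ThreadingHTTPServer:
    """Build an HTTP server that serves `directory` on localhost only.

    Hypothetical helper for illustration; not part of the repository.
    """
    # Bind the handler to the chosen directory instead of the process cwd.
    handler = functools.partial(
        http.server.SimpleHTTPRequestHandler, directory=directory
    )
    return http.server.ThreadingHTTPServer(("127.0.0.1", port), handler)


# Usage (blocks until interrupted with Ctrl+C):
#   server = make_server()
#   server.serve_forever()
```

After starting it in the directory that contains `index.html`, the page would be opened at `http://localhost:8000` instead of `http://localhost`.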
## License
Copyright (c) 2017-2024 chusiang. Released under the MIT license.