Ecosyste.ms: Awesome
An open API service indexing awesome lists of open source software.
https://github.com/naem1023/comic-crawler
Comic crawler.
https://github.com/naem1023/comic-crawler
beautifulsoup crawler python3
Last synced: 24 days ago
JSON representation
Comic crawler.
- Host: GitHub
- URL: https://github.com/naem1023/comic-crawler
- Owner: naem1023
- License: mit
- Created: 2020-01-01T08:43:03.000Z (almost 5 years ago)
- Default Branch: master
- Last Pushed: 2022-11-22T10:14:05.000Z (almost 2 years ago)
- Last Synced: 2023-02-28T15:12:54.310Z (over 1 year ago)
- Topics: beautifulsoup, crawler, python3
- Language: Python
- Homepage:
- Size: 18.5 MB
- Stars: 0
- Watchers: 1
- Forks: 0
- Open Issues: 1
-
Metadata Files:
- Readme: README.md
- License: LICENSE
Awesome Lists containing this project
README
# Comic Cralwer
## Windows-10-build
- Copy & paste 'website.txt', 'setting.conf', 'header.json' to directory of 'windows-10'.
- If you want samples of files, contact to me.
- If chromedriver doesn't work, update chrome and download chromedriver matched with your chrome version.
- Run 'main.exe'.## Other OS build
This program can run in any machine that can install all requirments of this program. So, run with python or build executable file with pyinstaller.
---
## Run with python
### Requirements
- python3
- chrome web driver from https://chromedriver.chromium.org/downloads
- modules in requirements.txt
- headers.json, setting.conf, website.txt
- If you want samples of files, contact to me.
```
pip3 install -r requirements.txt
```
---
### website.txt```
1. how many
2. website address
3. next button, prev button(just write 'next', 'prev')
4. comic name
5. comic site
```### Example of webisite.txt
```
1
www.example.com/1
next
comic_name
haha
```
---
### headers.json```
{
"identifier of site ": {
"User-Agent": "Names of User-Agent"
},
}
```### Example of header.json
```
{
"m": {
"User-Agent": "Mozilla/5.0"
},
}
```
---
### setting.conf```
comic_site=site_names
windows_save_path=path
darwin_save_path=path
linux_save_path=path
thumbnail=address_of_thumbnail
unnecessary_string=address_of_unnecessary_string
```### Example of setting.conf
```
comic_site=haha,hoho
windows_save_path=C:\\a\\b
darwin_save_path=/home/user/a/b/
linux_save_path=/home/user/a/b/
thumbnail=https://example_thumbnail
unnecessary_string=https://example_unnecessary
```### Run
```
python main.py
```