https://github.com/jshyunbin/comment_crawler
Web crawler for online shopping mall comments using python selectolax and requests.
- Host: GitHub
- URL: https://github.com/jshyunbin/comment_crawler
- Owner: jshyunbin
- Created: 2023-01-03T01:32:15.000Z (over 3 years ago)
- Default Branch: main
- Last Pushed: 2023-01-16T03:12:25.000Z (over 3 years ago)
- Last Synced: 2025-04-22T12:58:50.671Z (about 1 year ago)
- Topics: python, webcrawler
- Language: Python
- Size: 18.6 KB
- Stars: 2
- Watchers: 1
- Forks: 0
- Open Issues: 0
Metadata Files:
- Readme: README.md
# Comment Crawler
This web crawler collects comments from Korean online shopping malls.
```zsh
python src/main.py -url "http://item.gmarket.co.kr/Item?goodscode=2405613985&ver=638083484789237196"
```
## Supported Websites
- [Gmarket (G마켓)](http://www.gmarket.co.kr)
- [11st (11번가)](https://www.11st.co.kr) (work in progress)
- [Naver Shopping](https://shopping.naver.com/home)
- [Coupang](https://www.coupang.com) (work in progress)
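
A crawler that supports several sites has to decide which site a given URL belongs to before it can parse the page. The sketch below shows one way this could be done with the standard library. The `SUPPORTED_SITES` mapping and the `detect_site` function are hypothetical names used for illustration and are not taken from the repository.

```python
from typing import Optional
from urllib.parse import urlparse

# Hypothetical mapping from a host suffix to a site label (assumption, not repository code).
SUPPORTED_SITES = {
    "gmarket.co.kr": "gmarket",
    "11st.co.kr": "11st",
    "shopping.naver.com": "naver_shopping",
    "coupang.com": "coupang",
}


def detect_site(url: str) -> Optional[str]:
    """Return the site label for a URL, or None if the host is not in the mapping."""
    host = (urlparse(url).hostname or "").lower()
    for suffix, name in SUPPORTED_SITES.items():
        # Match the bare domain or any subdomain such as item.gmarket.co.kr.
        if host == suffix or host.endswith("." + suffix):
            return name
    return None


if __name__ == "__main__":
    print(detect_site("http://item.gmarket.co.kr/Item?goodscode=2405613985"))
```

Matching on the host suffix rather than the full host lets product-page subdomains resolve to the same site label.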
## How to use
Comment Crawler is configured with the following command-line flags:
| Flag          | Type    | Description                                                      | Default  |
|---------------|---------|------------------------------------------------------------------|----------|
| url           | string  | URL of the online shopping page to crawl.                        | None     |
| max_collect   | int     | Maximum number of comments to collect; -1 means no limit.        | -1       |
| collect_empty | boolean | Whether to collect empty comments.                               | True     |
| browser       | enum    | Browser used by Selenium; either Safari or Firefox.              | 'Safari' |
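
To make the flag types and defaults concrete, here is a minimal sketch that models the table with the standard-library `argparse` module. How `src/main.py` actually parses its flags is not shown in this README, so `build_parser` and `str_to_bool` are hypothetical helpers; only the flag names, types, and defaults come from the table.

```python
import argparse


def str_to_bool(value: str) -> bool:
    """Parse a textual boolean such as 'true' or 'False' (hypothetical helper)."""
    lowered = value.strip().lower()
    if lowered in ("true", "1", "yes"):
        return True
    if lowered in ("false", "0", "no"):
        return False
    raise argparse.ArgumentTypeError(f"expected a boolean, got {value!r}")


def build_parser() -> argparse.ArgumentParser:
    """Build a parser mirroring the flag table; an assumption, not the repository's code."""
    parser = argparse.ArgumentParser(description="Sketch of the Comment Crawler flags")
    parser.add_argument("-url", "--url", default=None,
                        help="URL of the online shopping page to crawl")
    parser.add_argument("-max_collect", "--max_collect", type=int, default=-1,
                        help="maximum number of comments to collect; -1 means no limit")
    parser.add_argument("-collect_empty", "--collect_empty", type=str_to_bool, default=True,
                        help="whether to collect empty comments")
    parser.add_argument("-browser", "--browser", choices=["Safari", "Firefox"],
                        default="Safari", help="browser used by Selenium")
    return parser


if __name__ == "__main__":
    args = build_parser().parse_args()
    print(args)
```

Restricting `browser` with `choices` rejects unsupported values at parse time, which matches the enum type listed in the table.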