Ecosyste.ms: Awesome
An open API service indexing awesome lists of open source software.
https://github.com/sonhm3029/crawl-data-bot
This project builds a base bot for crawling data from the web, including both text and image data.
- Host: GitHub
- URL: https://github.com/sonhm3029/crawl-data-bot
- Owner: sonhm3029
- Created: 2023-11-29T15:42:57.000Z (12 months ago)
- Default Branch: master
- Last Pushed: 2024-06-23T11:13:36.000Z (5 months ago)
- Last Synced: 2024-06-23T12:28:12.474Z (5 months ago)
- Topics: crawler, google, medical, vietnamese
- Language: HTML
- Homepage:
- Size: 596 KB
- Stars: 1
- Watchers: 1
- Forks: 1
- Open Issues: 0
Metadata Files:
- Readme: README.md
README
# Crawl-data-bot
Reference:
[How to scrape high-resolution images from Google Images using bs4 in Python](https://stackoverflow.com/questions/67249348/how-to-scrape-high-resolution-images-from-google-images-using-bs4-in-python)
## Vinmec URL list
- https://www.vinmec.com/vi/tin-tuc/thong-tin-suc-khoe/san-phu-khoa-va-ho-tro-sinh-san/
- https://www.vinmec.com/vi/tin-tuc/thong-tin-suc-khoe/nhi/
- https://www.vinmec.com/vi/tin-tuc/thong-tin-suc-khoe/suc-khoe-tong-quat/
- https://www.vinmec.com/vi/tin-tuc/thong-tin-suc-khoe/te-bao-goc-cong-nghe-gen/
- https://www.vinmec.com/vi/tin-tuc/thong-tin-suc-khoe/dich-2019-ncov/
- https://www.vinmec.com/tin-tuc/thong-tin-suc-khoe/dinh-duong/
- https://www.vinmec.com/tin-tuc/thong-tin-suc-khoe/song-khoe/
- https://www.vinmec.com/tin-tuc/thong-tin-suc-khoe/lam-dep/
- https://www.vinmec.com/tin-tuc/thong-tin-suc-khoe/thong-tin-duoc/
{
  "url": "https://www.vinmec.com/tin-tuc/thong-tin-suc-khoe/song-khoe/",
  "num_pages": 169,
  "category": "Sống khỏe"
},
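
The entry above pairs a category listing URL with its page count and a display name. As a minimal sketch of how a crawler might turn such an entry into a list of pages to visit, the snippet below expands one entry into per-page URLs. The README does not document how Vinmec paginates its listings, so the `?page=N` query parameter, the `CATEGORIES` list wrapper, and the `page_urls` helper are all assumptions made for illustration, not the repository's actual code.

```python
from itertools import islice
from urllib.parse import urlencode, urlsplit, urlunsplit

# Entry copied from the README. Wrapping it in a list is an assumption
# about how the full configuration might be organised.
CATEGORIES = [
    {
        "url": "https://www.vinmec.com/tin-tuc/thong-tin-suc-khoe/song-khoe/",
        "num_pages": 169,
        "category": "Sống khỏe",
    },
]


def page_urls(entry, page_param="page"):
    """Yield (category, url) pairs for pages 1..num_pages of one entry.

    ASSUMPTION: pages are selected with a `?page=N` query parameter.
    The real site may use a different scheme; adjust `page_param` or
    the URL construction accordingly.
    """
    parts = urlsplit(entry["url"])
    for page in range(1, entry["num_pages"] + 1):
        query = urlencode({page_param: page})
        yield entry["category"], urlunsplit(parts._replace(query=query))


if __name__ == "__main__":
    # Print only the first three pages to keep the output short.
    for category, url in islice(page_urls(CATEGORIES[0]), 3):
        print(category, url)
```

Keeping `num_pages` in the configuration lets the crawler enumerate every page up front without probing the site for the last page, at the cost of the count going stale as new articles are published.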