https://github.com/tomashubelbauer/puppeteer-bazos-cz-scraper
Bazos.cz scraper built using Puppeteer used for obtaining search results as JSON.
bazos bazos-cz node nodejs puppeteer scraper
- Host: GitHub
- URL: https://github.com/tomashubelbauer/puppeteer-bazos-cz-scraper
- Owner: TomasHubelbauer
- License: mit
- Created: 2017-12-27T12:37:14.000Z (about 7 years ago)
- Default Branch: main
- Last Pushed: 2022-04-28T08:51:55.000Z (over 2 years ago)
- Last Synced: 2024-05-02T03:55:53.443Z (8 months ago)
- Topics: bazos, bazos-cz, node, nodejs, puppeteer, scraper
- Language: JavaScript
- Homepage:
- Size: 95.9 MB
- Stars: 7
- Watchers: 3
- Forks: 0
- Open Issues: 0
Metadata Files:
- Readme: README.md
- License: LICENSE
README
# Puppeteer Bazos.cz Scraper
Bazos.cz scraper built using Puppeteer used for obtaining search results as JSON.
## Running
- `npm run demo` to run a demo query *iPhone in Prague 1 between 3000 and 6000 CZK*
![](screenshot.gif)
- `npm start -- search {query} {zip} -f {priceFrom} -t {priceTo}` for custom query (headless)
- `npm start -- search {query} {zip} -f {priceFrom} -t {priceTo} -w` for custom query (non-headless)
- `npm start -- -h` for the program help
- `npm start -- search -h` for the `search` command help

You can add `--record` to have the script produce the `screenshot.gif` animation.
## To-Do
### Open detail for each post and fetch full description from it.
### Implement diff to report updates/inserts in a separate file.
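One possible shape for such a diff, as a minimal sketch only: it assumes each scraped post carries a stable identifier in an `id` field, which is an assumption about the result JSON and not something this README confirms. The function name `diffResults` and the `before`/`after` report shape are likewise hypothetical.

```javascript
// Hypothetical helper: compares two arrays of scraped posts.
// Assumes every post has a stable identifier under `key` (default `id`);
// the real result shape of this scraper may differ.
function diffResults(previous, current, key = 'id') {
  // Index the previous run by identifier for constant-time lookups.
  const previousByKey = new Map(previous.map((post) => [post[key], post]));
  const inserts = [];
  const updates = [];
  for (const post of current) {
    const old = previousByKey.get(post[key]);
    if (old === undefined) {
      // Not seen in the previous run: a new post.
      inserts.push(post);
    } else if (JSON.stringify(old) !== JSON.stringify(post)) {
      // Seen before but serialized differently: a changed post.
      // Note: this comparison is sensitive to property order.
      updates.push({ before: old, after: post });
    }
  }
  return { inserts, updates };
}
```

The returned object could then be serialized with `JSON.stringify` and written to its own file next to the full results.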
### Run this in GitHub Actions using a scheduled trigger
Push the generated report back to the repo and set up GitHub
Pages for the repository to show it on a live URL.
### Consider having the pipeline send out an email with a diff
### Consider going directly to the search URL instead of filling in the form
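A minimal sketch of how such a URL could be assembled, under stated assumptions: the base URL and the query parameter names are passed in by the caller because the actual Bazos.cz search parameters are not documented here, and the function name `buildSearchUrl` is hypothetical.

```javascript
// Hypothetical sketch: builds a search URL from named criteria.
// `paramNames` maps each criterion to a query parameter name; the real
// Bazos.cz parameter names are NOT known here and must be supplied.
function buildSearchUrl(baseUrl, paramNames, criteria) {
  const url = new URL(baseUrl);
  for (const [field, value] of Object.entries(criteria)) {
    const name = paramNames[field];
    // Skip criteria that have no mapped parameter or no value.
    if (name === undefined || value === undefined || value === null) continue;
    url.searchParams.set(name, String(value));
  }
  return url.toString();
}
```

The resulting string could be handed to Puppeteer's `page.goto` in place of the form-filling steps.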
The search URL structure is likely to be more stable than the form DOM as
Bazos might consider people who bookmark search results, but has no reason
to care about the form DOM being stable for 3rd parties.
### Use my library node-puppeteer-apng for the screencast
Find a way to capture the frames at a lower resolution to keep the size down.