Ecosyste.ms: Awesome
An open API service indexing awesome lists of open source software.
https://github.com/jenting/compare-drugstore-price
Compare price between cosmeceutical shops
cosmed crawler golang poya side-project watsons
Last synced: about 2 months ago
- Host: GitHub
- URL: https://github.com/jenting/compare-drugstore-price
- Owner: jenting
- License: mit
- Created: 2018-04-20T12:13:15.000Z (almost 7 years ago)
- Default Branch: master
- Last Pushed: 2018-09-12T02:31:43.000Z (over 6 years ago)
- Last Synced: 2024-10-16T13:41:58.152Z (3 months ago)
- Topics: cosmed, crawler, golang, poya, side-project, watsons
- Language: Go
- Size: 1.11 MB
- Stars: 0
- Watchers: 1
- Forks: 0
- Open Issues: 0
Metadata Files:
- Readme: README.md
- License: LICENSE
README
# Compare Drugstore Prices
Compare prices between drugstores (`Watsons` and `Poya`).
[![Build Status](https://travis-ci.com/hsiaoairplane/compare-drugstore-price.svg?branch=master)](https://travis-ci.com/hsiaoairplane/compare-drugstore-price)
## Go version
Use Go 1.11 or later.
## Setup
First, download the project:
```sh
go get github.com/hsiaoairplane/compare-drugstore-price
```

Then run the project:
```sh
./run.sh
```

## Crawling with in-memory cache (with timeout mechanism)
The crawling steps:

1) Look up the query name in the in-memory cache, which stores the query name, product name, product price, shop name, and update time.
2) If the query name exists in the in-memory cache, go to 7).
3) If the query name does not exist in the in-memory cache, go to 4).
4) Send an HTTP GET request with the query name as a parameter to each drugstore's URL, using goroutines.
5) Parse every product's name and price from the HTML content of the HTTP GET responses.
6) Save the query name, product name, product price, and shop name to the in-memory cache.
7) Return the product name, product price, and shop name to the client.

## Architecture
The components and the flow between them:

* Clients (`Client 1`, `Client 2`, ..., `Client n`) send HTTP GET requests to the `APIServer`.
* The `APIServer` inserts a job into the job queues for each request.
* Workers (`W1`, `W2`, ..., `Wn`) in the workers pool get jobs from the job queues.
* A worker looks up the query in the in-memory cache. If the query is found, the cache returns the found result to the worker.
* If the query is not found, the crawler requests `Watsons` and `Poya` and waits synchronously for both. It then writes the result to the cache and returns the crawler result to the worker.
* The workers pool notifies the `APIServer` through a Go channel, and the `APIServer` answers the client.
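The job queue, workers pool, and channel notification in this architecture could look something like the following sketch. The names `Job`, `StartWorkers`, and `Submit` are hypothetical, and the `handle` callback stands in for the cache lookup and crawler; the project's real types may differ.

```go
package main

import (
	"fmt"
	"sync"
)

// Job is a hypothetical unit of work: one client query plus the channel
// on which a worker notifies the API server of the result.
type Job struct {
	Query string
	Reply chan<- string
}

// StartWorkers launches n workers that take jobs from the queue, run
// handle on each query, and send the result on the job's Reply channel.
// The returned WaitGroup is done once the queue is closed and drained.
func StartWorkers(n int, jobs <-chan Job, handle func(query string) string) *sync.WaitGroup {
	var wg sync.WaitGroup
	for i := 0; i < n; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			for job := range jobs {
				job.Reply <- handle(job.Query)
			}
		}()
	}
	return &wg
}

// Submit plays the API server's role: it inserts a job into the queue
// and blocks until a worker replies.
func Submit(jobs chan<- Job, query string) string {
	reply := make(chan string, 1) // buffered so the worker never blocks on send
	jobs <- Job{Query: query, Reply: reply}
	return <-reply
}

func main() {
	jobs := make(chan Job, 16)
	wg := StartWorkers(4, jobs, func(q string) string { return "result for " + q })
	fmt.Println(Submit(jobs, "shampoo")) // prints "result for shampoo"
	close(jobs)
	wg.Wait()
}
```

A per-job reply channel keeps each HTTP handler waiting only on its own result, so slow queries do not hold up unrelated clients.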
## RESTful APIs
* CRUD

| Method | URL | Description |
|--------|-----|-------------|
| GET | | Query a product's price across all drugstores, without sorting. |
| GET | | Query a product's price across all drugstores, sorted in ascending order. |
| GET | | Query a product's price across all drugstores, sorted in descending order. |

* HTTP response: a JSON array whose elements have the following format

| Field | Type(Length) | Description |
|-------|--------------|-------------|
| shop | String(16) | Shop name |
| name | String(128) | Product name |
| price | Integer | Product price |

## TODO
* [ ] Support crawling cosmed HTML content
* [ ] Support [prometheus](https://prometheus.io) metrics API
* [ ] Analyze in-memory cache hit rate and also analyze the timeout threshold for in-memory cache