https://github.com/phstudy/zetasketch-bigquery-example
An example that demonstrates how to use ZetaSketch with BigQuery
bigquery hll java zetasketch
- Host: GitHub
- URL: https://github.com/phstudy/zetasketch-bigquery-example
- Owner: phstudy
- Created: 2020-01-16T07:20:29.000Z (over 6 years ago)
- Default Branch: master
- Last Pushed: 2020-01-20T09:01:09.000Z (over 6 years ago)
- Last Synced: 2025-01-21T02:41:37.276Z (over 1 year ago)
- Topics: bigquery, hll, java, zetasketch
- Language: Java
- Homepage:
- Size: 53.7 KB
- Stars: 0
- Watchers: 3
- Forks: 0
- Open Issues: 0
Metadata Files:
- Readme: README.md
README
Example ZetaSketch with BigQuery
=================================
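An example that demonstrates how to use [ZetaSketch](https://github.com/google/zetasketch) with [BigQuery](https://cloud.google.com/bigquery/docs/reference/standard-sql/hll_functions). It shows how to build a HyperLogLog++ (HLL) sketch in Java, how to read a sketch produced by BigQuery's `HLL_COUNT.INIT` into a ZetaSketch object, and how to send a Java-built sketch to BigQuery for cardinality extraction.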
# How to run
```bash
$ git clone https://github.com/phstudy/zetasketch-bigquery-example.git
$ cd zetasketch-bigquery-example
$ GOOGLE_APPLICATION_CREDENTIALS=/path/to/my/key.json ./gradlew run
```
# Sample code
### Generate ZetaSketch HLL
```java
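import com.google.zetasketch.HyperLogLogPlusPlus;

// Build an HLL++ sketch over string values with the default precision.
HyperLogLogPlusPlus<String> hll = new HyperLogLogPlusPlus.Builder().buildForStrings();
hll.add("apple");
hll.add("orange");
hll.add("banana");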
```
### Generate HLL in BigQuery and deserialize as ZetaSketch HLL object
```java
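// queryBigQuery(...) is a helper that is not shown in this snippet.
// Build the sketch on the BigQuery side with HLL_COUNT.INIT.
String sql =
    "SELECT"
        + " HLL_COUNT.INIT(fruit) AS fruit_hll"
        + " FROM UNNEST(['apple', 'orange', 'banana']) AS fruit";
TableResult result = queryBigQuery(sql);
FieldValueList row = result.getValues().iterator().next();
// The BYTES column holds the serialized sketch.
byte[] hllBytes = row.get("fruit_hll").getBytesValue();
// forProto returns HyperLogLogPlusPlus<?>, so no cast is needed.
HyperLogLogPlusPlus<?> rst = HyperLogLogPlusPlus.forProto(hllBytes);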
```
### Serialize ZetaSketch HLL object and calculate cardinality in BigQuery
```java
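// Serialize the Java-built sketch and pass it to BigQuery as a Base64 literal.
String base64Str = Base64.getEncoder().encodeToString(hll.serializeToByteArray());
// FROM_BASE64 turns the literal back into BYTES; HLL_COUNT.EXTRACT returns the estimate.
String sql = "SELECT HLL_COUNT.EXTRACT(FROM_BASE64('" + base64Str + "')) AS fruit_cnt";
TableResult result = queryBigQuery(sql);
FieldValueList row = result.getValues().iterator().next();
// Estimated number of distinct values in the sketch.
long cnt = row.get("fruit_cnt").getLongValue();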
```
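
The query-building step above can be factored into small helpers. The following is a minimal sketch using only the Java standard library, not code from this repository: the class `SketchSql`, its methods, and the `merged_cnt` alias are hypothetical names chosen for illustration. It also assumes that several serialized sketches should be combined with BigQuery's `HLL_COUNT.MERGE`, which the samples above do not use.

```java
import java.util.Base64;
import java.util.List;
import java.util.stream.Collectors;

// Hypothetical helper: builds BigQuery SQL strings from serialized HLL sketches.
final class SketchSql {
    // Wraps serialized sketch bytes in a BigQuery BYTES expression.
    public static String bytesLiteral(byte[] sketch) {
        return "FROM_BASE64('" + Base64.getEncoder().encodeToString(sketch) + "')";
    }

    // Same query shape as the sample above: estimate the cardinality of one sketch.
    public static String extractSql(byte[] sketch) {
        return "SELECT HLL_COUNT.EXTRACT(" + bytesLiteral(sketch) + ") AS fruit_cnt";
    }

    // Assumption: merge several sketches server-side and return the combined estimate.
    public static String mergeSql(List<byte[]> sketches) {
        String items = sketches.stream()
                .map(SketchSql::bytesLiteral)
                .collect(Collectors.joining(", "));
        return "SELECT HLL_COUNT.MERGE(sketch) AS merged_cnt FROM UNNEST([" + items + "]) AS sketch";
    }

    public static void main(String[] args) {
        // Placeholder bytes only; a real call would pass hll.serializeToByteArray().
        byte[] placeholder = {1, 2, 3};
        System.out.println(extractSql(placeholder));
        System.out.println(mergeSql(List.of(placeholder, new byte[] {4, 5, 6})));
    }
}
```

Because the standard Base64 alphabet contains no single quotes, the encoded sketch can be embedded in a single-quoted SQL literal without escaping. For values that come from untrusted input, query parameters are the safer choice.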