{"id":20260113,"url":"https://github.com/influxdata/hll-check","last_synced_at":"2026-05-08T21:32:56.988Z","repository":{"id":74459576,"uuid":"95769485","full_name":"influxdata/hll-check","owner":"influxdata","description":"A small tool for comparing HLL/HLL++ implementations","archived":false,"fork":false,"pushed_at":"2017-06-29T11:24:18.000Z","size":3,"stargazers_count":0,"open_issues_count":0,"forks_count":1,"subscribers_count":4,"default_branch":"master","last_synced_at":"2025-12-02T04:51:59.231Z","etag":null,"topics":["hyperloglog","influxdb","testing"],"latest_commit_sha":null,"homepage":null,"language":"Go","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":null,"status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/influxdata.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":null,"code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2017-06-29T11:08:23.000Z","updated_at":"2017-06-29T11:09:56.000Z","dependencies_parsed_at":null,"dependency_job_id":"6e5607e5-00db-4a5b-b5f7-a9c55b831a8f","html_url":"https://github.com/influxdata/hll-check","commit_stats":null,"previous_names":[],"tags_count":0,"template":false,"template_full_name":null,"purl":"pkg:github/influxdata/hll-check","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/influxdata%2Fhll-check","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/influxdata%2Fhll-check/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/influxdata%2Fhll-check/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/influxdata%2Fhll-check/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/in
fluxdata","download_url":"https://codeload.github.com/influxdata/hll-check/tar.gz/refs/heads/master","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/influxdata%2Fhll-check/sbom","scorecard":null,"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":286080680,"owners_count":32798374,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2026-05-08T08:22:46.396Z","status":"ssl_error","status_checked_at":"2026-05-08T08:22:45.650Z","response_time":54,"last_error":"SSL_connect returned=1 errno=0 peeraddr=140.82.121.6:443 state=error: unexpected eof while reading","robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":false,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["hyperloglog","influxdb","testing"],"created_at":"2024-11-14T11:18:01.460Z","updated_at":"2026-05-08T21:32:56.981Z","avatar_url":"https://github.com/influxdata.png","language":"Go","funding_links":[],"categories":[],"sub_categories":[],"readme":"### HLL Check\n\nThis is a small program that can be used to compare two HLL / HLL++ implementations to each other across data sets of varying cardinality and duplication.\n\nAt the moment it just generates different data sets and compares the accuracy of one or two implementations.\nIn the future I plan to provide benchmarks, which can be run on competing implementations, and more analysis of how errors change over time (as a dataset grows).\n\n#### Using the package\n\nTo use the package you need to write a small `main` program, which provides `hllcheck` with one or two factory 
functions for initialising new HLL/HLL++ implementations.\n\nAn HLL/HLL++ implementation must satisfy the following interface:\n\n```go\ntype HLL interface {\n\tAdd(v []byte)\n\tCount() uint64\n}\n```\n\nA simple main program could look like:\n\n```go\npackage main\n\nimport (\n\t\"os\"\n\t\"time\"\n\n\t\"github.com/other/repo/hll2\"\n\n\t\"github.com/influxdata/hll-check\"\n\t\"github.com/influxdata/influxdb/pkg/estimator/hll\"\n)\n\nfunc main() {\n\thllcheck.Seed = time.Now().Unix()\n\t// Existing implementation with precision 16.\n\th1f := func() hllcheck.HLL { return hll.MustNewPlus(16) }\n\t// Proposed alternative implementation with precision 16.\n\th2f := func() hllcheck.HLL { return hll2.New(16) }\n\n\t_ = hllcheck.Run(hllcheck.ToHLLFatory(h1f), hllcheck.ToHLLFatory(h2f), os.Stdout)\n}\n```\n\nIn this case the results will be printed to `stdout`. Instead of discarding the returned value, you could inspect it or analyse the results further yourself.\n\n\nThe current results for the version of HLL++ in InfluxDB `1.3` are:\n\n```\nSize\t\t\tActual Cardinality\tEstimation\t\tError (%)\t\tDuplication 
(%)\n500\t\t\t500\t\t\t500\t\t\t0.0000%\t\t\t0.00%\n500\t\t\t359\t\t\t360\t\t\t0.2778%\t\t\t28.20%\n500\t\t\t112\t\t\t113\t\t\t0.8850%\t\t\t77.60%\n1000\t\t\t1000\t\t\t1000\t\t\t0.0000%\t\t\t0.00%\n1000\t\t\t756\t\t\t756\t\t\t0.0000%\t\t\t24.40%\n1000\t\t\t199\t\t\t200\t\t\t0.5000%\t\t\t80.10%\n5000\t\t\t5000\t\t\t5000\t\t\t0.0000%\t\t\t0.00%\n5000\t\t\t3743\t\t\t3743\t\t\t0.0000%\t\t\t25.14%\n5000\t\t\t1005\t\t\t1005\t\t\t0.0000%\t\t\t79.90%\n10000\t\t\t10000\t\t\t10000\t\t\t0.0000%\t\t\t0.00%\n10000\t\t\t7467\t\t\t7467\t\t\t0.0000%\t\t\t25.33%\n10000\t\t\t1976\t\t\t1977\t\t\t0.0506%\t\t\t80.24%\n100000\t\t\t100000\t\t\t100123\t\t\t0.1228%\t\t\t0.00%\n100000\t\t\t74895\t\t\t74973\t\t\t0.1040%\t\t\t25.11%\n100000\t\t\t19895\t\t\t19894\t\t\t-0.0050%\t\t80.11%\n250000\t\t\t250000\t\t\t249427\t\t\t-0.2297%\t\t0.00%\n250000\t\t\t187589\t\t\t187566\t\t\t-0.0123%\t\t24.96%\n250000\t\t\t50072\t\t\t49783\t\t\t-0.5805%\t\t79.97%\n500000\t\t\t500000\t\t\t499968\t\t\t-0.0064%\t\t0.00%\n500000\t\t\t374736\t\t\t375053\t\t\t0.0845%\t\t\t25.05%\n500000\t\t\t100410\t\t\t100534\t\t\t0.1233%\t\t\t79.92%\n1000000\t\t\t1000000\t\t\t1002466\t\t\t0.2460%\t\t\t0.00%\n1000000\t\t\t749999\t\t\t749239\t\t\t-0.1014%\t\t25.00%\n1000000\t\t\t200620\t\t\t200144\t\t\t-0.2378%\t\t79.94%\n5000000\t\t\t5000000\t\t\t5009290\t\t\t0.1855%\t\t\t0.00%\n5000000\t\t\t3749466\t\t\t3760525\t\t\t0.2941%\t\t\t25.01%\n5000000\t\t\t1000467\t\t\t1003146\t\t\t0.2671%\t\t\t79.99%\n25000000\t\t25000000\t\t24994384\t\t-0.0225%\t\t0.00%\n25000000\t\t18749778\t\t18721102\t\t-0.1532%\t\t25.00%\n25000000\t\t5001941\t\t\t5011561\t\t\t0.1920%\t\t\t79.99%\n100000000\t\t100000000\t\t99684277\t\t-0.3167%\t\t0.00%\n100000000\t\t75001559\t\t75072214\t\t0.0941%\t\t\t25.00%\n500000000\t\t500000000\t\t500475904\t\t0.0951%\t\t\t0.00%\n500000000\t\t374990807\t\t374916524\t\t-0.0198%\t\t25.00%\n\n\n\nMean Error\t\tMedian Error\t\tError Variance\t\tMax 
Error\n0.0540%\t\t\t0.0000%\t\t\t0.0574\t\t\t0.8850%\n```\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Finfluxdata%2Fhll-check","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Finfluxdata%2Fhll-check","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Finfluxdata%2Fhll-check/lists"}