{"id":19935656,"url":"https://github.com/gameanalytics/hyper","last_synced_at":"2025-08-25T22:31:48.621Z","repository":{"id":8513340,"uuid":"10125873","full_name":"GameAnalytics/hyper","owner":"GameAnalytics","description":"Erlang implementation of HyperLogLog","archived":true,"fork":false,"pushed_at":"2024-06-26T16:08:34.000Z","size":349,"stargazers_count":95,"open_issues_count":5,"forks_count":26,"subscribers_count":26,"default_branch":"master","last_synced_at":"2025-02-20T05:41:57.888Z","etag":null,"topics":[],"latest_commit_sha":null,"homepage":"","language":"Erlang","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/GameAnalytics.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2013-05-17T15:06:41.000Z","updated_at":"2024-12-17T10:12:27.000Z","dependencies_parsed_at":"2024-12-16T10:39:05.748Z","dependency_job_id":null,"html_url":"https://github.com/GameAnalytics/hyper","commit_stats":null,"previous_names":[],"tags_count":0,"template":false,"template_full_name":null,"purl":"pkg:github/GameAnalytics/hyper","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/GameAnalytics%2Fhyper","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/GameAnalytics%2Fhyper/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/GameAnalytics%2Fhyper/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/GameAnalytics%2Fhyper/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/GameAnalytics","download_url":"https://codeload.github.
com/GameAnalytics/hyper/tar.gz/refs/heads/master","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/GameAnalytics%2Fhyper/sbom","scorecard":null,"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":272143666,"owners_count":24881136,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","status":"online","status_checked_at":"2025-08-25T02:00:12.092Z","response_time":1107,"last_error":null,"robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":true,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":[],"created_at":"2024-11-12T23:21:13.843Z","updated_at":"2025-08-25T22:31:48.233Z","avatar_url":"https://github.com/GameAnalytics.png","language":"Erlang","readme":"# HyperLogLog for Erlang\n\nThis is an implementation of the HyperLogLog algorithm in\nErlang. Using HyperLogLog you can estimate the cardinality of very\nlarge data sets using constant memory. The relative error is `1.04 /\nsqrt(2^P)`. When creating a new HyperLogLog filter, you provide the\nprecision P, allowing you to trade memory for accuracy. The union of\ntwo filters is lossless.\n\nIn practice this allows you to build efficient analytics systems. For\nexample, you can create a new filter in each mapper and feed it a\nportion of your dataset while the reducers simply union together all\nfilters they receive. 
The filter you end up with is exactly the same\nfilter as if you had sequentially inserted all data into a single\nfilter.\n\nIn addition to the base algorithm, we have implemented the bias\ncorrection from HLL++ as described in the excellent\n[paper by Google][]. Bias correction greatly improves the estimates\nfor lower cardinalities.\n\n\n## Usage\n\n```erlang\n1\u003e hyper:insert(\u003c\u003c\"foobar\"\u003e\u003e, hyper:insert(\u003c\u003c\"quux\"\u003e\u003e, hyper:new(4))).\n{hyper,4,\n       {hyper_binary,{dense,\u003c\u003c0,0,0,0,0,0,0,0,64,0,0,0\u003e\u003e,\n                            [{8,1}],\n                            1,16}}}\n\n2\u003e hyper:card(v(-1)).\n2.136502281992361\n```\n\nThe errors introduced by estimations can be seen in this example:\n```erlang\n3\u003e random:seed(1,2,3).\nundefined\n4\u003e Run = fun (P, Card) -\u003e hyper:card(lists:foldl(fun (_, H) -\u003e Int = random:uniform(10000000000000), hyper:insert(\u003c\u003cInt:64/integer\u003e\u003e, H) end, hyper:new(P), lists:seq(1, Card))) end.\n#Fun\u003cerl_eval.12.80484245\u003e\n5\u003e Run(12, 10000).\n9992.846462080579\n6\u003e Run(14, 10000).\n10055.568563614219\n7\u003e Run(16, 10000).\n10007.654167606248\n```\n\nA filter can be persisted and read later. The serialized struct is formatted for usage with jiffy:\n```erlang\n8\u003e Filter = hyper:insert(\u003c\u003c\"foo\"\u003e\u003e, hyper:new(4)).\n{hyper,4,\n       {hyper_binary,{dense,\u003c\u003c4,0,0,0,0,0,0,0,0,0,0,0\u003e\u003e,[],0,16}}}\n9\u003e Filter =:= hyper:from_json(hyper:to_json(Filter)).\ntrue\n```\n\nYou can select a different backend. See below for a description of why\nyou might want to do so. 
They serialize in exactly the same way, but\ncan't be mixed in memory.\n\n```erlang\n1\u003e Gb = hyper:insert(\u003c\u003c\"foo\"\u003e\u003e, hyper:new(4, hyper_gb)).\n{hyper,4,{hyper_gb,{{1,{0,1,nil,nil}},16}}}\n2\u003e B = hyper:insert(\u003c\u003c\"foo\"\u003e\u003e, hyper:new(4, hyper_binary)).\n{hyper,4,\n       {hyper_binary,{dense,\u003c\u003c4,0,0,0,0,0,0,0,0,0,0,0\u003e\u003e,[],0,16}}}\n3\u003e hyper:to_json(Gb) =:= hyper:to_json(B).\ntrue\n4\u003e hyper:union(Gb, B).\n** exception error: no case clause matching [{4,hyper_binary},{4,hyper_gb}]\n     in function  hyper:union/1 (src/hyper.erl, line 65)\n```\n\n\n## Is it any good?\n\nYes. At Game Analytics we use it extensively.\n\n## Backends\n\nEffort has been spent on implementing different backends in the\npursuit of finding the right performance trade-off. The estimate will\nalways be the same, regardless of backend. A simple performance\ncomparison can be seen by running `make perf_report`; see below for\nthe results from an i7-3770 at 3.4 GHz. Fill rate refers to the fraction of\nregisters that have a value other than 0.\n\n * `hyper_binary`: Fixed memory usage (6 bits * 2^P), fastest on insert,\n   union, cardinality and serialization. Best default choice.\n\n * `hyper_bisect`: Lower memory usage at lower fill rates (3 bytes per\n   used entry), slightly slower than hyper_binary for\n   everything. Switches to a structure similar to hyper_binary when it\n   would save memory. Room for further optimization.\n\n * `hyper_gb`: Fast inserts, very fast unions and reasonable memory\n   usage at low fill rates. Unreasonable memory usage at high fill\n   rates.\n\n * `hyper_array`: Cardinality estimation is constant, but slower than\n   hyper_gb for low fill rates. Uses much more memory at lower fill\n   rates, but stays constant from 25% and upwards.\n\n * `hyper_binary_rle`: Dud\n\nYou can also implement your own backend. In `hyper_test` there's a\nbunch of tests that run for all backends, including some PropEr tests. 
The\ntest suite will ensure your backend gives correct estimates and\ncorrectly encodes/decodes the serialized filters.\n\n\n\n```\n$ make perf_report\n...\n\nmodule       P        card   fill      bytes  insert us   union ms    card ms    json ms\nhyper_gb     15          1   0.00         64     301.90       0.00       0.10       2.69\nhyper_gb     15        100   0.00       3984       1.34       0.05       0.05       6.13\nhyper_gb     15        500   0.02      19784       1.53       0.27       0.12       8.67\nhyper_gb     15       1000   0.03      39384       1.72       0.53       0.19       8.67\nhyper_gb     15       2500   0.07      96384       1.84       1.49       0.40      10.67\nhyper_gb     15       5000   0.14     185224       1.99       3.24       0.71      12.22\nhyper_gb     15      10000   0.26     344464       2.10       6.80       1.31      15.09\nhyper_gb     15      15000   0.37     481664       2.02      10.07       1.91      17.99\nhyper_gb     15      25000   0.53     698344       2.08      16.42       2.67      18.71\nhyper_gb     15      50000   0.78    1027504       1.96      31.49       4.29      18.87\nhyper_gb     15     100000   0.95    1248144       1.78      49.90       5.20      17.92\nhyper_gb     15    1000000   1.00    1310744       1.10     108.38       5.72      18.96\nhyper_array  15          1   0.00        520       4.10       0.00       4.74       3.83\nhyper_array  15        100   0.00      19536       1.56       0.10       4.71       3.99\nhyper_array  15        500   0.02      69328       1.44       0.43       4.63       4.46\nhyper_array  15       1000   0.03     107760       1.51       0.79       4.68       4.29\nhyper_array  15       2500   0.07     188384       1.35       1.79       4.70       5.16\nhyper_array  15       5000   0.14     261520       1.31       3.27       4.72       5.33\nhyper_array  15      10000   0.26     308072       1.21       5.45       4.99       6.90\nhyper_array  15      15000   0.37     
320128       1.22       7.34       4.96       7.72\nhyper_array  15      25000   0.53     323384       1.21      11.07       5.18       8.42\nhyper_array  15      50000   0.78     323560       1.04      17.90       5.51       8.08\nhyper_array  15     100000   0.95     323560       1.00      26.93       5.60       7.70\nhyper_array  15    1000000   1.00     323560       0.72      51.65       5.77       7.77\nhyper_bisect 15          1   0.00          3       6.10       0.00       0.05       7.26\nhyper_bisect 15        100   0.00        297       1.62       0.08       0.12      17.42\nhyper_bisect 15        500   0.02       1482       1.72       0.42       0.59      21.30\nhyper_bisect 15       1000   0.03       2952       1.83       0.95       1.10      23.21\nhyper_bisect 15       2500   0.07       7227       1.97       2.63       2.53      25.91\nhyper_bisect 15       5000   0.14      13890       2.23       6.50       5.06      28.80\nhyper_bisect 15      10000   0.26      25833       2.74      18.21       9.18      31.61\nhyper_bisect 15      15000   0.37      32768       3.67      33.29       4.40       4.46\nhyper_bisect 15      25000   0.53      32768       3.11      78.77       4.60       5.17\nhyper_bisect 15      50000   0.78      32768       2.53     190.66       4.90       4.89\nhyper_bisect 15     100000   0.95      32768       2.05     381.11       5.07       4.43\nhyper_bisect 15    1000000   1.00      32768       0.85      27.17       5.65       4.59\nhyper_binary 15          1   0.00         88       3.40       0.00       6.06       2.23\nhyper_binary 15        100   0.00       4048       0.63       0.01       5.91       2.38\nhyper_binary 15        500   0.02      20048       0.50       0.01       5.67       2.58\nhyper_binary 15       1000   0.02      24576       2.13       0.00       2.72       1.33\nhyper_binary 15       2500   0.07      24576       1.54       1.97       2.72       1.95\nhyper_binary 15       5000   0.12      24576       1.23   
    2.40       2.71       2.77\nhyper_binary 15      10000   0.26      24576       1.10      11.16       2.95       4.46\nhyper_binary 15      15000   0.34      24576       1.11      12.30       2.75       5.48\nhyper_binary 15      25000   0.50      24576       0.95      11.97       2.79       5.83\nhyper_binary 15      50000   0.76      24576       0.92      13.55       2.81       5.65\nhyper_binary 15     100000   0.95      24576       0.79      11.74       2.59       5.16\nhyper_binary 15    1000000   1.00      24576       0.55      13.88       2.64       5.11\n```\n\n[paper by Google]: http://static.googleusercontent.com/external_content/untrusted_dlcp/research.google.com/en//pubs/archive/40671.pdf\n","funding_links":[],"categories":[],"sub_categories":[],"project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fgameanalytics%2Fhyper","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fgameanalytics%2Fhyper","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fgameanalytics%2Fhyper/lists"}