https://github.com/ct-clmsn/chiuw2017
sketching algorithms implemented in chapel and python
- Host: GitHub
- URL: https://github.com/ct-clmsn/chiuw2017
- Owner: ct-clmsn
- Created: 2017-06-08T01:56:34.000Z (over 8 years ago)
- Default Branch: master
- Last Pushed: 2017-06-08T02:20:39.000Z (over 8 years ago)
- Last Synced: 2025-04-01T06:17:23.717Z (7 months ago)
- Topics: chapel, sketching-algorithms
- Language: Chapel
- Homepage:
- Size: 22.5 KB
- Stars: 10
- Watchers: 2
- Forks: 1
- Open Issues: 3
Metadata Files:
- Readme: README.md
README
CHIUW2017
# chpl-sketching
Sketching algorithms implemented in Chapel

[Sketch Origins](https://datasketches.github.io/docs/SketchOrigins.html):
Sketching is a relatively recent development in the theoretical field of
Stochastic Streaming Algorithms, which deals with algorithms that can extract
information from a stream of data in a single pass (sometimes called
“one-touch” processing) using various randomization techniques.

## HyperLogLog
[HyperLogLog on Wikipedia](https://en.wikipedia.org/wiki/HyperLogLog):
HyperLogLog is an algorithm for the count-distinct problem, approximating the
number of distinct elements in a multiset
...
The HyperLogLog algorithm can estimate cardinalities well beyond 10^9 with a
relative accuracy (standard error) of 2% while only using 1.5 kB of memory.

[count-distinct problem on Wikipedia](https://en.wikipedia.org/wiki/Count-distinct_problem):
count-distinct problem (also known in applied mathematics as the cardinality
estimation problem) is the problem of finding the number of distinct elements
in a data stream with repeated elements.

[Cardinality Estimation for Big Data](http://druid.io/blog/2012/05/04/fast-cheap-and-98-right-cardinality-estimation-for-big-data.html):
HyperLogLog takes advantage of the randomized distribution of bits from hashing
functions in order to estimate how many things you would’ve needed to see in
order to experience a specific phenomenon.
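
The idea quoted above can be sketched in a few lines of Python. This is a minimal illustration of the textbook HyperLogLog estimator, not the code in this repository. The class name `HyperLogLog`, the parameter `p`, and the use of truncated SHA-1 as a 64-bit hash are all assumptions made for the example.

```python
import hashlib
import math


class HyperLogLog:
    """Minimal HyperLogLog sketch (illustrative; not this repository's code)."""

    def __init__(self, p=11):
        # p index bits give m = 2**p registers; p=11 means 2048 registers.
        self.p = p
        self.m = 1 << p
        self.registers = [0] * self.m
        # Bias-correction constant, valid for m >= 128.
        self.alpha = 0.7213 / (1.0 + 1.079 / self.m)

    @staticmethod
    def _hash64(item):
        # Assumption: the first 8 bytes of SHA-1 stand in for a 64-bit hash.
        digest = hashlib.sha1(str(item).encode("utf-8")).digest()
        return int.from_bytes(digest[:8], "big")

    def add(self, item):
        x = self._hash64(item)
        idx = x >> (64 - self.p)               # top p bits pick a register
        rest = x & ((1 << (64 - self.p)) - 1)  # remaining 64 - p bits
        # Rank: 1-based position of the leftmost 1-bit in `rest`.
        rank = (64 - self.p) - rest.bit_length() + 1
        # Each register remembers the largest rank it has seen.
        self.registers[idx] = max(self.registers[idx], rank)

    def count(self):
        # Raw estimate: normalized harmonic mean of 2**register.
        z = sum(2.0 ** -r for r in self.registers)
        estimate = self.alpha * self.m * self.m / z
        zeros = self.registers.count(0)
        # Small-range correction: fall back to linear counting.
        if estimate <= 2.5 * self.m and zeros > 0:
            estimate = self.m * math.log(self.m / zeros)
        return estimate


if __name__ == "__main__":
    hll = HyperLogLog()
    for i in range(100000):
        hll.add(i % 20000)  # 20000 distinct values, each seen five times
    print(round(hll.count()))
```

With `p=11` the sketch holds 2048 registers. Packed at 6 bits each, that would be about 1.5 kB, although a plain Python list uses far more. The expected standard error is roughly 1.04 / sqrt(m), about 2.3% here, so the printed estimate should land near 20000 rather than exactly on it.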