https://github.com/ct-clmsn/chiuw2017

Sketching algorithms implemented in Chapel and Python
Host: GitHub
URL: https://github.com/ct-clmsn/chiuw2017
Owner: ct-clmsn
Created: 2017-06-08T01:56:34.000Z (over 8 years ago)
Default Branch: master
Last Pushed: 2017-06-08T02:20:39.000Z (over 8 years ago)
Last Synced: 2025-04-01T06:17:23.717Z (7 months ago)
Topics: chapel, sketching-algorithms
Language: Chapel
Homepage:
Size: 22.5 KB
Stars: 10
Watchers: 2
Forks: 1
Open Issues: 3
Metadata Files:
- Readme: README.md

README

CHIUW2017

# chpl-sketching
Sketching algorithms implemented in Chapel

[Sketch Origins](https://datasketches.github.io/docs/SketchOrigins.html):

Sketching is a relatively recent development in the theoretical field of
Stochastic Streaming Algorithms, which deals with algorithms that can extract
information from a stream of data in a single pass (sometimes called
“one-touch” processing) using various randomization techniques.

## HyperLogLog

[HyperLogLog on Wikipedia](https://en.wikipedia.org/wiki/HyperLogLog):

HyperLogLog is an algorithm for the count-distinct problem, approximating the
number of distinct elements in a multiset.
...
The HyperLogLog algorithm can estimate cardinalities well beyond 10^9 with a
relative accuracy (standard error) of 2% while using only 1.5 kB of memory.
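
The memory and accuracy figures quoted above can be checked with a little arithmetic. The sketch below is illustrative and is not taken from this repository: it assumes 2^11 registers of 6 bits each and uses the asymptotic standard-error estimate 1.04 / sqrt(m) from the HyperLogLog analysis. The function name `hll_budget` and its parameters are hypothetical.

```python
import math

def hll_budget(p, register_bits=6):
    """Return (bytes of register storage, approximate relative standard error).

    p             -- number of hash bits used to pick a register (assumed value)
    register_bits -- bits stored per register (assumed to be 6)
    """
    m = 1 << p                              # number of registers, m = 2^p
    size_bytes = m * register_bits // 8     # packed register storage
    std_error = 1.04 / math.sqrt(m)         # asymptotic estimate, not an exact bound
    return size_bytes, std_error

size_bytes, std_error = hll_budget(11)
print(size_bytes, round(std_error * 100, 1))  # prints: 1536 2.3
```

Under these assumptions the registers occupy 1536 bytes (1.5 kB) and the expected error is about 2.3%, which is in line with the rounded figures in the quotation.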

[count-distinct problem on Wikipedia](https://en.wikipedia.org/wiki/Count-distinct_problem):

The count-distinct problem (also known in applied mathematics as the cardinality
estimation problem) is the problem of finding the number of distinct elements
in a data stream with repeated elements.
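
For comparison, an exact answer to the count-distinct problem can be sketched as follows. This is a baseline for illustration only (the function name `count_distinct_exact` is hypothetical and not part of this repository); it has to remember every distinct element, so its memory use grows with the cardinality, which is the cost that sketches such as HyperLogLog avoid.

```python
def count_distinct_exact(stream):
    """Count distinct elements in a single pass by remembering each one."""
    seen = set()
    for item in stream:
        seen.add(item)      # duplicates are absorbed by the set
    return len(seen)

print(count_distinct_exact(["a", "b", "a", "c", "b", "a"]))  # prints: 3
```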

[Cardinality Estimation for Big Data](http://druid.io/blog/2012/05/04/fast-cheap-and-98-right-cardinality-estimation-for-big-data.html):

HyperLogLog takes advantage of the randomized distribution of bits from hashing
functions in order to estimate how many things you would’ve needed to see in
order to experience a specific phenomenon.
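
The idea in the quotation above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the implementation in this repository: the class name `HyperLogLog`, the choice of SHA-1 truncated to 64 bits as the hash, and the default of 2^11 registers are all assumptions made for the example. The "phenomenon" being observed is the longest run of leading zero bits seen in each register's share of the hashed values.

```python
import hashlib
import math

class HyperLogLog:
    """Minimal HyperLogLog sketch (illustrative, no large-range correction)."""

    def __init__(self, p=11):
        self.p = p                      # hash bits used to choose a register
        self.m = 1 << p                 # number of registers
        self.registers = [0] * self.m
        # Bias-correction constant; this closed form applies for m >= 128.
        self.alpha = 0.7213 / (1 + 1.079 / self.m)

    def add(self, item):
        digest = hashlib.sha1(str(item).encode("utf-8")).digest()
        x = int.from_bytes(digest[:8], "big")       # 64-bit hash value
        idx = x >> (64 - self.p)                    # first p bits pick the register
        rest = x & ((1 << (64 - self.p)) - 1)       # remaining 64 - p bits
        # Rank: 1-based position of the leftmost 1-bit in `rest`
        # (64 - p + 1 when `rest` is all zeros).
        rank = (64 - self.p) - rest.bit_length() + 1
        if rank > self.registers[idx]:
            self.registers[idx] = rank

    def estimate(self):
        # Harmonic mean of 2^register values, scaled by alpha * m^2.
        raw = self.alpha * self.m * self.m / sum(2.0 ** -r for r in self.registers)
        zeros = self.registers.count(0)
        if raw <= 2.5 * self.m and zeros:
            # Small-range correction: fall back to linear counting.
            return self.m * math.log(self.m / zeros)
        return raw

hll = HyperLogLog()
for i in range(50000):
    hll.add(i)
    hll.add(i)          # repeated elements do not change the registers
print(round(hll.estimate()))
```

With 2^11 registers the estimate should typically land within a few percent of the true count of 50000, while the sketch itself stores only 2048 small integers regardless of how many elements are added.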