https://github.com/phadej/tdigest
On-line accumulation of rank-based statistics such as quantiles and trimmed means
- Host: GitHub
- URL: https://github.com/phadej/tdigest
- Owner: phadej
- Created: 2016-11-01T23:32:28.000Z (about 8 years ago)
- Default Branch: master
- Last Pushed: 2023-11-21T21:03:08.000Z (about 1 year ago)
- Last Synced: 2024-03-26T01:23:03.070Z (10 months ago)
- Language: Haskell
- Size: 375 KB
- Stars: 30
- Watchers: 3
- Forks: 7
- Open Issues: 2
Metadata Files:
- Readme: README.md
- Contributing: CONTRIBUTING.md
# tdigest
A new data structure for accurate on-line accumulation of rank-based statistics such as quantiles and trimmed means.
See the original paper: ["Computing extremely accurate quantiles using t-digest"](https://github.com/tdunning/t-digest/blob/master/docs/t-digest-paper/histo.pdf) by Ted Dunning and Otmar Ertl.
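
To make "rank-based statistics" concrete, here is a minimal sketch of the exact quantities that a t-digest approximates, computed the straightforward way by sorting the whole input. This is not how this library works internally; the names `exactQuantile` and `trimmedMean` are hypothetical helpers written for illustration, and the interpolation rule is one common convention among several. The sketch uses only `base`.

```haskell
import Data.List (sort)

-- | Exact quantile by sorting (hypothetical helper, not part of tdigest).
-- q is clamped to [0,1]; the result interpolates linearly between the two
-- neighbouring order statistics. Returns Nothing for empty input.
exactQuantile :: Double -> [Double] -> Maybe Double
exactQuantile _ [] = Nothing
exactQuantile q xs = Just (lo + frac * (hi - lo))
  where
    sorted = sort xs
    n      = length sorted
    pos    = max 0 (min 1 q) * fromIntegral (n - 1)  -- fractional rank
    i      = floor pos :: Int
    frac   = pos - fromIntegral i
    lo     = sorted !! i
    hi     = sorted !! min (n - 1) (i + 1)

-- | Trimmed mean (hypothetical helper): drop a fraction p of the samples
-- from each tail, then average what is left. p is clamped to [0,0.5].
trimmedMean :: Double -> [Double] -> Maybe Double
trimmedMean p xs
  | null kept = Nothing
  | otherwise = Just (sum kept / fromIntegral (length kept))
  where
    sorted = sort xs
    n      = length sorted
    k      = floor (max 0 (min 0.5 p) * fromIntegral n) :: Int
    kept   = take (n - 2 * k) (drop k sorted)

main :: IO ()
main = do
  print (exactQuantile 0.5 [1 .. 1000])      -- Just 500.5
  print (trimmedMean 0.25 [1, 2, 3, 100])    -- Just 2.5
```

Both helpers need the full data set in memory and sorted. An on-line structure such as the t-digest instead keeps a bounded summary and trades a small approximation error for that saving.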
## Synopsis
```hs
λ *Data.TDigest > median (tdigest [1..1000] :: TDigest 3)
Just 499.0090729817737
```

## Benchmarks
Using 50M exponentially distributed numbers:
- average: **16s**; not a valid approximation of the median, included mostly to measure PRNG speed
- sorting using `vector-algorithms`: **33s**; using 1000MB of memory
- sparking t-digest (using some `par`): **53s**
- buffered t-digest: **68s**
- sequential t-digest: **65s**

## Example histogram
```
tdigest-simple -m tdigest -d standard -s 100000 -c 10 -o output.svg -i 34
cp output.svg example.svg
inkscape --export-png=example.png --export-dpi=80 --export-background-opacity=0 --without-gui example.svg
```

![Example](https://raw.githubusercontent.com/futurice/haskell-tdigest/master/tdigest/example.png)