https://github.com/zaghaghi/pdstl
Probabilistic Data Structure Template Library
big-data bloom-filter cpp14 datastructures probabilistic-data-structures quotient-filter template-library
- Host: GitHub
- URL: https://github.com/zaghaghi/pdstl
- Owner: zaghaghi
- License: apache-2.0
- Created: 2020-05-24T20:13:01.000Z (about 6 years ago)
- Default Branch: master
- Last Pushed: 2020-06-18T20:32:32.000Z (almost 6 years ago)
- Last Synced: 2024-04-17T16:12:52.339Z (about 2 years ago)
- Topics: big-data, bloom-filter, cpp14, datastructures, probabilistic-data-structures, quotient-filter, template-library
- Language: C++
- Size: 168 KB
- Stars: 28
- Watchers: 3
- Forks: 4
- Open Issues: 0
Metadata Files:
- Readme: README.md
- License: LICENSE
README
# PDSTL
[Documentation status](https://pdstl.readthedocs.io/en/latest/?badge=latest)
[Build status](https://travis-ci.org/zaghaghi/pdstl)
PDSTL is a header-only C++ template library for probabilistic data structures.
# Build Example
To build the examples and the documentation, run the following commands.
```bash
# Setup project with meson
meson build
# Build with ninja
ninja -C build -j 4
# Build docs
cd docs
make html
```
# Documentation
Read the reference documentation at [pdstl.readthedocs.io](https://pdstl.readthedocs.io/).
# Implemented Data Structures
## Membership
| Data Structure | Insert | Delete |
|-------------------------|------------|-----------------|
| Bloom Filter | Supported | Not Supported |
| Counting Bloom Filter | Supported | Supported |
| Quotient Filter | Supported | Not Implemented |
| Quotient Hash Table | Supported | Not Implemented |
| Cuckoo Filter | Supported | Supported |
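
The Delete column in the table above reflects how each structure stores its state. A plain Bloom filter keeps single bits, so clearing a bit could also erase other keys. A counting Bloom filter keeps a small counter per cell and can decrement it. The following is a minimal sketch of that idea, not PDSTL's API: the class name `CountingBloomSketch`, its method names, the 8-bit counters, and the double-hashing scheme built on `std::hash` are all assumptions made for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <functional>
#include <limits>
#include <string>
#include <vector>

// Hypothetical illustration of a counting Bloom filter; NOT the PDSTL interface.
class CountingBloomSketch {
public:
    CountingBloomSketch(std::size_t num_counters, std::size_t num_hashes)
        : counters_(num_counters, 0), num_hashes_(num_hashes) {
        assert(num_counters > 0 && num_hashes > 0);
    }

    // Increment every counter selected by the key (saturating at the maximum).
    void insert(const std::string& key) {
        for (std::size_t i = 0; i < num_hashes_; ++i) {
            std::uint8_t& c = counters_[index(key, i)];
            if (c < kMax) ++c;
        }
    }

    // May report false positives, but never false negatives for inserted keys.
    bool contains(const std::string& key) const {
        for (std::size_t i = 0; i < num_hashes_; ++i) {
            if (counters_[index(key, i)] == 0) return false;
        }
        return true;
    }

    // Decrement the key's counters. Erasing a key that was never inserted
    // (a false positive) can corrupt the filter, so callers should only
    // erase keys they know were inserted.
    bool erase(const std::string& key) {
        if (!contains(key)) return false;
        for (std::size_t i = 0; i < num_hashes_; ++i) {
            std::uint8_t& c = counters_[index(key, i)];
            // Saturated counters stay saturated; zero counters are never decremented.
            if (c > 0 && c < kMax) --c;
        }
        return true;
    }

private:
    static constexpr std::uint8_t kMax = std::numeric_limits<std::uint8_t>::max();

    // Assumed double hashing: index_i = (h1 + i * h2) mod m.
    std::size_t index(const std::string& key, std::size_t i) const {
        const std::size_t h1 = std::hash<std::string>{}(key);
        const std::size_t h2 = std::hash<std::string>{}(key + '#') | 1;
        return (h1 + i * h2) % counters_.size();
    }

    std::vector<std::uint8_t> counters_;
    std::size_t num_hashes_;
};
```

Replacing the counters with single bits gives the plain Bloom filter behavior, where `erase` has no safe implementation.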
## Cardinality
| Data Structure | Insert | Delete |
|--------------------------|------------|-----------------|
| Linear Counting | Supported | Not Supported |
| Flajolet–Martin Counting | Supported | Not Supported |
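
As a rough illustration of how a cardinality estimator in the table above works, the sketch below follows the linear counting idea from Whang et al. (1990): hash each element to one bit of an `m`-bit map, then estimate the number of distinct elements as `-m * ln(V)`, where `V` is the fraction of bits still zero. The class name `LinearCounterSketch`, its methods, and the use of `std::hash` are assumptions for illustration and do not reflect PDSTL's interface.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <functional>
#include <limits>
#include <string>
#include <vector>

// Hypothetical illustration of linear counting; NOT the PDSTL interface.
class LinearCounterSketch {
public:
    explicit LinearCounterSketch(std::size_t num_bits) : bits_(num_bits, false) {
        assert(num_bits > 0);
    }

    // Set the single bit selected by the key's hash. Duplicates hit the same bit.
    void insert(const std::string& key) {
        bits_[std::hash<std::string>{}(key) % bits_.size()] = true;
    }

    // Estimate the number of distinct keys as -m * ln(zeros / m).
    double estimate() const {
        std::size_t zeros = 0;
        for (bool bit : bits_) {
            if (!bit) ++zeros;
        }
        if (zeros == 0) {
            // The bitmap is full, so the estimator is undefined; report infinity.
            return std::numeric_limits<double>::infinity();
        }
        const double m = static_cast<double>(bits_.size());
        return -m * std::log(static_cast<double>(zeros) / m);
    }

private:
    std::vector<bool> bits_;
};
```

Because a set bit cannot be attributed to a single element, there is no way to undo an insertion, which is consistent with Delete being listed as not supported.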
# References
* [Probabilistic Data Structures and Algorithms for Big Data Applications](https://pdsa.gakhov.com/) by Andrii Gakhov, 2019, ISBN: 978-3748190486 (paperback) ASIN: B07MYKTY8W (e-book)
* Fan, L., et al. (2000) “Summary cache: a scalable wide-area web cache sharing protocol”, Journal IEEE/ACM Transactions on Networking, Vol. 8 (3), pp. 281–293.
* Bender, M., et al. (2012) “Don’t Thrash: How to Cache your Hash on Flash”, Proceedings of the VLDB Endowment, Vol. 5 (11), pp. 1627–1637.
* Fan, B., et al. (2014) “Cuckoo Filter: Practically Better Than Bloom”, Proceedings of the 10th ACM International on Conference on emerging Networking Experiments and Technologies, Sydney, Australia — December 02–05, 2014, pp. 75–88, ACM New York, NY.
* Whang, K.-Y., Vander-Zanden, B.T., Taylor, H.M. (1990) “A Linear-Time Probabilistic Counting Algorithm for Database Applications”, Journal ACM Transactions on Database Systems, Vol. 15 (2), pp. 208–229.
* Flajolet, P., Martin, G.N. (1985) “Probabilistic Counting Algorithms for Data Base Applications”, Journal of Computer and System Sciences, Vol. 31 (2), pp. 182–209.