https://github.com/emaadmanzoor/streaming-unique-counting
Approximately counting unique items in a stream. Algorithms course project at KAUST.
Last synced: about 1 month ago
- Host: GitHub
- URL: https://github.com/emaadmanzoor/streaming-unique-counting
- Owner: emaadmanzoor
- Created: 2013-11-22T12:39:45.000Z (about 11 years ago)
- Default Branch: master
- Last Pushed: 2015-03-03T05:53:27.000Z (almost 10 years ago)
- Last Synced: 2024-04-16T01:44:40.501Z (8 months ago)
- Language: Python
- Size: 1.9 MB
- Stars: 0
- Watchers: 3
- Forks: 0
- Open Issues: 0
Metadata Files:
- Readme: README.md
README
# Streaming Unique Counting
*Algorithms, KAUST, Fall 2013 - with Prof. Mikhail Moshkov.*

Implementations of HyperLogLog and adaptive sampling for approximate counting of unique items in a stream. A [report](www.eyeshalfclosed.com/docs/CS260_Final_Report.pdf) describing the empirical comparison results is also available.
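
For readers unfamiliar with adaptive sampling, the following is a minimal sketch of the general idea, not the code in this repository. It keeps a bounded set of hashes and halves the sampling rate whenever the set overflows. The names `hash64`, `adaptive_sampling_estimate`, and `capacity` are hypothetical, and `hashlib.blake2b` is used as a standard-library stand-in for the MurmurHash3 (`mmh3`) dependency so the example runs without extra packages.

```python
import hashlib


def hash64(item: str) -> int:
    """Map an item to a 64-bit integer (stdlib stand-in for MurmurHash3)."""
    digest = hashlib.blake2b(item.encode("utf-8"), digest_size=8).digest()
    return int.from_bytes(digest, "big")


def adaptive_sampling_estimate(stream, capacity):
    """Estimate the distinct count while storing at most `capacity` hashes.

    Hypothetical helper for illustration; assumes a well-mixed hash function.
    """
    depth = 0       # keep only hashes whose lowest `depth` bits are all zero
    sample = set()  # distinct hashes that pass the current depth filter
    for item in stream:
        h = hash64(item)
        if h & ((1 << depth) - 1) == 0:
            sample.add(h)
            # On overflow, halve the sampling rate until the sample fits again.
            while len(sample) > capacity:
                depth += 1
                mask = (1 << depth) - 1
                sample = {x for x in sample if x & mask == 0}
    # Each surviving hash stands in for roughly 2**depth distinct items.
    return len(sample) * (1 << depth)


if __name__ == "__main__":
    words = "the cat sat on the mat with the other cat".split()
    print(adaptive_sampling_estimate(words, capacity=4))
```

When the stream has fewer distinct items than `capacity`, the depth never increases and the result is the exact count (barring hash collisions); otherwise memory stays bounded and the estimate becomes approximate.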
## Quickstart
```
pip install mmh3
git clone git@github.com:emaadmanzoor/streaming-unique-counting.git
cd streaming-unique-counting

python cardinality_estimation.py "test_data/wuthering_heights.txt" 0 exact
python cardinality_estimation.py "test_data/wuthering_heights.txt" 0 adaptive 1150

python cardinality_estimation.py "test_data/big_test/" 1 exact
python cardinality_estimation.py "test_data/big_test/" 1 adaptive 5000
```

## Contributors
* [Emaad Ahmed Manzoor](http://eyeshalfclosed.com)
* Tariq Alturkestani
* Jumana Baghabra
* Fatemah Alzayer
* Meshari Alazmi