https://github.com/andrewdalpino/pybloomer
OkBloomer, a novel autoscaling bloom filter with ultra-low memory footprint, now in Python.
- Host: GitHub
- URL: https://github.com/andrewdalpino/pybloomer
- Owner: andrewdalpino
- License: mit
- Created: 2024-10-12T00:29:15.000Z (3 months ago)
- Default Branch: master
- Last Pushed: 2024-10-29T06:15:17.000Z (2 months ago)
- Last Synced: 2024-10-29T06:29:40.154Z (2 months ago)
- Language: Python
- Size: 24.4 KB
- Stars: 1
- Watchers: 1
- Forks: 0
- Open Issues: 0
Metadata Files:
- Readme: README.md
- License: LICENSE
# Ok Bloomer
A Python implementation of the OkBloomer algorithm, an autoscaling [Bloom filter](https://en.wikipedia.org/wiki/Bloom_filter) with an ultra-low memory footprint. Ok Bloomer employs a novel layered filtering strategy that allows it to expand while maintaining an upper bound on the false positive rate. As such, Ok Bloomer is suitable for streaming data where the size is not known a priori.

- **Ultra-low** memory footprint
- **Autoscaling** works on streaming data
- **Bounded** maximum false positive rate
- **Open-source** and free to use commercially

## Installation
Install Ok Bloomer using a Python [package manager](https://packaging.python.org/en/latest/tutorials/installing-packages/), for example pip:

```
pip install okbloomer
```

## Parameters
| # | Name | Default | Type | Description |
|---|---|---|---|---|
| 1 | max_false_positive_rate | 0.01 | float | The upper bound on the false positive rate. |
| 2 | num_hashes | 4 | int | The number of hash functions used, i.e. the number of slices per layer. |
| 3 | layer_size | 32000000 | int | The size of each layer of the filter in bits. Ideal sizes can be divided evenly by `num_hashes`. |

## Example Usage
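To get a feel for the constructor arguments before using them, the arithmetic behind the defaults can be sketched in plain Python. The helper names `layer_footprint` and `sliced_fill_estimate` are hypothetical and are not part of `okbloomer`. The second function assumes that the false positive rate of a layer is estimated as the product of the per-slice fill ratios, with each inserted item setting one previously unset bit per slice. That assumption is not confirmed by this README, although for two items and the default parameters it lands close to the value printed in the example output below.

```python
def layer_footprint(layer_size: int = 32_000_000, num_hashes: int = 4) -> dict:
    """Hypothetical helper: summarize the size of one filter layer."""
    return {
        # Each hash function addresses its own slice of the layer.
        "slice_bits": layer_size // num_hashes,
        # Bits rounded up to whole bytes.
        "layer_bytes": (layer_size + 7) // 8,
        # The README calls sizes that divide evenly by num_hashes "ideal".
        "divides_evenly": layer_size % num_hashes == 0,
    }


def sliced_fill_estimate(
    items: int, layer_size: int = 32_000_000, num_hashes: int = 4
) -> float:
    """Assumed estimate, not the library's documented formula."""
    slice_bits = layer_size // num_hashes
    # Assumption: every item sets one new bit in each slice (no collisions).
    fill_ratio = min(items / slice_bits, 1.0)
    # A false positive needs a hit in every slice.
    return fill_ratio ** num_hashes


print(layer_footprint())
print(sliced_fill_estimate(2))
```

With the defaults, one layer is 32,000,000 bits, or 4,000,000 bytes (about 4 MB), split into four slices of 8,000,000 bits each.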
```python
import okbloomer

# Create a filter with the default parameters spelled out explicitly.
filter = okbloomer.BloomFilter(
    max_false_positive_rate=0.01,
    num_hashes=4,
    layer_size=32000000,
)

filter.insert('foo')

print(filter.exists('foo'))
print(filter.existsOrInsert('bar'))
print(filter.exists('bar'))
print(filter.false_positive_rate())
```

```
True
False
True
3.906249999999999e-27
```

## References
- [1] A. DalPino. (2021). OkBloomer, a novel autoscaling Bloom Filter [[link](https://github.com/andrewdalpino/OkBloomer)].
- [2] K. Christensen, et al. A New Analysis of the False-Positive Rate of a Bloom Filter.