https://github.com/zircote/bloom
a drunken investigation into feasibility of pure php bloom filters
- Host: GitHub
- URL: https://github.com/zircote/bloom
- Owner: zircote
- Created: 2012-06-30T15:59:58.000Z (almost 14 years ago)
- Default Branch: master
- Last Pushed: 2013-12-17T01:38:19.000Z (over 12 years ago)
- Last Synced: 2025-03-23T19:45:13.210Z (over 1 year ago)
- Language: PHP
- Size: 414 KB
- Stars: 15
- Watchers: 4
- Forks: 3
- Open Issues: 0
Metadata Files:
- Readme: README.md
- License: LICENSE-2.0.txt
README
Bloom
=====
[Build Status](http://travis-ci.org/zircote/Bloom)
This project has morphed from an attempt at 'pure' PHP Bloom filters into a tool
that uses `Redis` for storage and `Rediska` as the client interface for a distributed
`Bloom Filter`.
By using Redis' `SETBIT` and `GETBIT` commands we are able to create a
simple, powerful and lightweight Bloom filter. A Redis string used as a bitset can
grow to _512MB_ (2^32 bits), so the range of bit indexes available to us is huge.
PHP_INT_MAX serves well as a line in the sand in this early stage of
testing and research. This potentially results in a 3% false positive rate for
10b values given a sufficient number of hashing buckets. (I am performing more tests on
these numbers before I get overly confident in these findings.)
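To make the `SETBIT`/`GETBIT` idea concrete, here is a minimal sketch. It is not this project's code. It assumes an in-memory PHP string standing in for the Redis key, seeded `crc32` calls standing in for the Murmur/HashMix classes, and a 64-bit PHP 7.4+ runtime. The names `BitsetBloomSketch` and `estimateFalsePositiveRate` are hypothetical. The estimator uses the standard approximation p = (1 - e^(-kn/m))^k for m bits, n items and k hash functions.

```php
<?php
// Hypothetical illustration only: not the Bloom library's actual classes.
final class BitsetBloomSketch
{
    private string $bits = ''; // in-memory stand-in for a Redis string key
    private int $size;         // number of addressable bits (m)
    private int $hashes;       // number of hash functions (k)

    public function __construct(int $size, int $hashes = 3)
    {
        $this->size = $size;
        $this->hashes = $hashes;
    }

    // Mimics "SETBIT key offset 1": the string grows with zero bytes as needed.
    private function setBit(int $offset): void
    {
        $byte = intdiv($offset, 8);
        if (strlen($this->bits) <= $byte) {
            $this->bits = str_pad($this->bits, $byte + 1, "\0");
        }
        $mask = 0x80 >> ($offset % 8); // Redis treats bit 0 as the most significant bit
        $this->bits[$byte] = chr(ord($this->bits[$byte]) | $mask);
    }

    // Mimics "GETBIT key offset": bits beyond the current length read as 0.
    private function getBit(int $offset): int
    {
        $byte = intdiv($offset, 8);
        if (strlen($this->bits) <= $byte) {
            return 0;
        }
        return (ord($this->bits[$byte]) >> (7 - $offset % 8)) & 1;
    }

    // Derive k bit offsets by hashing the item with k different seed prefixes.
    private function offsets(string $item): array
    {
        $offsets = [];
        for ($i = 0; $i < $this->hashes; $i++) {
            $offsets[] = crc32($i . ':' . $item) % $this->size;
        }
        return $offsets;
    }

    public function add(string $item): void
    {
        foreach ($this->offsets($item) as $offset) {
            $this->setBit($offset);
        }
    }

    public function contains(string $item): bool
    {
        foreach ($this->offsets($item) as $offset) {
            if ($this->getBit($offset) === 0) {
                return false; // at least one bit unset: the item was never added
            }
        }
        return true; // all bits set: probably added (false positives are possible)
    }
}

// Standard approximation of the false positive rate: (1 - e^(-k*n/m))^k.
function estimateFalsePositiveRate(float $m, float $n, int $k): float
{
    return (1.0 - exp(-$k * $n / $m)) ** $k;
}

$sketch = new BitsetBloomSketch(1 << 20, 3);
$sketch->add('some random text');
var_dump($sketch->contains('some random text')); // bool(true)

// Estimated rate for 2^32 bits, 27607 items and 3 hashes.
printf("%.12f\n", estimateFalsePositiveRate(2 ** 32, 27607, 3));
```

Plugging your own m, n and k into the estimator is a quick way to sanity-check a target false positive rate before committing to a bitset size.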
Read times in benchmarks: 0.0011499519820647/s with three-key hashing, using a
word list of 27607 unique strings.
Add times in benchmark tests: 0.0036964981654479/s with three-key hashing, using
a word list of 27607 unique strings.
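The README does not say how these timings were taken, so the following is only a sketch of one way to measure a comparable figure. It assumes the numbers are average wall-clock seconds per operation. The helper `averageSecondsPerCall` is hypothetical, and the closure is a stand-in for a real `add` or `contains` call against the filter.

```php
<?php
// Hypothetical helper: average wall-clock seconds per call of $operation over $items.
function averageSecondsPerCall(callable $operation, array $items): float
{
    if ($items === []) {
        return 0.0; // nothing to measure
    }
    $start = hrtime(true); // monotonic clock, in nanoseconds
    foreach ($items as $item) {
        $operation($item);
    }
    return (hrtime(true) - $start) / 1e9 / count($items);
}

$words = ['alpha', 'beta', 'gamma']; // stand-in for a real word list
$seen = [];

// Stand-in for $filter->add($word); swap in the real call to benchmark the filter.
$addAverage = averageSecondsPerCall(
    function (string $word) use (&$seen): void {
        $seen[$word] = true;
    },
    $words
);

printf("add: %.10f s per item\n", $addAverage);
```

Using `hrtime()` rather than `microtime()` avoids distortion from system clock adjustments during the run. With Redis in the loop, network round trips will usually dominate the result.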
```php
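<?php
/* $filter is assumed to be an already constructed Bloom filter instance;
   its construction is not shown here. */
/* Inject the Rediska client */
$filter->setRediska(new Rediska());
/* Inject the Hash Class */
if (extension_loaded('murmurhash')) {
    $filter->setHash(new Murmur());
} else {
    $filter->setHash(new HashMix());
}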
/* Add items to the filter */
$filter->add('some random text');
$filter->add(array('or add','an array of elements'));
/* Check the filter */
var_dump($filter->contains('some random text'));
// bool(true)
var_dump($filter->contains('or add'));
// bool(true)
var_dump($filter->contains('This is not in the filter'));
// bool(false)
var_dump($filter->contains(array('or add','an array of elements')));
// bool(true)
var_dump($filter->contains(array('NO','or add','an array of elements')));
// bool(false)
```
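Judging by the outputs above, `contains()` with an array appears to return `true` only when every element is reported present. A minimal sketch of that "all elements" reduction follows. It assumes nothing about the library's internals: `containsAll` is a hypothetical helper, and a plain array lookup stands in for the real filter.

```php
<?php
// Hypothetical helper: $containsOne is any callable that tests a single string.
function containsAll(callable $containsOne, array $items): bool
{
    foreach ($items as $item) {
        if (!$containsOne($item)) {
            return false; // one missing element fails the whole check
        }
    }
    return true;
}

// Stand-in membership test backed by a plain array instead of a real filter.
$known = ['or add' => true, 'an array of elements' => true];
$containsOne = function (string $item) use ($known): bool {
    return isset($known[$item]);
};

var_dump(containsAll($containsOne, ['or add', 'an array of elements']));
// bool(true)
var_dump(containsAll($containsOne, ['NO', 'or add', 'an array of elements']));
// bool(false)
```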