{"id":19055232,"url":"https://github.com/sangupta/bloomfilter","last_synced_at":"2025-04-24T04:21:25.397Z","repository":{"id":15360680,"uuid":"18091636","full_name":"sangupta/bloomfilter","owner":"sangupta","description":"Bloom filters for Java","archived":false,"fork":false,"pushed_at":"2023-12-10T10:47:26.000Z","size":78,"stargazers_count":62,"open_issues_count":3,"forks_count":16,"subscribers_count":3,"default_branch":"master","last_synced_at":"2025-04-18T12:18:39.229Z","etag":null,"topics":["bloom-filter","bloom-filters","java","murmur3"],"latest_commit_sha":null,"homepage":"http://sangupta.com/projects/bloomfilter","language":"Java","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"apache-2.0","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/sangupta.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE.txt","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null,"zenodo":null}},"created_at":"2014-03-25T07:04:53.000Z","updated_at":"2025-02-08T14:57:29.000Z","dependencies_parsed_at":"2025-04-17T23:39:38.017Z","dependency_job_id":"b75f4982-6be7-45b6-ab2d-c13f683cd9dd","html_url":"https://github.com/sangupta/bloomfilter","commit_stats":null,"previous_names":[],"tags_count":1,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/sangupta%2Fbloomfilter","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/sangupta%2Fbloomfilter/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/sangupta%2Fbloomfilter/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/sangupta%2Fbloomfilter/manifests","owner_url":"https://repos.ecosyst
e.ms/api/v1/hosts/GitHub/owners/sangupta","download_url":"https://codeload.github.com/sangupta/bloomfilter/tar.gz/refs/heads/master","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":250561344,"owners_count":21450405,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["bloom-filter","bloom-filters","java","murmur3"],"created_at":"2024-11-08T23:43:13.986Z","updated_at":"2025-04-24T04:21:25.373Z","avatar_url":"https://github.com/sangupta.png","language":"Java","readme":"# bloomfilter\n\n[![Build Status](https://img.shields.io/travis/sangupta/bloomfilter.svg)](https://travis-ci.org/sangupta/bloomfilter)\n[![Coverage Status](https://img.shields.io/coveralls/sangupta/bloomfilter.svg)](https://coveralls.io/github/sangupta/bloomfilter?branch=master)\n[![License](https://img.shields.io/github/license/sangupta/bloomfilter.svg)](https://opensource.org/licenses/Apache-2.0)\n[![Maven Central](https://img.shields.io/maven-central/v/com.sangupta/bloomfilter.svg)](https://maven-badges.herokuapp.com/maven-central/com.sangupta/bloomfilter)\n\n`bloomfilter` is a pure Java Bloom Filter implementation that provides simple persistable bloom filters. 
The\nentire bloom filter is abstracted into layers so that each one can be swapped for a plug-and-play implementation:\nhow an object is decomposed into a byte stream, which hash function is used, and which serialization strategy is\nused.\n\nThe library is unit-tested on the following platforms:\n\n* Oracle JDK 7\n* Oracle JDK 8\n* Oracle JDK 9\n\n## Why another Bloom Filter implementation?\n\n\n`bloomfilter` was developed because I was looking for a fast, persistable bloom filter implementation that could\nbe customized to suit my needs. The `Google Guava` bloom filter falls short for a few reasons: it cannot be persisted well, it does not\nprovide a disk-backed bit array implementation, it lacks a counting bloom filter and, last but not least,\nthe size of its payload is large. Many of my modules/projects did not need `Guava`, and adding it just to use the\nbloom filter was turning out to be expensive. Thus, `bloomfilter` was born.\n\n`bloomfilter` is inspired by the `Guava` bloom filter implementation and uses a similar approach, with\nmore extension points baked in.\n\n## Features\n\n* Uses a pure Java [murmur](https://github.com/sangupta/murmur) hash implementation as the default hash function\n* Multiple persistence methods\n  * In-memory filter\n  * Java serialization disk filter\n  * Memory-mapped disk filter\n* Lightweight with no dependencies, 23KB size\n\n## Usage\n\n```java\n// the maximum number of elements that the filter will contain\nint numberOfElements = 1000 * 1000;\n\n// the maximum false positive probability that is desired\n// the lower the value, the higher the memory usage\ndouble fpp = 0.01d;\n\n// this creates an in-memory bloom filter - useful when you need to dispose of the\n// filter at the end of the application, and the memory consumption will not be too large\nBloomFilter\u003cString\u003e filter = new InMemoryBloomFilter\u003cString\u003e(numberOfElements, fpp);\n\n// you can roll your own implementations based on file-backed, or memory-mapped \n// 
file-backed implementations that can provide persistence too\nfilter = new AbstractBloomFilter\u003cString\u003e(numberOfElements, fpp) {\n\n\t/**\n\t * Uses a {@link FileBackedBitArray} to allow for file persistence.\n\t * \n\t * @return a {@link BitArray} that will take care of storage for the bloom filter\n\t */\n\t@Override\n\tprotected BitArray createBitArray(int numBits) {\n\t\treturn new FileBackedBitArray(new File(\"/tmp/test.bloom.filter\"), numBits);\n\t}\n\t\n};\n```\n\n`BitArray` is an interface defined in the `com.sangupta.bloomfilter.core` package. It declares the methods that\nany implementation must provide in order to be used as the backing store of a bloom filter. This allows for rolling\nout bloom filter implementations backed by file-based persistence, a Redis server or similar. The following\nimplementations of the interface are available:\n\n* FastBitArray - faster than the default Java one\n* JavaBitSetArray - uses Java BitSet as the backing array\n* FileBackedBitArray - uses a regular file as the backing object, in random-access mode\n* MMapFileBackedBitArray - uses a memory-mapped file, much faster than FileBackedBitArray\n\n\n## Builds\n\n**0.9.0** (17 Jun 2017)\n\n* First release with Murmur 1/2/3 hashes\n\n## Downloads\n\nThe library can be downloaded from Maven Central using:\n\n```xml\n\u003cdependency\u003e\n    \u003cgroupId\u003ecom.sangupta\u003c/groupId\u003e\n    \u003cartifactId\u003ebloomfilter\u003c/artifactId\u003e\n    \u003cversion\u003e0.9.0\u003c/version\u003e\n\u003c/dependency\u003e\n```\n\n## Other Similar Projects\n\nOther similar bloom filter implementations include:\n\n### Google Guava\nRead more at http://docs.guava-libraries.googlecode.com/git/javadoc/com/google/common/hash/BloomFilter.html\n\n* As explained above, it is heavy.\n\n### Orestes-Bloomfilter\nhttps://github.com/DivineTraube/Orestes-Bloomfilter\n\n* Does not have a persisted version of a BloomFilter\n* Does not have a Murmur3 implementation\n\n### Greplin-bloom-filter 
\nhttps://github.com/Cue/greplin-bloom-filter\n\n* The persisted bloom filter does not use memory-mapped files, but rather the slower file-seek-change-repeat workflow.\n* No Murmur3 implementation\n\n## Versioning\n\nFor transparency and insight into our release cycle, and to help maintain backward compatibility,\n`bloomfilter` will be maintained under the Semantic Versioning guidelines as much as possible.\n\nReleases will be numbered with the following format:\n\n`\u003cmajor\u003e.\u003cminor\u003e.\u003cpatch\u003e`\n\nAnd constructed with the following guidelines:\n\n* Breaking backward compatibility bumps the major\n* New additions without breaking backward compatibility bump the minor\n* Bug fixes and misc changes bump the patch\n\nFor more information on SemVer, please visit http://semver.org/.\n\n## License\n\n```\nbloomfilter: Bloom filters for Java\nCopyright (c) 2014-2018, Sandeep Gupta\n\nhttps://sangupta.com/projects/bloomfilter\n\nLicensed under the Apache License, Version 2.0 (the \"License\");\n you may not use this file except in compliance with the License.\n You may obtain a copy of the License at\n \n      http://www.apache.org/licenses/LICENSE-2.0\n \n Unless required by applicable law or agreed to in writing, software\n distributed under the License is distributed on an \"AS IS\" BASIS,\n WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.\n See the License for the specific language governing permissions and\n limitations under the License.\n```\n","funding_links":[],"categories":[],"sub_categories":[],"project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fsangupta%2Fbloomfilter","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fsangupta%2Fbloomfilter","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fsangupta%2Fbloomfilter/lists"}