{"id":33933371,"url":"https://github.com/0xchl0e/oomfi","last_synced_at":"2026-03-09T18:40:17.418Z","repository":{"id":166423149,"uuid":"640452353","full_name":"0xchl0e/oomfi","owner":"0xchl0e","description":"A bloom filter implemented in pure Rust","archived":false,"fork":false,"pushed_at":"2023-08-30T04:22:20.000Z","size":38,"stargazers_count":0,"open_issues_count":0,"forks_count":0,"subscribers_count":1,"default_branch":"main","last_synced_at":"2026-01-13T20:57:24.050Z","etag":null,"topics":["bloomfilter","bloomfilter-rust","probabilistic-data-structures","probabilistic-programming","rust","set"],"latest_commit_sha":null,"homepage":"","language":"Rust","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/0xchl0e.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"license","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2023-05-14T06:11:30.000Z","updated_at":"2023-05-20T11:12:26.000Z","dependencies_parsed_at":null,"dependency_job_id":"bdd2c18f-4372-4d56-98f0-ed59ecc84dd3","html_url":"https://github.com/0xchl0e/oomfi","commit_stats":null,"previous_names":["chloe0x0/oomfi","bunnybrooke/oomfi","0xem1ly/oomfi"],"tags_count":0,"template":false,"template_full_name":null,"purl":"pkg:github/0xchl0e/oomfi","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/0xchl0e%2Foomfi","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/0xchl0e%2Foomfi/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/0xchl0e%2Foomfi/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/0xchl0e%2Foomfi/manifests","owner_url":"
https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/0xchl0e","download_url":"https://codeload.github.com/0xchl0e/oomfi/tar.gz/refs/heads/main","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/0xchl0e%2Foomfi/sbom","scorecard":null,"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":286080680,"owners_count":30307549,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2026-03-09T17:35:44.120Z","status":"ssl_error","status_checked_at":"2026-03-09T17:35:43.707Z","response_time":61,"last_error":"SSL_connect returned=1 errno=0 peeraddr=140.82.121.5:443 state=error: unexpected eof while reading","robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":false,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["bloomfilter","bloomfilter-rust","probabilistic-data-structures","probabilistic-programming","rust","set"],"created_at":"2025-12-12T13:03:43.918Z","updated_at":"2026-03-09T18:40:17.410Z","avatar_url":"https://github.com/0xchl0e.png","language":"Rust","funding_links":[],"categories":[],"sub_categories":[],"readme":"\u003ch1 align='center'\u003e\n    oomfi 🌸\n\u003c/h1\u003e\n\n\u003cp align='center'\u003e\n    A small, 🔥 *blazingly fast* 🔥, bloom filter implemented in Rust (yes, another one).\n\u003c/p\u003e\n\n## The Data Structure\n\nA bloom filter is a probabilistic data structure used for fast, memory-efficient representations of sets. \n\nMore specifically, a bloom filter is a vector of M bits. 
Elements are added to the set by passing them through K hash functions, each of which maps the element to an index in the bit vector. All of these bits are then set to 1. To query for set membership, pass the element through the same hash functions: if any of the bits at the hashed indices is 0, the element cannot be a member of the set. Otherwise, it is *probably* an element of the set. There is a non-zero (though usually small) probability of a false positive (an element is reported as being in the set when it was never inserted).\n\n## Optimal number of hash functions and bits\n\nI won't go over the derivation (though it is fairly simple).\n\nFor a desired false positive rate $\\epsilon \\in (0, 1)$\nand an expected capacity of n elements, \n\nthe optimal number of hash functions is given as $-\\frac{\\ln(\\epsilon)}{\\ln(2)}$\nand the optimal number of bits is $-\\frac{n\\ln(\\epsilon)}{(\\ln(2))^2}$\n\nIn oomfi only 2 hash functions are used! This is because of an efficient scheme proposed by [Kirsch and Mitzenmacher](https://www.eecs.harvard.edu/~michaelm/postscripts/rsa2008.pdf) in which 2 hash functions can be combined to construct k hash functions. \n$$g_i(x) = h_1(x) + ih_2(x)$$\n$$i \\in [0, k)$$\n\nThis is a significant speedup compared to using K separate hash functions! With this scheme oomfi only computes 2 hash values (using AHash) per query/insertion call, regardless of the number of hash functions. The scheme was proven not to affect the asymptotic false positive rate of a standard bloom filter. 
\n\n# Getting Started\n\n## Dependencies\noomfi currently depends only on the [bitvec](https://crates.io/crates/bitvec) and [ahash](https://crates.io/crates/ahash) crates.\n(future releases may depend on Serde for serialization of Bloom Filters)\n\n## Installation\n\neither run\n```console\nλ \u003e\u003e\u003e cargo add oomfi\n```\nin your project directory\n\nor add the following line to the dependencies in your Cargo.toml\n```toml\noomfi = \"0.1.2\"\n```\n\nto test that everything works\n\n```console\nλ \u003e\u003e\u003e cargo test\n```\n\n## Usage\n\nLet's represent the set {:3, uwu, owo}\nwith a false positive rate of ~1%\n\n```rust\nuse oomfi::*;\n\nfn main() {\n    // Let's represent the set {:3, uwu, owo}\n    // n=3, false positive rate of 1%\n    // uses the optimal number of hash functions and bits\n    // It can store any datatype which implements Hash\n    let mut set = Bloom::new(3, 0.01);\n    // Insert elements into the set\n    set.insert(\":3\");\n    set.insert(\"uwu\");\n    set.insert(\"owo\");\n    // Assert that the elements are in the set\n    assert!(set.query(\":3\"));\n    assert!(set.query(\"uwu\"));\n    assert!(set.query(\"owo\"));\n    // This should only fail ~1% of the time ^_^\n    assert_ne!(set.query(\"OWO\"), true);\n    // Clear the set\n    set.clear();\n    // Assert that the set's previous elements are properly removed\n    assert_ne!(set.query(\":3\"), true);\n    assert_ne!(set.query(\"uwu\"), true);\n    assert_ne!(set.query(\"owo\"), true);\n}\n```\n\nusing a custom type\n\n```rust\nuse oomfi::*;\nuse std::hash::Hash;\n\n#[derive(Clone, Copy)]\nstruct Mage\u003c'a\u003e {\n    name: \u0026'a str,\n    level: u64,\n    mana: f64\n}\n\n// Only name and level are hashed (f64 does not implement Hash)\nimpl Hash for Mage\u003c'_\u003e {\n    fn hash\u003cH: std::hash::Hasher\u003e(\u0026self, state: \u0026mut H) {\n        self.name.hash(state);\n        self.level.hash(state);\n    }\n}\n\nfn main() {\n    let Malori = Mage {name: \u0026\"Malori\", level: u64::MAX, mana: f64::MAX};\n\n    let mut set = 
Bloom::new(3, 0.01);\n    set.insert(Malori);\n    assert!(set.query(\u0026Malori));\n}\n```\n\nto check if a set is the empty set\n```rust\nif set.is_empty() {\n    println!(\"set = ∅\");\n} else {\n    println!(\"set != ∅\");\n}\n```\n\nto get the number of hash functions used\n\n```rust\nlet hashes_used: u64 = set.hash_functions();\n```\nand the number of bits used in the BitVector\n\n```rust\nlet bits: u64 = set.len();\n```\n\nto get a reference to the set's BitVec\n\n```rust\nlet bitvec: \u0026BitVec = set.get_vec();\n```\n\nto estimate the number of elements in a bloom filter\n\n```rust\nlet len: usize = set.estimate_len();\n```\n\nto insert all values from a type which implements the IntoIterator trait (and whose elements implement the Hash trait)\n\n```rust\nlet elements = vec![0, 1, 2, 3];\nset.insert_all(elements);\n// or\nlet elements = [0, 1, 2, 3];\nset.insert_all(elements);\n```\n\nto construct a Bloom Filter with an explicit number of hash functions\n\n```rust\n// Same set as the example, just with 7 hash functions\n// The optimal number of bits will be used\nlet set: Bloom = Bloom::with_k(7, 3, 0.01);\n```\nQ: Why would you want a number of hash functions which is not the optimal number? \nA: Performance. It can be possible to use fewer than the optimal number of hash functions and still get few false positives, which reduces the compute needed per operation. \n\nto construct a Bloom Filter with an explicit number of bits\n\n```rust\n// Same set as the example, just with 100 bits\n// The optimal number of hash functions is used\nlet set: Bloom = Bloom::with_m(100, 3, 0.01);\n```\nYou may use this for similar reasons as an explicit number of hash functions. It is possible to use fewer bits than the optimal amount to reduce memory usage. 
It is also possible to use more bits than the optimal amount.\n\nTo construct a Bloom Filter with an explicit number of bits and hash functions\n```rust\n// Set with 7 hash functions and 100 bits\nlet set: Bloom = Bloom::with_km(7, 100);\n```\n\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2F0xchl0e%2Foomfi","html_url":"https://awesome.ecosyste.ms/projects/github.com%2F0xchl0e%2Foomfi","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2F0xchl0e%2Foomfi/lists"}