{"id":47857738,"url":"https://github.com/rmm5t/bloom_fit","last_synced_at":"2026-04-09T04:01:02.924Z","repository":{"id":348810717,"uuid":"1199974305","full_name":"rmm5t/bloom_fit","owner":"rmm5t","description":"Bloom filters for Ruby with automatic sizing and a fast native in-memory core, with a small, Set-like API.","archived":false,"fork":false,"pushed_at":"2026-04-06T19:22:25.000Z","size":213,"stargazers_count":3,"open_issues_count":0,"forks_count":0,"subscribers_count":1,"default_branch":"main","last_synced_at":"2026-04-07T02:03:11.051Z","etag":null,"topics":["bloom-filter","ruby"],"latest_commit_sha":null,"homepage":"https://rubygems.org/gems/bloom_fit","language":"Ruby","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":null,"status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/rmm5t.png","metadata":{"files":{"readme":"README.md","changelog":"CHANGELOG.md","contributing":null,"funding":".github/FUNDING.yml","license":null,"code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null,"zenodo":null,"notice":null,"maintainers":null,"copyright":null,"agents":null,"dco":null,"cla":null},"funding":{"github":["rmm5t"],"custom":"https://www.paypal.me/rmm5t/5"}},"created_at":"2026-04-02T22:48:45.000Z","updated_at":"2026-04-06T19:22:27.000Z","dependencies_parsed_at":"2026-04-07T02:00:28.306Z","dependency_job_id":null,"html_url":"https://github.com/rmm5t/bloom_fit","commit_stats":null,"previous_names":["rmm5t/bloom_fit"],"tags_count":9,"template":false,"template_full_name":null,"purl":"pkg:github/rmm5t/bloom_fit","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/rmm5t%2Fbloom_fit","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/rmm5t%2Fbloom_fit/tags","releases_url":"https://repos.ecosyste.ms/a
pi/v1/hosts/GitHub/repositories/rmm5t%2Fbloom_fit/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/rmm5t%2Fbloom_fit/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/rmm5t","download_url":"https://codeload.github.com/rmm5t/bloom_fit/tar.gz/refs/heads/main","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/rmm5t%2Fbloom_fit/sbom","scorecard":null,"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":286080680,"owners_count":31537791,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2026-04-07T16:28:08.000Z","status":"online","status_checked_at":"2026-04-08T02:00:06.127Z","response_time":54,"last_error":null,"robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":true,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["bloom-filter","ruby"],"created_at":"2026-04-03T23:00:29.837Z","updated_at":"2026-04-08T03:01:18.142Z","avatar_url":"https://github.com/rmm5t.png","language":"Ruby","funding_links":["https://github.com/sponsors/rmm5t","https://www.paypal.me/rmm5t/5"],"categories":[],"sub_categories":[],"readme":"# BloomFit makes Bloom Filter tuning easy\n\n[![Gem Version](http://img.shields.io/gem/v/bloom_fit.svg)](https://rubygems.org/gems/bloom_fit)\n[![CI](https://github.com/rmm5t/bloom_fit/actions/workflows/ci.yml/badge.svg)](https://github.com/rmm5t/bloom_fit/actions/workflows/ci.yml)\n[![Gem Downloads](https://img.shields.io/gem/dt/bloom_fit.svg)](https://rubygems.org/gems/bloom_fit)\n\nBloomFit provides a MRI/C-based non-counting 
bloom filter for use in your Ruby projects. It is heavily based on [bloomfilter-rb]'s native implementation, but provides a better hashing distribution by using DJB2 over CRC32, avoids the need to supply a seed, removes counting abilities, improves performance for very large datasets, and automatically calculates the bit size (m) and the number of hashes (k) when given a capacity and a false-positive rate.\n\nA [Bloom filter](http://en.wikipedia.org/wiki/Bloom_filter) is a space-efficient probabilistic data structure that is used to test whether an element is a member of a set. False positives are possible, but false negatives are not. Instead of using k different hash functions, this implementation uses a DJB2 hash with k seeds from the CRC table.\n\nPerformance of the Bloom filter depends on the following:\n\n- size of the bit array\n- number of hash functions\n\nBloomFit is a fork of [bloomfilter-rb].\n\n## Resources\n\n- Background: [Bloom filter](http://en.wikipedia.org/wiki/Bloom_filter)\n- Determining parameters: [Scalable Datasets: Bloom Filters in Ruby](http://www.igvita.com/2008/12/27/scalable-datasets-bloom-filters-in-ruby/)\n- Applications \u0026 reasons behind bloom filter: [Flow analysis: Time based bloom filter](http://www.igvita.com/2010/01/06/flow-analysis-time-based-bloom-filters/)\n\n## Examples\n\nThe MRI/C implementation creates an in-memory filter that can be saved to and reloaded from disk.\n\n(COMING SOON) Specify an expected item count and a false-positive rate that you can tolerate, and BloomFit sizes the filter for you. 
Visit the [Bloom Filter Calculator](https://hur.st/bloomfilter/) to learn more.\n\n```ruby\nrequire \"bloom_fit\"\n\nbf = BloomFit.new(capacity: 250, false_positive_rate: 0.001)\nbf.add(\"cat\")\nbf.include?(\"cat\")     # =\u003e true\nbf.include?(\"dog\")     # =\u003e false\n\n# Hash syntax with a bloom filter!\nbf[\"bird\"] = \"bar\"\nbf[\"bird\"]             # =\u003e true\nbf[\"mouse\"]            # =\u003e false\n\nbf.stats\n# =\u003e Number of filter bits (m): 3600\n# =\u003e Number of set bits (n): 20\n# =\u003e Number of filter hashes (k) : 10\n# =\u003e Predicted false positive rate = 0.00%\n```\n\nIf you'd like more control over the traditional inputs like bit size and the number of hashes:\n\n```ruby\nrequire \"bloom_fit\"\n\nbf = BloomFit.new(size: 100, hashes: 2)\nbf.add(\"cat\")\nbf.include?(\"cat\")     # =\u003e true\nbf.include?(\"dog\")     # =\u003e false\n\n# Hash syntax with a bloom filter!\nbf[\"bird\"] = \"bar\"\nbf[\"bird\"]             # =\u003e true\nbf[\"mouse\"]            # =\u003e false\n\nbf.stats\n# =\u003e Number of filter bits (m): 100\n# =\u003e Number of set bits (n): 4\n# =\u003e Number of filter hashes (k) : 2\n# =\u003e Predicted false positive rate = 10.87%\n```\n\n## Credits\n\n- Tatsuya Mori \u003cvaldzone@gmail.com\u003e (Original C implementation)\n- Ilya Grigorik [@igrigorik](https://github.com/igrigorik) ([bloomfilter-rb] gem)\n- Bharanee Rathna [@deepfryed](https://github.com/deepfryed) ([bloom-filter](https://github.com/deepfryed/bloom-filter) gem)\n\n## License\n\n[MIT License](https://rmm5t.mit-license.org/)\n\n[bloomfilter-rb]: https://github.com/igrigorik/bloomfilter-rb\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Frmm5t%2Fbloom_fit","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Frmm5t%2Fbloom_fit","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Frmm5t%2Fbloom_fit/lists"}