{"id":13482489,"url":"https://github.com/crystal-community/bloom_filter","last_synced_at":"2025-08-13T19:20:50.277Z","repository":{"id":45099734,"uuid":"51199275","full_name":"crystal-community/bloom_filter","owner":"crystal-community","description":"Bloom filter implementation in Crystal lang","archived":false,"fork":false,"pushed_at":"2023-08-28T12:42:43.000Z","size":29,"stargazers_count":34,"open_issues_count":2,"forks_count":6,"subscribers_count":5,"default_branch":"master","last_synced_at":"2024-10-30T16:40:44.252Z","etag":null,"topics":["bloom-filter","crystal"],"latest_commit_sha":null,"homepage":null,"language":"Crystal","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/crystal-community.png","metadata":{"files":{"readme":"README.md","changelog":"CHANGELOG.md","contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2016-02-06T11:02:48.000Z","updated_at":"2023-12-15T02:42:08.000Z","dependencies_parsed_at":"2024-01-14T19:13:18.476Z","dependency_job_id":"fba0774f-f3ce-4243-adef-7c79f19c8b7f","html_url":"https://github.com/crystal-community/bloom_filter","commit_stats":null,"previous_names":[],"tags_count":1,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/crystal-community%2Fbloom_filter","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/crystal-community%2Fbloom_filter/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/crystal-community%2Fbloom_filter/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/crystal-community%2Fbloom_filter/manifests"
,"owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/crystal-community","download_url":"https://codeload.github.com/crystal-community/bloom_filter/tar.gz/refs/heads/master","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":227810270,"owners_count":17823176,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["bloom-filter","crystal"],"created_at":"2024-07-31T17:01:02.511Z","updated_at":"2024-12-02T22:15:19.521Z","avatar_url":"https://github.com/crystal-community.png","language":"Crystal","funding_links":[],"categories":["Caching"],"sub_categories":[],"readme":"# Bloom Filter [![Build Status](https://travis-ci.org/crystal-community/bloom_filter.svg?branch=master)](https://travis-ci.org/crystal-community/bloom_filter)\n\nImplementation of [Bloom Filter](https://en.wikipedia.org/wiki/Bloom_filter) in [Crystal lang](http://crystal-lang.org/).\n\n* [Installation](#installation)\n* [Usage](#usage)\n  * [Basic](#basic)\n  * [Creating a filter with optimal parameters](#creating-a-filter-with-optimal-parameters)\n  * [Dumping to file and loading](#dumping-into-a-file-and-loading)\n  * [Union and intersection](#union-and-intersection)\n  * [Visualization](#visualization)\n* [Benchmark](#benchmark)\n* [Contributors](#contributors)\n\n\n## Installation\n\nAdd this to your application's `shard.yml`:\n\n```yaml\ndependencies:\n  bloom_filter:\n    github: crystal-community/bloom_filter\n```\n\n## Usage\n\n### Basic\n\n```crystal\nrequire \"bloom_filter\"\n\n# Create filter with bitmap size of 32 bytes and 3 hash functions.\nfilter = 
BloomFilter.new(bytesize = 32, hash_num = 3)\n\n# Insert elements\nfilter.insert(\"Esperanto\")\nfilter.insert(\"Toki Pona\")\n\n# Check element presence\nfilter.has?(\"Esperanto\")  # =\u003e true\nfilter.has?(\"Toki Pona\")  # =\u003e true\nfilter.has?(\"Englsh\")     # =\u003e false\n```\n\n### Creating a filter with optimal parameters\n\nBased on your needs (expected number of items and desired probability of false positives),\nyou can create an optimal Bloom filter:\n\n```crystal\n# Create a filter that, with one million inserted items, gives a 2% false positive rate for the #has? method\nfilter = BloomFilter.new_optimal(1_000_000, 0.02)\nfilter.bytesize # =\u003e 1017796 (993 KB)\nfilter.hash_num # =\u003e 6\n```\n\n### Dumping into a file and loading\n\nIt's possible to save an existing Bloom filter as a binary file and then load it back.\n\n```crystal\nfilter = BloomFilter.new_optimal(2, 0.01)\nfilter.insert(\"Esperanto\")\nfilter.dump_file(\"/tmp/bloom_languages\")\n\nloaded_filter = BloomFilter.load_file(\"/tmp/bloom_languages\")\nloaded_filter.has?(\"Esperanto\") # =\u003e true\nloaded_filter.has?(\"English\")   # =\u003e false\n```\n\n### Union and intersection\nGiven two filters of the same size and number of hash functions, it's possible\nto perform union and intersection operations:\n\n```crystal\nf1 = BloomFilter.new(32, 3)\nf1.insert(\"Esperanto\")\nf1.insert(\"Spanish\")\n\nf2 = BloomFilter.new(32, 3)\nf2.insert(\"Esperanto\")\nf2.insert(\"English\")\n\n# Union\nf3 = f1 | f2\nf3.has?(\"Esperanto\") # =\u003e true\nf3.has?(\"Spanish\")   # =\u003e true\nf3.has?(\"English\")   # =\u003e true\n\n# Intersection\nf4 = f1 \u0026 f2\nf4.has?(\"Esperanto\") # =\u003e true\nf4.has?(\"Spanish\")   # =\u003e false\nf4.has?(\"English\")   # =\u003e false\n```\n\n### Visualization\n\nIf you want to see what your filter looks like, you can visualize it:\n\n```crystal\nf1 = BloomFilter.new(16, 2)\nf1.insert(\"Esperanto\")\nputs \"f1 = (Esperanto)\"\nputs 
f1.visualize\n\nf2 = BloomFilter.new(16, 2)\nf2.insert(\"Spanish\")\nputs \"f2 = (Spanish)\"\nputs f2.visualize\n\nf3 = f1 | f2\nputs \"f3 = f1 | f2 = (Esperanto, Spanish)\"\nputs f3.visualize\n```\n\nOutput:\n```\nf1 = (Esperanto)\n░░░░░░░░ ░░░░░░█░ ░░░░░░░░ ░░░░░░░░ ░░░░░░░░ ░░░░░░░░ ░░░░░░░░ ░░░░░░░░\n░░░░░░░░ ░░░░░░░░ ░░░░░░░░ ░░░░░░░░ ░░░░░░░█ ░░░░░░░░ ░░░░░░░░ ░░░░░░░░\n\nf2 = (Spanish)\n░░░░░░░░ ░░░░░░░░ ░░░░░░░░ ░░░░░░░░ ░░░░░░░░ ░░░░░░░░ ░░░░░░░░ ░░░░░░░░\n░░░░░░░░ ░░░░░░░░ ░░░░░░░░ ░░░░░░░░ ░░░░░░░░ ░░░░░░░░ ░░░░░░█░ ░█░░░░░░\n\nf3 = f1 | f2 = (Esperanto, Spanish)\n░░░░░░░░ ░░░░░░█░ ░░░░░░░░ ░░░░░░░░ ░░░░░░░░ ░░░░░░░░ ░░░░░░░░ ░░░░░░░░\n░░░░░░░░ ░░░░░░░░ ░░░░░░░░ ░░░░░░░░ ░░░░░░░█ ░░░░░░░░ ░░░░░░█░ ░█░░░░░░\n```\nIn this way, you can actually see which bits are set:)\n\n## Benchmark\nPerformance of Bloom filter depends on the following parameters:\n* Size of the filter\n* Number of hash functions\n* Length of the input string\n\nTo run benchmark from `./samples/benchmark.cr`, simply run make task:\n```\n$ make benchmark\n\nNumber of items: 100000000\nFilter size: 117005Kb\nHash functions: 7\nString size: 13\n\n                     user     system      total        real\ninsert           0.004227   0.000000   0.004227 (  2.769349)\nhas? (present)   0.007980   0.000000   0.007980 (  5.223778)\nhas? (missing)   0.004318   0.000000   0.004318 (  2.829521)\n```\n\n## Contributors\n\n- [greyblake](https://github.com/greyblake) Potapov Sergey - creator, maintainer\n- [funny-falcon](https://github.com/funny-falcon) Sokolov Yura - better hash algorithms\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fcrystal-community%2Fbloom_filter","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fcrystal-community%2Fbloom_filter","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fcrystal-community%2Fbloom_filter/lists"}