{"id":15176439,"url":"https://github.com/gyson/blex","last_synced_at":"2025-06-19T08:33:46.918Z","repository":{"id":55907355,"uuid":"164249162","full_name":"gyson/blex","owner":"gyson","description":"Fast Bloom filter with concurrent accessibility, powered by :atomics module.","archived":false,"fork":false,"pushed_at":"2020-12-08T04:17:26.000Z","size":24,"stargazers_count":39,"open_issues_count":1,"forks_count":2,"subscribers_count":1,"default_branch":"master","last_synced_at":"2025-01-31T18:57:04.224Z","etag":null,"topics":["atomics","bloom","bloom-filter","elixir","erlang","filter","probabilistic-data-structures"],"latest_commit_sha":null,"homepage":"","language":"Elixir","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/gyson.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null}},"created_at":"2019-01-05T20:35:31.000Z","updated_at":"2024-11-30T16:13:38.000Z","dependencies_parsed_at":"2022-08-15T09:10:27.306Z","dependency_job_id":null,"html_url":"https://github.com/gyson/blex","commit_stats":null,"previous_names":[],"tags_count":1,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/gyson%2Fblex","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/gyson%2Fblex/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/gyson%2Fblex/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/gyson%2Fblex/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/gyson","download_url":"https://codeload.github.com/gyson/blex/tar.gz/refs/heads/master","host":{"name":"GitHub","url":"https://github.co
m","kind":"github","repositories_count":238319483,"owners_count":19452343,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["atomics","bloom","bloom-filter","elixir","erlang","filter","probabilistic-data-structures"],"created_at":"2024-09-27T13:04:12.770Z","updated_at":"2025-02-11T15:31:35.664Z","avatar_url":"https://github.com/gyson.png","language":"Elixir","funding_links":[],"categories":[],"sub_categories":[],"readme":"# Blex\n\nBlex is a fast Bloom filter with **concurrent accessibility**, powered by [`:atomics`](http://erlang.org/doc/man/atomics.html) module.\n\n## Features\n\n* Fixed size Bloom filter\n* Concurrent reads \u0026 writes\n* Serialization\n* Merge multiple Bloom filters into one\n* Only one copy of data because data is saved in either `:atomics` or binary (if \u003e 64 bytes)\n* Custom hash functions\n\n## Example\n\n```elixir\niex\u003e b = Blex.new(1000, 0.01)\niex\u003e Task.async(fn -\u003e Blex.put(b, \"hello\") end) |\u003e Task.await()\niex\u003e Task.async(fn -\u003e Blex.put(b, \"world\") end) |\u003e Task.await()\niex\u003e Blex.member?(b, \"hello\")\ntrue\niex\u003e Blex.member?(b, \"world\")\ntrue\niex\u003e Blex.member?(b, \"others\")\nfalse\n```\n\n## Installation\n\n**Note**: it requires OTP-21.2.1 or later. 
OTP-21.2 is not suitable because of an [issue](https://github.com/erlang/otp/pull/2061).\n\nIt can be installed by adding `blex` to your list of dependencies in `mix.exs`:\n\n```elixir\ndef deps do\n  [\n    {:blex, \"~\u003e 0.2\"}\n  ]\nend\n```\n\n## Documentation\n\nDocumentation can be found at [hexdocs.pm/blex/Blex.html](https://hexdocs.pm/blex/Blex.html).\n\n## Benchmarking\n\nCompared to an alternative Bloom filter powered by the `:array` module,\n\nBlex is faster for read operations:\n\n```\nOperating System: macOS\"\nCPU Information: Intel(R) Core(TM) i7-3720QM CPU @ 2.60GHz\nNumber of Available Cores: 8\nAvailable memory: 16 GB\nElixir 1.7.4\nErlang 21.2.2\n\nBenchmark suite executing with the following configuration:\nwarmup: 2 s\ntime: 5 s\nmemory time: 0 μs\nparallel: 1\ninputs: none specified\nEstimated total run time: 21 s\n\n\nBenchmarking Blex.members?...\nBenchmarking Blex.members? with binary format...\nBenchmarking Bloomex.members?...\n\nName                                       ips        average  deviation         median         99th %\nBlex.members? with binary format          0.69         1.44 s     ±0.23%         1.44 s         1.44 s\nBlex.members?                             0.63         1.58 s     ±0.61%         1.58 s         1.58 s\nBloomex.members?                          0.40         2.51 s     ±0.00%         2.51 s         2.51 s\n\nComparison:\nBlex.members? with binary format          0.69\nBlex.members?                             0.63 - 1.09x slower\nBloomex.members?                          
0.40 - 1.74x slower\n```\n\nBlex is much faster for write operations:\n\n```\nOperating System: macOS\"\nCPU Information: Intel(R) Core(TM) i7-3720QM CPU @ 2.60GHz\nNumber of Available Cores: 8\nAvailable memory: 16 GB\nElixir 1.7.4\nErlang 21.2.2\n\nBenchmark suite executing with the following configuration:\nwarmup: 2 s\ntime: 10 s\nmemory time: 0 μs\nparallel: 1\ninputs: none specified\nEstimated total run time: 24 s\n\n\nBenchmarking Blex.put...\nBenchmarking Bloomex.add...\n\nName                  ips        average  deviation         median         99th %\nBlex.put             0.44         2.25 s     ±3.98%         2.30 s         2.33 s\nBloomex.add         0.126         7.91 s     ±0.22%         7.91 s         7.92 s\n\nComparison:\nBlex.put             0.44\nBloomex.add         0.126 - 3.51x slower\n```\n\nThe benchmarking script above is available at `bench/comparison.exs`.\n\n## Implementation\n\nInstead of a traditional Bloom filter, Blex uses a partitioned Bloom filter (a Bloom filter variant described in section 3 of\n[the paper](http://gsd.di.uminho.pt/members/cbm/ps/dbloom.pdf)) for performance benefits. A partitioned\nBloom filter splits the bit array into **k** parts, where **k** is the number of hash functions. Each hash function\nonly reads \u0026 writes bits in its own partition. This brings the following benefits:\n\n  * Fewer hash function (`:erlang.phash2`) calls in some cases.\n  * Faster `Blex.estimate_size`, which scans only part of the bits.\n\n## License\n\nMIT\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fgyson%2Fblex","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fgyson%2Fblex","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fgyson%2Fblex/lists"}