{"id":13458223,"url":"https://github.com/jaybaird/python-bloomfilter","last_synced_at":"2025-12-30T02:25:15.519Z","repository":{"id":464658,"uuid":"89242","full_name":"jaybaird/python-bloomfilter","owner":"jaybaird","description":"Scalable Bloom Filter implemented in Python","archived":true,"fork":false,"pushed_at":"2021-07-01T08:40:04.000Z","size":341,"stargazers_count":1620,"open_issues_count":25,"forks_count":330,"subscribers_count":50,"default_branch":"master","last_synced_at":"2025-03-01T01:47:28.072Z","etag":null,"topics":[],"latest_commit_sha":null,"homepage":"","language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/jaybaird.png","metadata":{"files":{"readme":"README.rst","changelog":"CHANGES.txt","contributing":null,"funding":null,"license":"LICENSE.txt","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null}},"created_at":"2008-12-12T00:46:27.000Z","updated_at":"2025-02-24T01:55:59.000Z","dependencies_parsed_at":"2022-07-08T02:16:20.366Z","dependency_job_id":null,"html_url":"https://github.com/jaybaird/python-bloomfilter","commit_stats":null,"previous_names":[],"tags_count":1,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/jaybaird%2Fpython-bloomfilter","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/jaybaird%2Fpython-bloomfilter/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/jaybaird%2Fpython-bloomfilter/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/jaybaird%2Fpython-bloomfilter/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/jaybaird","download_url":"https://codeload.github.com/jaybaird/python-bloomfilter/tar.gz/refs/heads/master","host":{"name
":"GitHub","url":"https://github.com","kind":"github","repositories_count":245297923,"owners_count":20592500,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":[],"created_at":"2024-07-31T09:00:47.849Z","updated_at":"2025-12-14T14:08:58.700Z","avatar_url":"https://github.com/jaybaird.png","language":"Python","readme":"pybloom\n=======\n\n.. image:: https://travis-ci.org/jaybaird/python-bloomfilter.svg?branch=master\n    :target: https://travis-ci.org/jaybaird/python-bloomfilter\n\n``pybloom`` is a module that includes a Bloom Filter data structure along with\nan implementation of Scalable Bloom Filters as discussed in:\n\nP. Almeida, C. Baquero, N. Preguiça, D. Hutchison, Scalable Bloom Filters,\n(GLOBECOM 2007), IEEE, 2007.\n\nBloom filters work well when you know up front how many bits you need to set\naside to store your entire set. Scalable Bloom Filters allow your bloom\nfilter bits to grow as a function of false positive probability and size.\n\nA filter is \"full\" when at capacity: M * (((ln 2) ^ 2) / abs(ln p)), where M\nis the number of bits and p is the false positive probability. When capacity\nis reached, a new filter is created that is exponentially larger than the last,\nwith a tighter probability of false positives and a larger number of hash\nfunctions.\n\n.. 
code-block:: python\n\n    \u003e\u003e\u003e from pybloom import BloomFilter\n    \u003e\u003e\u003e f = BloomFilter(capacity=1000, error_rate=0.001)\n    \u003e\u003e\u003e [f.add(x) for x in range(10)]\n    [False, False, False, False, False, False, False, False, False, False]\n    \u003e\u003e\u003e all([(x in f) for x in range(10)])\n    True\n    \u003e\u003e\u003e 10 in f\n    False\n    \u003e\u003e\u003e 5 in f\n    True\n    \u003e\u003e\u003e f = BloomFilter(capacity=1000, error_rate=0.001)\n    \u003e\u003e\u003e for i in range(f.capacity):\n    ...     _ = f.add(i)\n    \u003e\u003e\u003e (1.0 - (len(f) / float(f.capacity))) \u003c= f.error_rate + 2e-18\n    True\n\n    \u003e\u003e\u003e from pybloom import ScalableBloomFilter\n    \u003e\u003e\u003e sbf = ScalableBloomFilter(mode=ScalableBloomFilter.SMALL_SET_GROWTH)\n    \u003e\u003e\u003e count = 10000\n    \u003e\u003e\u003e for i in range(count):\n    ...     _ = sbf.add(i)\n    ...\n    \u003e\u003e\u003e (1.0 - (len(sbf) / float(count))) \u003c= sbf.error_rate + 2e-18\n    True\n\n    # len(sbf) may not equal the entire input length. 0.01% error is well\n    # below the default 0.1% error threshold. As the capacity goes up, the\n    # error will approach 0.1%.\n","funding_links":[],"categories":["Python","Awesome Algorithms","**Programming (learning)**"],"sub_categories":["bloom - Bloom Filter (布隆过滤器)","**Developer\\'s Tools**"],"project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fjaybaird%2Fpython-bloomfilter","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fjaybaird%2Fpython-bloomfilter","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fjaybaird%2Fpython-bloomfilter/lists"}