{"id":26665774,"url":"https://github.com/vladpodilnyk/probably","last_synced_at":"2025-03-25T17:38:53.091Z","repository":{"id":213437410,"uuid":"733099884","full_name":"VladPodilnyk/probably","owner":"VladPodilnyk","description":"A Bloom filter implementation in Go","archived":false,"fork":false,"pushed_at":"2024-01-09T22:06:08.000Z","size":10,"stargazers_count":1,"open_issues_count":0,"forks_count":0,"subscribers_count":1,"default_branch":"main","last_synced_at":"2024-06-21T20:37:41.269Z","etag":null,"topics":["data-structures","distributed-systems","probabilistic-data-structures"],"latest_commit_sha":null,"homepage":"","language":"Go","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/VladPodilnyk.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2023-12-18T15:04:16.000Z","updated_at":"2024-03-18T19:19:54.000Z","dependencies_parsed_at":"2023-12-30T15:31:41.280Z","dependency_job_id":"e656a7fe-d4b1-4d76-921c-e15f76778c03","html_url":"https://github.com/VladPodilnyk/probably","commit_stats":null,"previous_names":["vladpodilnyk/probably"],"tags_count":1,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/VladPodilnyk%2Fprobably","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/VladPodilnyk%2Fprobably/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/VladPodilnyk%2Fprobably/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/VladPodilnyk%2Fprobably/manifests","owner_url":"https://repos.e
cosyste.ms/api/v1/hosts/GitHub/owners/VladPodilnyk","download_url":"https://codeload.github.com/VladPodilnyk/probably/tar.gz/refs/heads/main","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":245511453,"owners_count":20627397,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["data-structures","distributed-systems","probabilistic-data-structures"],"created_at":"2025-03-25T17:38:52.485Z","updated_at":"2025-03-25T17:38:53.085Z","avatar_url":"https://github.com/VladPodilnyk.png","language":"Go","readme":"### Probably\n\nProbably is a Bloom filter implementation for Go.\nIt is simple to use and has no external dependencies.\n\n#### Usage\n\nTo create a Bloom filter, specify the size of the set and the desired false\npositive rate.\n\n```go\nimport (\n    \"github.com/vladpodilnyk/probably\"\n)\n\nfunc main() {\n    filter := filters.NewBloomFilter(10, 0.01)\n}\n\n```\nAfter that, you can add values and check whether they are in the set:\n```go\nfilter.Add([]byte(\"hello\"))\nfilter.Contains([]byte(\"world\"))\n```\n\nTwo Bloom filters can be joined if they have the same configuration: the same size and false positive rate.\nProbably provides two methods for this, `Merge` and `Union`.\n`Merge` joins two Bloom filters and stores the result in the first one.\n```go\nfilter1 := filters.NewBloomFilter(10, 0.01)\nfilter2 := filters.NewBloomFilter(10, 0.01)\n\n// add values here\n\nfilter1.Merge(filter2)\n```\nWhereas Union joins two Bloom filters but returns the new Bloom filter as a 
result.\n```go\nfilter1 := filters.NewBloomFilter(10, 0.01)\nfilter2 := filters.NewBloomFilter(10, 0.01)\n\n// add values here\n\nresult := filter1.Union(filter2)\n```\nTo reset a filter's state, call the `Clear` method on the filter:\n```go\nfilter.Clear()\n```\n\n#### Implementation details\n\nAs its underlying data structure, Probably uses a bit array implemented with\nbyte slices. For example, if a user wants to allocate 9 bits, Probably creates\na 2-byte slice to hold the filter data.\n\nTo generate k hashes, Probably uses only two hash functions: MD5 and SHA1.\nFor more information about this, please refer to this [amazing paper](https://www.eecs.harvard.edu/~michaelm/postscripts/tr-02-05.pdf) by Adam Kirsch.\n\n#### Future plans\nIt would be nice to extend Probably with other probabilistic data structures, such as HyperMinHash or a Cuckoo filter.\n\n#### Useful links\n\n- [Building a Better Bloom filter (paper) by Adam Kirsch](https://www.eecs.harvard.edu/~michaelm/postscripts/tr-02-05.pdf)\n- [Slides from The University of Texas at Austin](https://www.cs.utexas.edu/users/lam/396m/slides/Bloom_filters.pdf)\n","funding_links":[],"categories":[],"sub_categories":[],"project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fvladpodilnyk%2Fprobably","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fvladpodilnyk%2Fprobably","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fvladpodilnyk%2Fprobably/lists"}