{"id":13836137,"url":"https://github.com/barrust/bloom","last_synced_at":"2025-08-08T19:08:55.179Z","repository":{"id":36377926,"uuid":"40682789","full_name":"barrust/bloom","owner":"barrust","description":"Bloom filter implementation ","archived":false,"fork":false,"pushed_at":"2023-02-12T13:15:05.000Z","size":129,"stargazers_count":40,"open_issues_count":1,"forks_count":11,"subscribers_count":3,"default_branch":"master","last_synced_at":"2025-04-30T20:27:23.059Z","etag":null,"topics":["bloom-filter","c","data-structures","filter","probabilistic"],"latest_commit_sha":null,"homepage":null,"language":"C","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/barrust.png","metadata":{"files":{"readme":"README.md","changelog":"CHANGELOG.md","contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null}},"created_at":"2015-08-13T21:33:43.000Z","updated_at":"2025-04-28T15:26:10.000Z","dependencies_parsed_at":"2024-01-13T16:45:24.099Z","dependency_job_id":"b67c302c-14a1-4f73-b941-6bb63b27fa50","html_url":"https://github.com/barrust/bloom","commit_stats":null,"previous_names":[],"tags_count":16,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/barrust%2Fbloom","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/barrust%2Fbloom/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/barrust%2Fbloom/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/barrust%2Fbloom/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/barrust","download_url":"https://codeload.github.com/barrust/bloom/tar.gz/refs/heads/master","host
":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":251777475,"owners_count":21642164,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["bloom-filter","c","data-structures","filter","probabilistic"],"created_at":"2024-08-04T15:00:36.482Z","updated_at":"2025-04-30T20:27:29.715Z","avatar_url":"https://github.com/barrust.png","language":"C","funding_links":[],"categories":["C"],"sub_categories":[],"readme":"# bloom\n\n[![License: MIT](https://img.shields.io/badge/License-MIT-blue.svg)](https://opensource.org/licenses/MIT)\n[![GitHub release](https://img.shields.io/github/v/release/barrust/bloom.svg)](https://github.com/barrust/bloom/releases)\n[![C/C++ CI](https://github.com/barrust/bloom/actions/workflows/ci.yml/badge.svg?branch=master)](https://github.com/barrust/bloom/actions/workflows/ci.yml)\n[![codecov](https://codecov.io/gh/barrust/bloom/branch/master/graph/badge.svg)](https://codecov.io/gh/barrust/bloom)\n\nBloom Filter implementation written in **C**\n\nBloom Filters are a probabilistic data structure that allows for the storage and\nlook up of elements. The data stored in a Bloom Filter is not retrievable. Once\ndata is 'inserted', data can be checked to see if it likely has been seen or if\nit definitely has not. 
Bloom Filters guarantee a 0% False Negative rate with a\npre-selected false positive rate.\n\nTo use the library, copy the `src/bloom.h` and `src/bloom.c` files into your\nproject and include `bloom.h` where needed.\n\n## License:\nMIT 2015 - 2021\n\n\n## Main Features:\n* Set upper bound number of elements and desired false positive rate; the system\nwill determine number of hashes and number of bits required\n* Custom hashing algorithms support\n* Import and export either as file or as hex string\n    * Keeps everything but the hashing algorithm\n    * Hex can be used if needing to store as a string\n    * A file-based filter can be used either on disk or loaded into memory\n* Ability to read Bloom Filter on disk instead of in memory if needed\n* Add or check for presence in the filter by using either the string or hashes\n    * Hashes can be used to check many similar Bloom Filters while only\n    needing to hash the string once\n* Calculate current false positive rate\n* Union and Intersection of Bloom Filters\n* Calculate the Jaccard Index between two Bloom Filters\n* **OpenMP** support for generation and lookup\n    * Ensure the `bloom.c` file is compiled with `-fopenmp` along with the utilizing program\n\n\n## Future Enhancements\n* What would the difference between two Bloom Filters signify?\n\n\n## Usage:\n``` c\n#include \"bloom.h\"\n\nBloomFilter bf;\n/*  elements = 10;\n    false positive rate = 5% */\nbloom_filter_init(\u0026bf, 10, 0.05);\nbloom_filter_add_string(\u0026bf, \"test\");\nif (bloom_filter_check_string(\u0026bf, \"test\") == BLOOM_FAILURE) {\n    printf(\"'test' is not in the Bloom Filter\\n\");\n} else {\n    printf(\"'test' is in the Bloom Filter\\n\");\n}\nif (bloom_filter_check_string(\u0026bf, \"blah\") == BLOOM_FAILURE) {\n    printf(\"'blah' is not in the Bloom Filter!\\n\");\n} else {\n    printf(\"'blah' is in the Bloom Filter\\n\");\n}\nbloom_filter_stats(\u0026bf);\nbloom_filter_destroy(\u0026bf);\n```\n\n### User Defined Hash Function 
Example\n``` c\n#include \u003cstdlib.h\u003e\n#include \u003cstdio.h\u003e\n#include \u003cstring.h\u003e\n#include \u003copenssl/sha.h\u003e\n#include \"bloom.h\"\n\n/* Example of a custom hashing function */\nuint64_t* sha256_hash(int num_hashes, char* str) {\n    uint64_t* results = calloc(num_hashes, sizeof(uint64_t));\n    unsigned char digest[SHA256_DIGEST_LENGTH];\n    int i;\n    for (i = 0; i \u003c num_hashes; i++) {\n        SHA256_CTX sha256_ctx;\n        SHA256_Init(\u0026sha256_ctx);\n        if (i == 0) {\n            SHA256_Update(\u0026sha256_ctx, str, strlen(str));\n        } else {\n            SHA256_Update(\u0026sha256_ctx, digest, SHA256_DIGEST_LENGTH);\n        }\n        SHA256_Final(digest, \u0026sha256_ctx);\n        results[i] = (uint64_t)* (uint64_t* )digest;\n    }\n    return results;\n}\n\nBloomFilter bf;\n/*  elements = 10;\n    false positive rate = 5%\n    custom hashing algorithm = sha256_hash function */\nbloom_filter_init_alt(\u0026bf, 10, 0.05, \u0026sha256_hash);\nbloom_filter_add_string(\u0026bf, \"test\");\nif (bloom_filter_check_string(\u0026bf, \"test\") == BLOOM_FAILURE) {\n    printf(\"'test' is not in the Bloom Filter\\n\");\n} else {\n    printf(\"'test' is in the Bloom Filter\\n\");\n}\nif (bloom_filter_check_string(\u0026bf, \"blah\") == BLOOM_FAILURE) {\n    printf(\"'blah' is not in the Bloom Filter!\\n\");\n} else {\n    printf(\"'blah' is in the Bloom Filter\\n\");\n}\nbloom_filter_stats(\u0026bf);\nbloom_filter_destroy(\u0026bf);\n```\n\n## Required Compile Flags:\n-lm\n\n\n## Backward Compatible Hash Function\nTo use older Bloom Filters (v1.8.2 or lower) that were built with the default hashing\nalgorithm, use the following code as the hash function:\n\n``` c\n/* NOTE: The caller will free the results */\nstatic uint64_t* original_default_hash(unsigned int num_hashes, const char* str) {\n    uint64_t *results = (uint64_t*)calloc(num_hashes, sizeof(uint64_t));\n    char key[17] = {0}; // largest value 
is 7FFF,FFFF,FFFF,FFFF\n    results[0] = __fnv_1a(str);\n    for (unsigned int i = 1; i \u003c num_hashes; ++i) {\n        sprintf(key, \"%\" PRIx64 \"\", results[i-1]);\n        results[i] = old_fnv_1a(key);\n    }\n    return results;\n}\n\nstatic uint64_t old_fnv_1a(const char* key) {\n    // FNV-1a hash (http://www.isthe.com/chongo/tech/comp/fnv/)\n    int i, len = strlen(key);\n    uint64_t h = 14695981039346656073ULL; // FNV_OFFSET 64 bit\n    for (i = 0; i \u003c len; ++i){\n            h = h ^ (unsigned char) key[i];\n            h = h * 1099511628211ULL; // FNV_PRIME 64 bit\n    }\n    return h;\n}\n```\n\nIf using only older Bloom Filters, then you can update the // FNV_OFFSET 64 bit\nto use `14695981039346656073ULL`\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fbarrust%2Fbloom","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fbarrust%2Fbloom","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fbarrust%2Fbloom/lists"}