{"id":16965016,"url":"https://github.com/propensive/ulysses","last_synced_at":"2025-03-21T17:27:39.931Z","repository":{"id":190154343,"uuid":"681796496","full_name":"propensive/ulysses","owner":"propensive","description":"An implementation of Bloom filters for Scala","archived":false,"fork":false,"pushed_at":"2025-01-26T12:16:16.000Z","size":3586,"stargazers_count":1,"open_issues_count":0,"forks_count":0,"subscribers_count":2,"default_branch":"main","last_synced_at":"2025-03-19T23:34:41.329Z","etag":null,"topics":["bloom","bloom-filters","scala"],"latest_commit_sha":null,"homepage":"","language":"Scala","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":null,"status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/propensive.png","metadata":{"files":{"readme":".github/readme.md","changelog":null,"contributing":".github/contributing.md","funding":null,"license":null,"code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2023-08-22T19:18:52.000Z","updated_at":"2025-01-26T12:16:19.000Z","dependencies_parsed_at":"2024-02-27T23:29:18.509Z","dependency_job_id":"05656696-35fe-4849-857c-af95fbc208f6","html_url":"https://github.com/propensive/ulysses","commit_stats":null,"previous_names":["propensive/fluorescent"],"tags_count":0,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/propensive%2Fulysses","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/propensive%2Fulysses/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/propensive%2Fulysses/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/propensive%2Fulysses/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/Git
Hub/owners/propensive","download_url":"https://codeload.github.com/propensive/ulysses/tar.gz/refs/heads/main","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":244837061,"owners_count":20518559,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["bloom","bloom-filters","scala"],"created_at":"2024-10-13T23:44:49.942Z","updated_at":"2025-03-21T17:27:39.908Z","avatar_url":"https://github.com/propensive.png","language":"Scala","funding_links":[],"categories":[],"sub_categories":[],"readme":"[\u003cimg alt=\"GitHub Workflow\" src=\"https://img.shields.io/github/actions/workflow/status/propensive/ulysses/main.yml?style=for-the-badge\" height=\"24\"\u003e](https://github.com/propensive/ulysses/actions)\n[\u003cimg src=\"https://img.shields.io/discord/633198088311537684?color=8899f7\u0026label=DISCORD\u0026style=for-the-badge\" height=\"24\"\u003e](https://discord.com/invite/MBUrkTgMnA)\n\u003cimg src=\"/doc/images/github.png\" valign=\"middle\"\u003e\n\n# Ulysses\n\n__An implementation of Bloom filters for Scala__\n\nBloom filters are useful when a _probabilistic_ answer for whether a set\ncontains a value is sufficient, particularly when the cost of storing every set\nelement is prohibitive. 
_Ulysses_ provides an immutable representation of a\nBloom filter, with flexibility to tune its properties according to need.\n\n## Features\n\n- Provides a generic, immutable implementation of a Bloom filter\n- Can be configured to use a variety of different hash functions\n- Automatically determines Bloom filter parameters for target number of elements and error rate\n- Uses Gastronomy to derive hashes for primitive, product and sum types\n\n\n## Availability\n\n\n\n\n\n\n\n## Getting Started\n\n### About Bloom Filters\n\nA Bloom filter provides a way to test if a value is _probably_ in a set, or if\nit is _certainly_ not in the set. That is, false positives for set membership\nare permitted, while false negatives are not. The memory required by a Bloom\nfilter is dependent on the number of elements it should hold, and the\nprobability of false positives when testing for set membership, and these\nparameters must be specified when the Bloom filter is created.\n\nAs well as determining the amount of memory the Bloom filter should use, the\napproximate number of elements and the acceptable false-positive rate determine\nthe number of different hashes that will be used for each operation, with\noptimal values chosen for each, transparently.\n\nThe Bloom filter only has two core operations: adding a value to the set, and\ntesting if a value is contained within the set. Since the Bloom filter does not\nactually store any values, there are no methods for retrieval.\n\n### Constructing a Bloom Filter\n\nA new Bloom filter can be constructed with, for example:\n```scala\nimport gastronomy.hashFunctions.crc32\nval bloom = BloomFilter[Element](1000, 0.01)\n```\n\nThis creates a new `BloomFilter` instance for storing elements of type\n`Element`, optimized for 1000 elements, with a target error rate of 1%\n(`0.01`). 
Additionally, the presence of the contextual value\n`hashFunctions.crc32` means that the CRC32 hash function, defined in\n[Gastronomy](https://github.com/propensive/gastronomy/), will be used to\ncalculate the hashes for addition and membership checks.\n\nFor creation, the Bloom filter additionally requires that its element types are\ndigestible, that is, a contextual `Digestible` instance exists. Gastronomy\nprovides `Digestible` instances for primitive types and will derive instances for\nproduct and coproduct types.\n\nA newly-created `BloomFilter` is empty, but an element can be added with the\n`+` operator, or multiple elements with the `++` operator. Since `BloomFilter`s\nare immutable, these will construct new `BloomFilter` instances.\n\n### Using a Bloom filter\n\n`BloomFilter` also provides the method, `mayContain`, which takes an instance of\nthe Bloom filter's type, and returns a `Boolean`. The interpretation of the\nresult should not be mistaken: `false` means that the value is guaranteed not\nto be a member of the set represented by this Bloom filter, while `true` means\nthat the value is _probably_ a member of the set, but may not be.\n\nFor the Bloom filter above, constructed for approximately 1000 elements with an\nerror rate of 1%, if it has, indeed, had 1000 elements added to it, then there\nis an estimated 1% chance that the `mayContain` method will return `true` for\nan element which has not been added. That false-positive probability will increase\nif significantly more elements are added to the Bloom filter, and would be\nsmaller had significantly fewer elements been added.\n\n\n\n\n\n## Status\n\nUlysses is classified as __fledgling__. 
For reference, Soundness projects are\ncategorized into one of the following five stability levels:\n\n- _embryonic_: for experimental or demonstrative purposes only, without any guarantees of longevity\n- _fledgling_: of proven utility, seeking contributions, but liable to significant redesigns\n- _maturescent_: major design decisions broadly settled, seeking probatory adoption and refinement\n- _dependable_: production-ready, subject to controlled ongoing maintenance and enhancement; tagged as version `1.0.0` or later\n- _adamantine_: proven, reliable and production-ready, with no further breaking changes ever anticipated\n\nProjects at any stability level, even _embryonic_ projects, can still be used,\nas long as caution is taken to avoid a mismatch between the project's stability\nlevel and the required stability and maintainability of your own project.\n\nUlysses is designed to be _small_. Its entire source code currently consists\nof 80 lines of code.\n\n## Building\n\nUlysses will ultimately be built by Fury, when it is published. In the\nmeantime, two possibilities are offered, however they are acknowledged to be\nfragile, inadequately tested, and unsuitable for anything more than\nexperimentation. They are provided only for the necessity of providing _some_\nanswer to the question, \"how can I try Ulysses?\".\n\n1. *Copy the sources into your own project*\n   \n   Read the `fury` file in the repository root to understand Ulysses's build\n   structure, dependencies and source location; the file format should be short\n   and quite intuitive. Copy the sources into a source directory in your own\n   project, then repeat (recursively) for each of the dependencies.\n\n   The sources are compiled against the latest nightly release of Scala 3.\n   It should be possible to compile the project together with all of its\n   dependencies in a single compilation.\n\n2. 
*Build with [Wrath](https://github.com/propensive/wrath/)*\n\n   Wrath is a bootstrapping script for building Ulysses and other projects in\n   the absence of a fully-featured build tool. It is designed to read the `fury`\n   file in the project directory, and produce a collection of JAR files which can\n   be added to a classpath, by compiling the project and all of its dependencies,\n   including the Scala compiler itself.\n   \n   Download the latest version of\n   [`wrath`](https://github.com/propensive/wrath/releases/latest), make it\n   executable, and add it to your path, for example by copying it to\n   `/usr/local/bin/`.\n\n   Clone this repository inside an empty directory, so that the build can\n   safely make clones of repositories it depends on as _peers_ of `ulysses`.\n   Run `wrath -F` in the repository root. This will download and compile the\n   latest version of Scala, as well as all of Ulysses's dependencies.\n\n   If the build was successful, the compiled JAR files can be found in the\n   `.wrath/dist` directory.\n\n## Contributing\n\nContributors to Ulysses are welcome and encouraged. New contributors may like\nto look for issues marked\n[beginner](https://github.com/propensive/ulysses/labels/beginner).\n\nWe suggest that all contributors read the [Contributing\nGuide](/contributing.md) to make the process of contributing to Ulysses\neasier.\n\nPlease __do not__ contact project maintainers privately with questions unless\nthere is a good reason to keep them private. 
While it can be tempting to\nrespond to such questions, private answers cannot be shared with a wider\naudience, and it can result in duplication of effort.\n\n## Author\n\nUlysses was designed and developed by Jon Pretty, and commercial support and\ntraining on all aspects of Scala 3 is available from [Propensive\nO\u0026Uuml;](https://propensive.com/).\n\n\n\n## Name\n\n_Ulysses_ is named after the novel by James Joyce, whose principal character is Leopold Bloom, a namesake of the creator of Bloom filters, Burton H. Bloom.\n\nIn general, Soundness project names are always chosen with some rationale,\nhowever it is usually frivolous. Each name is chosen more for its\n_uniqueness_ and _intrigue_ than its concision or catchiness, and there is no\nbias towards names with positive or \"nice\" meanings—since many of the libraries\nperform some quite unpleasant tasks.\n\nNames should be English words, though many are obscure or archaic, and it\nshould be noted how willingly English adopts foreign words. Names are generally\nof Greek or Latin origin, and have often arrived in English via a romance\nlanguage.\n\n## Logo\n\nThe logo shows a blooming lotus flower, alluding to the _Bloom_ filters that Ulysses provides.\n\n## License\n\nUlysses is copyright \u0026copy; 2025 Jon Pretty \u0026 Propensive O\u0026Uuml;, and\nis made available under the [Apache 2.0 License](/license.md).\n\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fpropensive%2Fulysses","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fpropensive%2Fulysses","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fpropensive%2Fulysses/lists"}