{"id":15334314,"url":"https://github.com/richardstartin/splitmap","last_synced_at":"2025-04-15T03:02:23.307Z","repository":{"id":135934460,"uuid":"120205336","full_name":"richardstartin/splitmap","owner":"richardstartin","description":"Parallel boolean circuit evaluation","archived":false,"fork":false,"pushed_at":"2018-10-28T12:57:46.000Z","size":253,"stargazers_count":20,"open_issues_count":1,"forks_count":2,"subscribers_count":1,"default_branch":"master","last_synced_at":"2025-04-15T03:02:14.917Z","etag":null,"topics":["bitmap","bitset","boolean-algebra","boolean-circuits","indexing","roaringbitmap"],"latest_commit_sha":null,"homepage":"","language":"Java","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"apache-2.0","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/richardstartin.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null}},"created_at":"2018-02-04T17:11:03.000Z","updated_at":"2025-02-06T08:36:36.000Z","dependencies_parsed_at":null,"dependency_job_id":"7cb2c9d1-c701-4c16-8a6e-d1beee87d592","html_url":"https://github.com/richardstartin/splitmap","commit_stats":null,"previous_names":[],"tags_count":0,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/richardstartin%2Fsplitmap","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/richardstartin%2Fsplitmap/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/richardstartin%2Fsplitmap/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/richardstartin%2Fsplitmap/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/richardstar
tin","download_url":"https://codeload.github.com/richardstartin/splitmap/tar.gz/refs/heads/master","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":248997083,"owners_count":21195798,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["bitmap","bitset","boolean-algebra","boolean-circuits","indexing","roaringbitmap"],"created_at":"2024-10-01T10:06:46.269Z","updated_at":"2025-04-15T03:02:23.289Z","avatar_url":"https://github.com/richardstartin.png","language":"Java","readme":"# splitmap\n\n[![Build Status](https://travis-ci.org/richardstartin/splitmap.svg?branch=master)](https://travis-ci.org/richardstartin/splitmap)\n[![Coverage Status](https://coveralls.io/repos/github/richardstartin/splitmap/badge.svg?branch=master)](https://coveralls.io/github/richardstartin/splitmap?branch=master)\n\nThis library builds on top of [RoaringBitmap](https://github.com/RoaringBitmap/RoaringBitmap) to provide a parallel implementation of boolean circuits (multidimensional filters) and arbitrary aggregations over filters.\n\nFor instance, to compute a sum product on a dataset filtered such that both of two conditions hold:\n```java\n    PrefixIndex\u003cChunkedDoubleArray\u003e quantities = ...\n    PrefixIndex\u003cChunkedDoubleArray\u003e prices = ...\n    SplitMap februarySalesIndex = ...\n    SplitMap luxuryProductsIndex = ...\n    QueryContext\u003cString, PriceQty\u003e context = new QueryContext\u003c\u003e(\n    Map.ofEntries(entry(\"luxuryProducts\", luxuryProductsIndex), entry(\"febSales\", februarySalesIndex)), \n    Map.ofEntries(entry(PRICE, 
prices), entry(QTY, quantities))); \n\n    double februaryRevenueFromLuxuryProducts = \n            Circuits.evaluateIfKeysIntersect(context, slice -\u003e slice.get(\"febSales\").and(slice.get(\"luxuryProducts\")), \"febSales\", \"luxuryProducts\")\n            .stream()\n            .parallel()\n            .mapToDouble(partition -\u003e partition.reduceDouble(SumProduct.\u003cPriceQty\u003ereducer(price, quantities)))\n            .sum();\n```\n\nOver millions of quantities and prices, this can be computed in under 200 microseconds on a modern processor, whereas parallel streams may take upwards of 20ms.\n\nIt is easy to write arbitrary routines combining filtering, calculation and aggregation, for example statistical calculations evaluated with filter criteria.\n\n```java\n  public double productMomentCorrelationCoefficient() {\n    // calculate the correlation coefficient between prices observed on different exchanges\n    PrefixIndex\u003cChunkedDoubleArray\u003e exchange1Prices = ...\n    PrefixIndex\u003cChunkedDoubleArray\u003e exchange2Prices = ...\n    SplitMap beforeClose = ...\n    SplitMap afterOpen = ...\n    QueryContext\u003cExchange, PriceQty\u003e context = new QueryContext\u003c\u003e(\n    Map.ofEntries(entry(BEFORE_CLOSE, beforeClose), entry(AFTER_OPEN, afterOpen)), \n    Map.ofEntries(entry(NASDAQ, exchange1Prices), entry(LSE, exchange2Prices))); \n    // evaluate product moment correlation coefficient \n    return Circuits.evaluate(context, slice -\u003e slice.get(BEFORE_CLOSE).or(slice.get(AFTER_OPEN)), \n            BEFORE_CLOSE, AFTER_OPEN) \n            .stream()\n            .parallel()\n            .map(partition -\u003e partition.reduce(SimpleLinearRegression.\u003cExchanges\u003ereducer(exchange1Prices, exchange2Prices)))\n            .collect(SimpleLinearRegression.pmcc());\n  
}\n```\n","funding_links":[],"categories":[],"sub_categories":[],"project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Frichardstartin%2Fsplitmap","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Frichardstartin%2Fsplitmap","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Frichardstartin%2Fsplitmap/lists"}