{"id":21843951,"url":"https://github.com/fsaintjacques/jitmap","last_synced_at":"2026-03-11T03:03:23.074Z","repository":{"id":141872796,"uuid":"227614424","full_name":"fsaintjacques/jitmap","owner":"fsaintjacques","description":"LLVM-jitted bitmaps","archived":false,"fork":false,"pushed_at":"2020-04-23T23:30:29.000Z","size":435,"stargazers_count":27,"open_issues_count":0,"forks_count":0,"subscribers_count":4,"default_branch":"master","last_synced_at":"2025-06-14T02:40:28.681Z","etag":null,"topics":["bitmap","jit","llvm"],"latest_commit_sha":null,"homepage":"","language":"C++","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"apache-2.0","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/fsaintjacques.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null,"zenodo":null}},"created_at":"2019-12-12T13:33:13.000Z","updated_at":"2025-04-13T05:10:34.000Z","dependencies_parsed_at":null,"dependency_job_id":"d9bddbe6-551f-4d05-96e2-01d5214f87e3","html_url":"https://github.com/fsaintjacques/jitmap","commit_stats":null,"previous_names":[],"tags_count":0,"template":false,"template_full_name":null,"purl":"pkg:github/fsaintjacques/jitmap","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/fsaintjacques%2Fjitmap","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/fsaintjacques%2Fjitmap/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/fsaintjacques%2Fjitmap/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/fsaintjacques%2Fjitmap/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/fsaintjacqu
es","download_url":"https://codeload.github.com/fsaintjacques/jitmap/tar.gz/refs/heads/master","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/fsaintjacques%2Fjitmap/sbom","scorecard":null,"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":286080680,"owners_count":30368604,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2026-03-10T21:41:54.280Z","status":"online","status_checked_at":"2026-03-11T02:00:07.027Z","response_time":84,"last_error":null,"robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":true,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["bitmap","jit","llvm"],"created_at":"2024-11-27T22:17:46.882Z","updated_at":"2026-03-11T03:03:23.045Z","avatar_url":"https://github.com/fsaintjacques.png","language":"C++","funding_links":[],"categories":[],"sub_categories":[],"readme":"# jitmap: Jitted bitmaps\n\njitmap is a small library providing an execution engine for logical binary\nexpressions on bitmaps. Some examples where this is relevant:\n\n* In search engines, posting lists (sorted sequences of integers) are encoded\n  with bitmaps. Evaluating a search query (logical expression on\n  keywords) can be implemented with logical expression on bitmaps.\n\n* In columnar databases, selection vectors (index masks) are encoded with\n  bitmaps, the results of predicate on column expressions. The bitmaps are then\n  combined in a final bitmap.\n\n* In stream processing systems with rule engines, e.g. 
adtech bid requests\n  filtering with campaign rules, bitmaps are used as a first-pass optimization\n  to lower the number of (costly) rules to evaluate on each incoming event.\n\njitmap compiles logical expressions into native functions with signature\n`void fn(const char**, char*)`. The functions are optimized to minimize memory\ntransfers and use the fastest vector instruction set provided by the host.\n\nThe following snippet shows an example of what jitmap achieves:\n\n```C\ntypedef void (*dense_eval_fn)(const char**, char*);\n\n// a, b, c, and output are pointers to bitmaps\nchar *a, *b, *c, *output;\n// Note that for now, jitmap only supports statically sized bitmaps.\nconst char* inputs[3] = {a, b, c};\n\n// Compile an expression returned as a function pointer. The function can be\n// called from any thread in the same address space and has a global lifetime.\n// The generated symbol will be exposed to gdb and Linux's perf utility.\nconst char* symbol_name = \"a_and_b_and_c\";\ndense_eval_fn a_and_b_and_c = jitmap_compile(symbol_name, \"a \u0026 b \u0026 c\");\n\n// The result of `a \u0026 b \u0026 c` will be stored in `output`, applied vertically\n// using the vectorized instructions available on the host.\na_and_b_and_c(inputs, output);\n```\n\n## Logical expression language\n\njitmap offers a small DSL to evaluate bitwise operations on bitmaps.\nThe language supports variables (named bitmaps), empty/full literals, and basic\noperators: not `!`, and `\u0026`, or `|`, xor `^`.\n\nA query takes an expression and a list of bitmaps and executes the expression on\nthe bitmaps, producing a new bitmap.\n\n### Supported expressions\n\n - Empty bitmap literal: `$0`\n - Full bitmap literal: `$1`\n - Variables (named bitmaps): `[A-Za-z0-9_]+`, e.g. 
`country`, `color_red`\n - Not: `!e`\n - And: `e_1 \u0026 e_2`\n - Or: `e_1 | e_2`\n - Xor: `e_1 ^ e_2`\n\n### Examples\n```\n# NOT(a)\n!a\n\n# a AND b\na \u0026 b\n\n# 1 AND (a OR b) XOR c\n($1 \u0026 (a | b) ^ c)\n```\n\n## Developing/Debugging\n\n### *jitmap-ir* tool\n\nThe *jitmap-ir* command line utility takes an expression as its first argument\nand dumps the generated LLVM IR to stdout. It is useful to debug and peek at\nthe generated code. Using LLVM command line utilities, we can also look at the\nexpected generated assembly for any platform.\n\n```llvm\n# tools/jitmap-ir \"(a \u0026 b) | (c \u0026 c) | (c ^ d) | (c \u0026 b) | (d ^ a)\"\n; ModuleID = 'jitmap_ir'\nsource_filename = \"jitmap_ir\"\ntarget triple = \"x86_64-pc-linux-gnu\"\n\n; Function Attrs: argmemonly\ndefine void @query(i32** nocapture readonly %inputs, i32* nocapture %output) #0 {\nentry:\n  %bitmap_gep_0 = getelementptr inbounds i32*, i32** %inputs, i64 0\n  %bitmap_0 = load i32*, i32** %bitmap_gep_0\n  %bitmap_gep_1 = getelementptr inbounds i32*, i32** %inputs, i64 1\n  %bitmap_1 = load i32*, i32** %bitmap_gep_1\n  %bitmap_gep_2 = getelementptr inbounds i32*, i32** %inputs, i64 2\n  %bitmap_2 = load i32*, i32** %bitmap_gep_2\n  %bitmap_gep_3 = getelementptr inbounds i32*, i32** %inputs, i64 3\n  %bitmap_3 = load i32*, i32** %bitmap_gep_3\n  br label %loop\n\nloop:                                             ; preds = %loop, %entry\n  %i = phi i64 [ 0, %entry ], [ %next_i, %loop ]\n  %gep_0 = getelementptr inbounds i32, i32* %bitmap_0, i64 %i\n  %load_0 = load i32, i32* %gep_0\n  %gep_1 = getelementptr inbounds i32, i32* %bitmap_1, i64 %i\n  %load_1 = load i32, i32* %gep_1\n  %gep_2 = getelementptr inbounds i32, i32* %bitmap_2, i64 %i\n  %load_2 = load i32, i32* %gep_2\n  %gep_3 = getelementptr inbounds i32, i32* %bitmap_3, i64 %i\n  %load_3 = load i32, i32* %gep_3\n  %0 = and i32 %load_0, %load_1\n  %1 = and i32 %load_2, %load_2\n  %2 = or i32 %0, %1\n  %3 = xor i32 %load_2, %load_3\n  
%4 = or i32 %2, %3\n  %5 = and i32 %load_2, %load_1\n  %6 = or i32 %4, %5\n  %7 = xor i32 %load_3, %load_0\n  %8 = or i32 %6, %7\n  %gep_output = getelementptr inbounds i32, i32* %output, i64 %i\n  store i32 %8, i32* %gep_output\n  %next_i = add i64 %i, 1\n  %exit_cond = icmp eq i64 %next_i, 2048\n  br i1 %exit_cond, label %after_loop, label %loop\n\nafter_loop:                                       ; preds = %loop\n  ret void\n}\n\nattributes #0 = { argmemonly }\n```\n\nWe can then use LLVM's `opt` and `llc` to transform the IR into native assembly.\n\n```objdump\n# tools/jitmap-ir \"(a \u0026 b) | (c \u0026 c) | (c ^ d) | (c \u0026 b) | (d ^ a)\" | llc -O3\n        .text\n        .file   \"jitmap_ir\"\n        .globl  query                   # -- Begin function query\n        .p2align        4, 0x90\n        .type   query,@function\nquery:                                  # @query\n        .cfi_startproc\n# %bb.0:                                # %entry\n        pushq   %rbp\n        .cfi_def_cfa_offset 16\n        pushq   %rbx\n        .cfi_def_cfa_offset 24\n        .cfi_offset %rbx, -24\n        .cfi_offset %rbp, -16\n        movq    (%rdi), %r8\n        movq    8(%rdi), %r9\n        movq    16(%rdi), %r10\n        movq    24(%rdi), %r11\n        movq    $-8192, %rax            # imm = 0xE000\n        .p2align        4, 0x90\n.LBB0_1:                                # %loop\n                                        # =\u003eThis Inner Loop Header: Depth=1\n        movl    8192(%r8,%rax), %ecx\n        movl    8192(%r9,%rax), %edx\n        movl    8192(%r10,%rax), %edi\n        movl    8192(%r11,%rax), %ebx\n        movl    %edi, %ebp\n        xorl    %ebx, %ebp\n        xorl    %ecx, %ebx\n        andl    %edx, %ecx\n        orl     %edi, %ebp\n        andl    %edx, %edi\n        orl     %ebp, %edi\n        orl     %edi, %ebx\n        orl     %ecx, %ebx\n        movl    %ebx, 8192(%rsi,%rax)\n        addq    $4, %rax\n        jne     .LBB0_1\n# %bb.2:            
                    # %after_loop\n        popq    %rbx\n        .cfi_def_cfa_offset 16\n        popq    %rbp\n        .cfi_def_cfa_offset 8\n        retq\n.Lfunc_end0:\n        .size   query, .Lfunc_end0-query\n        .cfi_endproc\n                                        # -- End function\n\n        .section        \".note.GNU-stack\",\"\",@progbits\n\n```\n\nThis code is still not fully optimized; running the IR through `opt` first\nproduces a vectorized loop.\n\n```objdump\n# tools/jitmap-ir \"(a \u0026 b) | (c \u0026 c) | (c ^ d) | (c \u0026 b) | (d ^ a)\" | opt -O3 -S -mcpu=core-avx2 | llc -O3\n        .text\n        .file   \"jitmap_ir\"\n        .section        .rodata.cst8,\"aM\",@progbits,8\n        .p2align        3               # -- Begin function query\n.LCPI0_0:\n        .quad   8192                    # 0x2000\n.LCPI0_1:\n        .quad   -9223372036854775808    # 0x8000000000000000\n        .text\n        .globl  query\n        .p2align        4, 0x90\n        .type   query,@function\nquery:                                  # @query\n# %bb.0:                                # %entry\n        pushq   %rbp\n        pushq   %r15\n        pushq   %r14\n        pushq   %r12\n        pushq   %rbx\n#  ...\n# And the holy grail: a fully vectorized loop\n.LBB0_2:                                # %vector.body\n                                        # =\u003eThis Inner Loop Header: Depth=1\n        vmovdqu (%r14,%rbx), %ymm0\n        vmovdqu 32(%r14,%rbx), %ymm1\n        vmovdqu (%r12,%rbx), %ymm2\n        vmovdqu 32(%r12,%rbx), %ymm3\n        vmovdqu (%rdi,%rbx), %ymm4\n        vmovdqu 32(%rdi,%rbx), %ymm5\n        vpand   (%r15,%rbx), %ymm0, %ymm6\n        vpand   32(%r15,%rbx), %ymm1, %ymm7\n        vpor    %ymm2, %ymm6, %ymm6\n        vpor    %ymm3, %ymm7, %ymm7\n        vpxor   %ymm2, %ymm4, %ymm2\n        vpxor   %ymm3, %ymm5, %ymm3\n        vpxor   %ymm0, %ymm4, %ymm0\n        vpor    %ymm0, %ymm2, %ymm0\n        vpor    %ymm0, %ymm6, %ymm0\n        vpxor   %ymm1, %ymm5, 
%ymm1\n        vpor    %ymm1, %ymm3, %ymm1\n        vpor    %ymm1, %ymm7, %ymm1\n        vmovdqu %ymm0, (%rsi,%rbx)\n        vmovdqu %ymm1, 32(%rsi,%rbx)\n        addq    $64, %rbx\n        cmpq    $8192, %rbx             # imm = 0x2000\n        jne     .LBB0_2\n.LBB0_5:                                # %after_loop\n        popq    %rbx\n        popq    %r12\n        popq    %r14\n        popq    %r15\n        popq    %rbp\n                vzeroupper\n        retq\n.Lfunc_end0:\n        .size   query, .Lfunc_end0-query\n                                        # -- End function\n\n        .section        \".note.GNU-stack\",\"\",@progbits\n```\n\n## Symbols with Linux's perf\n\nBy default, perf will not be able to recognize the generated functions since the\nsymbols are not available statically. Luckily, perf has two mechanisms for JITs\nto register symbols. LLVM's JIT uses the jitdump [1] facility. At the time of\nwriting, perf must be patched with [2]; see commit 077a9b7bd1 for more\ninformation.\n\n```\n# The `-k1` is required for jitdump to work.\n$ perf record -k1 jitmap_benchmark\n\n# By default, the output will be useless, since each instruction will be shown\n# instead of grouped by symbols.\n$ perf report --stdio\n...\n# Overhead  Command          Shared Object        Symbol\n# ........  ...............  ...................  ....................................................................................\n#\n    29.09%  jitmap_benchmar  jitmap_benchmark     [.] jitmap::StaticBenchmark\u003cjitmap::IntersectionFunctor\u003c(jitmap::PopCountOption)1\u003e \u003e\n    20.08%  jitmap_benchmar  jitmap_benchmark     [.] jitmap::StaticBenchmark\u003cjitmap::IntersectionFunctor\u003c(jitmap::PopCountOption)0\u003e \u003e\n     1.78%  jitmap_benchmar  [JIT] tid 24013      [.] 0x00007f628c6cb045\n     1.61%  jitmap_benchmar  [JIT] tid 24013      [.] 0x00007f628c6cb053\n     1.59%  jitmap_benchmar  [JIT] tid 24013      [.] 
0x00007f628c6cb197\n     1.55%  jitmap_benchmar  [JIT] tid 24013      [.] 0x00007f628c6cb126\n     1.51%  jitmap_benchmar  [JIT] tid 24013      [.] 0x00007f628c6cb035\n     1.39%  jitmap_benchmar  [JIT] tid 24013      [.] 0x00007f628c6cb027\n\n# We must process the generated perf.data file by injecting symbol names\n$ perf inject --jit -i perf.data -o perf.jit.data \u0026\u0026 mv perf.jit.data perf.data\n$ perf report --stdio\n...\n# Overhead  Command          Shared Object        Symbol\n# ........  ...............  ...................  ....................................................................................\n#\n    29.09%  jitmap_benchmar  jitmap_benchmark     [.] jitmap::StaticBenchmark\u003cjitmap::IntersectionFunctor\u003c(jitmap::PopCountOption)1\u003e \u003e\n    20.08%  jitmap_benchmar  jitmap_benchmark     [.] jitmap::StaticBenchmark\u003cjitmap::IntersectionFunctor\u003c(jitmap::PopCountOption)0\u003e \u003e\n     6.48%  jitmap_benchmar  jitted-24013-16.so   [.] and_2_popcount\n     6.46%  jitmap_benchmar  jitted-24013-32.so   [.] and_4_popcount\n     6.42%  jitmap_benchmar  jitted-24013-46.so   [.] and_8_popcount\n     6.19%  jitmap_benchmar  jitted-24013-77.so   [.] and_4\n     6.19%  jitmap_benchmar  jitted-24013-61.so   [.] and_2\n     4.59%  jitmap_benchmar  jitted-24013-91.so   [.] 
and_8\n```\n\n[1] https://elixir.bootlin.com/linux/v4.10/source/tools/perf/Documentation/jitdump-specification.txt\n\n[2] https://lore.kernel.org/lkml/20191003105716.GB23291@krava/T/#u\n\n# TODO\n\n* Support dynamically sized bitmaps.\n* Implement roaring-bitmap-like compressed bitmaps.\n* Get https://reviews.llvm.org/D67383 approved and merged to benefit from the\n  Tree-Height-Reduction pass.\n* Provide a C front-end API.\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Ffsaintjacques%2Fjitmap","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Ffsaintjacques%2Fjitmap","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Ffsaintjacques%2Fjitmap/lists"}