{"id":13632486,"url":"https://github.com/quickwit-oss/bitpacking","last_synced_at":"2025-05-15T14:06:09.107Z","repository":{"id":41878917,"uuid":"125035134","full_name":"quickwit-oss/bitpacking","owner":"quickwit-oss","description":"SIMD algorithms for integer compression via bitpacking. This crate is a port of a C library called simdcomp.","archived":false,"fork":false,"pushed_at":"2024-06-13T04:01:30.000Z","size":493,"stargazers_count":303,"open_issues_count":9,"forks_count":34,"subscribers_count":18,"default_branch":"master","last_synced_at":"2025-05-11T09:01:38.917Z","etag":null,"topics":["compression","rust","simd"],"latest_commit_sha":null,"homepage":"","language":"Rust","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/quickwit-oss.png","metadata":{"files":{"readme":"README.md","changelog":"CHANGELOG.md","contributing":null,"funding":".github/FUNDING.yml","license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null},"funding":{"github":"fulmicoton","patreon":null,"open_collective":null,"ko_fi":null,"tidelift":null,"community_bridge":null,"liberapay":null,"issuehunt":null,"otechie":null,"custom":null}},"created_at":"2018-03-13T10:38:40.000Z","updated_at":"2025-05-01T14:28:18.000Z","dependencies_parsed_at":"2024-06-14T02:04:09.074Z","dependency_job_id":null,"html_url":"https://github.com/quickwit-oss/bitpacking","commit_stats":null,"previous_names":[],"tags_count":2,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/quickwit-oss%2Fbitpacking","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/quickwit-oss%2Fbitpacking/tags","releases_url":"https://repos.ecosyste.ms/a
pi/v1/hosts/GitHub/repositories/quickwit-oss%2Fbitpacking/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/quickwit-oss%2Fbitpacking/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/quickwit-oss","download_url":"https://codeload.github.com/quickwit-oss/bitpacking/tar.gz/refs/heads/master","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":254355335,"owners_count":22057354,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["compression","rust","simd"],"created_at":"2024-08-01T22:03:04.583Z","updated_at":"2025-05-15T14:06:04.068Z","avatar_url":"https://github.com/quickwit-oss.png","language":"Rust","funding_links":["https://github.com/sponsors/fulmicoton"],"categories":["Rust"],"sub_categories":[],"readme":"# Fast Bitpacking algorithms\n\n[![docs.rs docs](https://docs.rs/bitpacking/badge.svg)](https://docs.rs/bitpacking)\n[![GitHub](https://img.shields.io/badge/github-quickwit--oss/bitpacking-8da0cb?logo=github)](https://github.com/quickwit-oss/bitpacking)\n[![crates.io version](https://img.shields.io/crates/v/bitpacking.svg)](https://crates.io/crates/bitpacking)\n[![Linux build status](https://travis-ci.org/tantivy-search/bitpacking.svg?branch=master)](https://travis-ci.org/tantivy-search/bitpacking)\n\nThis crate is a **Rust port of [Daniel Lemire's simdcomp C library](https://github.com/lemire/simdcomp)**.\n\nIt makes it possible to compress/decompress :\n- sequence of small integers\n- sequences of increasing integers\n\n:star: It is fast. 
Expect \u003e 4 billion integers per second.\n\n\n## How to compile ?\n\n`bitpacking` compiles on stable Rust and requires Rust \u003e 1.27.\n\nJust add to your `Cargo.toml` :\n\n```toml\nbitpacking = \"0.5\"\n```\n\nFor some bitpacking flavors and some platforms, the bitpacking crate\nmay benefit from a specific SIMD instruction set.\n\nIn this case, it will always ship an alternative scalar implementation and will\nfall back to the scalar implementation at runtime if that instruction set is not available.\n\nIn other words, you do not need to configure anything. Your program will run correctly,\nand at the fastest speed available for your CPU.\n\n\n\n## Documentation\n\n[Reference documentation](https://docs.rs/bitpacking/)\n\n## What is bitpacking ?\n\nTraditional compression schemes like LZ4 are not really suited to compressing sequences of integers efficiently.\nInstead, there are different families of solutions to this problem.\n\nOne of the most straightforward and efficient ones is `bitpacking` :\n- Integers are first grouped into blocks of constant size (e.g. `128` when using the SSE3 implementation).\n- If not available implicitly, compute the minimum number of bits `b` that makes it possible to represent all the integers.\nIn other words, the smallest `b` such that all integers in the block are strictly smaller than 2\u003csup\u003eb\u003c/sup\u003e.\n- The bitpacked representation is then some variation of the concatenation of the integers restricted to their `b` least significant bits.\n\nFor instance, assume a block of `4` integers, `4, 9, 3, 2`. The highest value in the block is 9, so `b = 4`. All values will then be encoded over 4 bits as follows.\n\n| original number | binary representation |\n|:----------------|:----------------------|\n| 4               | 0100                  |\n| 9               | 1001                  |\n| 3               | 0011                  |\n| 2               | 0010                  |\n| ...             | ...                   
|\n\n\nAs a result, each integer of this block will only require 4 bits.\n\n## Choosing between BitPacker1x, BitPacker4x and BitPacker8x\n\n:warning: `BitPacker1x`, `BitPacker4x`, and `BitPacker8x` produce different formats,\nand are incompatible with one another.\n\n`BitPacker4x` and `BitPacker8x` are designed specifically to leverage `SSE3` and `AVX2`\ninstructions respectively.\n\nThey will safely fall back at runtime to a scalar implementation of these formats if these instruction sets are not available on the running CPU.\n\n:ok_hand: I recommend using `BitPacker4x` if you are in doubt.\n\n### BitPacker1x\n\n`BitPacker1x` is what you would expect from a bitpacker.\nThe integer representations over `b` bits are simply concatenated one\nafter the other. One block must contain `32 integers`.\n\n### BitPacker4x\n\n`BitPacker4x` bit ordering works in layers of 4 integers. This gives an opportunity\nto leverage `SSE3` instructions to encode and decode the stream.\nOne block must contain `128 integers`.\n\n### BitPacker8x\n\n`BitPacker8x` bit ordering works in layers of 8 integers. 
This gives an opportunity\nto leverage `AVX2` instructions to encode and decode the stream.\nOne block must contain `256 integers`.\n\n\n\n## Compressing small integers\n\n```rust\nuse bitpacking::{BitPacker4x, BitPacker};\n\nfn main() {\n    let my_data: Vec\u003cu32\u003e = vec![7, 7, 7, 7, 11, 10, 15, 13, 6, 5, 3, 14, 5, 7,\n        15, 12, 1, 10, 8, 10, 12, 14, 13, 1, 10, 1, 1, 10, 4, 15, 12,\n        1, 2, 0, 8, 5, 14, 5, 2, 4, 1, 6, 14, 13, 5, 10, 10, 1, 6, 4,\n        1, 12, 1, 1, 5, 15, 15, 2, 8, 6, 4, 3, 10, 8, 8, 9, 2, 6, 10,\n        5, 7, 9, 0, 13, 15, 5, 13, 10, 0, 2, 10, 14, 5, 9, 12, 8, 5, 10,\n        8, 8, 10, 5, 13, 8, 11, 14, 7, 14, 4, 2, 9, 12, 14, 5, 15, 12, 0,\n        12, 13, 3, 13, 5, 4, 15, 9, 8, 9, 3, 3, 3, 1, 12, 0, 6, 11, 11, 12, 4];\n\n    // Detects if `SSE3` is available on the current CPU\n    // and uses the best available implementation accordingly.\n    let bitpacker = BitPacker4x::new();\n\n    // Computes the number of bits used for each integer in the blocks.\n    // my_data is assumed to have a len of 128 for `BitPacker4x`.\n    let num_bits: u8 = bitpacker.num_bits(\u0026my_data);\n    assert_eq!(num_bits, 4);\n\n    // The compressed array will take exactly `num_bits * BitPacker4x::BLOCK_LEN / 8` bytes.\n    // But it is ok to have an output with a different len as long as it is larger\n    // than this.\n    let mut compressed = vec![0u8; 4 * BitPacker4x::BLOCK_LEN];\n\n    // `compress` returns the compressed len in bytes.\n    let compressed_len = bitpacker.compress(\u0026my_data, \u0026mut compressed[..], num_bits);\n\n    assert_eq!((num_bits as usize) * BitPacker4x::BLOCK_LEN / 8, compressed_len);\n\n    // Decompressing\n    let mut decompressed = vec![0u32; BitPacker4x::BLOCK_LEN];\n    bitpacker.decompress(\u0026compressed[..compressed_len], \u0026mut decompressed[..], num_bits);\n\n    assert_eq!(\u0026my_data, \u0026decompressed);\n}\n```\n\n## Benchmark\n\nThe following benchmarks have been run on one thread on my laptop's 
CPU:\nIntel(R) Core(TM) i5-8250U CPU @ 1.60GHz.\n\nYou can get accurate figures on your hardware by running the following command.\n\n```bash\ncargo bench\n```\n\n### BitPacker1x\n\n| operation        | throughput           |\n|:-----------------|:---------------------|\n| compress         | 1.4 billion int/s    |\n| compress_delta   | 1.0 billion int/s    |\n| decompress       | 1.8 billion int/s    |\n| decompress_delta | 1.4 billion int/s    |\n\n### BitPacker4x (assuming SSE3 instructions are available)\n\n| operation        | throughput         |\n|:-----------------|:-------------------|\n| compress         | 5.3 billion int/s  |\n| compress_delta   | 2.8 billion int/s  |\n| decompress       | 5.5 billion int/s  |\n| decompress_delta | 5 billion int/s    |\n\n### BitPacker8x (assuming AVX2 instructions are available)\n\n| operation        | throughput         |\n|:-----------------|:-------------------|\n| compress         | 7 billion int/s    |\n| compress_delta   | 600 million int/s  |\n| decompress       | 6.5 billion int/s  |\n| decompress_delta | 5.6 billion int/s  |\n\n\n## Reference\n\n- [SIMD Compression and the Intersection of Sorted Integers](https://arxiv.org/abs/1401.6399)\n\n## Other crates you might want to check out\n\n- [stream vbyte](https://crates.io/crates/stream-vbyte) A Stream-VByte implementation\n- [mayda](https://github.com/fralalonde/mayda) Another crate implementing the same algorithms.\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fquickwit-oss%2Fbitpacking","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fquickwit-oss%2Fbitpacking","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fquickwit-oss%2Fbitpacking/lists"}