{"id":23362796,"url":"https://github.com/fast-pack/simdcomp","last_synced_at":"2025-12-18T12:51:30.834Z","repository":{"id":13858640,"uuid":"16556373","full_name":"fast-pack/simdcomp","owner":"fast-pack","description":"A simple C library for compressing lists of integers using binary packing","archived":false,"fork":false,"pushed_at":"2023-08-18T18:34:44.000Z","size":607,"stargazers_count":506,"open_issues_count":6,"forks_count":54,"subscribers_count":29,"default_branch":"master","last_synced_at":"2025-09-30T13:40:12.414Z","etag":null,"topics":["c","compression","simd","simd-instructions"],"latest_commit_sha":null,"homepage":"","language":"C","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"bsd-3-clause","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/fast-pack.png","metadata":{"files":{"readme":"README.md","changelog":"CHANGELOG","contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null}},"created_at":"2014-02-05T19:57:38.000Z","updated_at":"2025-08-27T12:05:33.000Z","dependencies_parsed_at":"2024-01-13T17:29:55.756Z","dependency_job_id":null,"html_url":"https://github.com/fast-pack/simdcomp","commit_stats":null,"previous_names":["fast-pack/simdcomp","lemire/simdcomp"],"tags_count":7,"template":false,"template_full_name":null,"purl":"pkg:github/fast-pack/simdcomp","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/fast-pack%2Fsimdcomp","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/fast-pack%2Fsimdcomp/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/fast-pack%2Fsimdcomp/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/fast-pack%2Fsimdcomp/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hos
ts/GitHub/owners/fast-pack","download_url":"https://codeload.github.com/fast-pack/simdcomp/tar.gz/refs/heads/master","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/fast-pack%2Fsimdcomp/sbom","scorecard":null,"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":279004166,"owners_count":26083688,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","status":"online","status_checked_at":"2025-10-10T02:00:06.843Z","response_time":62,"last_error":null,"robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":true,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["c","compression","simd","simd-instructions"],"created_at":"2024-12-21T12:15:29.179Z","updated_at":"2025-10-10T14:35:13.592Z","avatar_url":"https://github.com/fast-pack.png","language":"C","funding_links":[],"categories":[],"sub_categories":[],"readme":"The SIMDComp library\n====================\n[![Build Status](https://img.shields.io/appveyor/ci/lemire/simdcomp.svg)](https://ci.appveyor.com/project/lemire/simdcomp)\n\n\nA simple C library for compressing lists of integers using binary packing and SIMD instructions.\nThe assumption is either that you have a list of 32-bit integers where most of them are small, or a list of 32-bit integers where differences between successive integers are small. No software is able to reliably compress an array of 32-bit random numbers.\n\nThis library can decode at least 4 billion compressed integers per second on most\ndesktop or laptop processors. 
That is, it can decompress data at a rate of 15 GB/s.\nThis is significantly faster than generic codecs like gzip, LZO, Snappy or LZ4.\n\nOn a Skylake Intel processor, it can decode integers at a rate of 0.3 cycles per integer,\nwhich can easily translate into more than 8 billion decoded integers per second.\n\nThis library is part of the [Awesome C](https://github.com/kozross/awesome-c) list of C resources.\n\nContributors: Daniel Lemire, Nathan Kurz, Christoph Rupp, Anatol Belski, Nick White and others\n\nWhat is it for?\n-------------\n\nThis is a low-level library for fast integer compression. By design it does not define a compressed\nformat. It is up to the (sophisticated) user to create a compressed format.\n\nIt is used by:\n- [upscaledb](https://github.com/cruppstahl/upscaledb)\n- [EventQL](https://github.com/eventql/eventql)\n- [ManticoreSearch](https://manticoresearch.com)\n\n\n\nRequirements\n-------------\n\n- Your processor should support SSE4.1 (It is supported by most Intel and AMD processors released since 2008.)\n- It is possible to build the core part of the code if your processor supports SSE2 (Pentium4 or better)\n- C99-compliant compiler (GCC is assumed)\n- A Linux-like distribution is assumed by the makefile\n\nFor a plain C version that does not use SIMD instructions, see https://github.com/lemire/LittleIntPacker\n\nUsage\n-------\n\nCompression works over blocks of 128 integers.\n\nFor a complete working example, see example.c (you can build it and\nrun it with \"make example; ./example\").\n\n\n\n1) Lists of integers in random order.\n\n```C\nconst uint32_t b = maxbits(datain); // computes the bit width of the block\nsimdpackwithoutmask(datain, buffer, b); // packs 128 32-bit integers into buffer using b*16 bytes (b 128-bit words)\nsimdunpack(buffer, backbuffer, b); // unpacks the block into backbuffer\n```\n\nWhile 128 32-bit integers are read, only b 128-bit words are written. 
Thus, the compression ratio is 32/b.\n\n2) Sorted lists of integers.\n\nWe use differential coding: we store the difference between successive integers. For this purpose, we need an initial value (called offset).\n\n```C\nuint32_t offset = 0;\nuint32_t b1 = simdmaxbitsd1(offset, datain); // bit width of the deltas\nsimdpackwithoutmaskd1(offset, datain, buffer, b1); // packs 128 32-bit integers using b1*16 bytes (b1 128-bit words)\nsimdunpackd1(offset, buffer, backbuffer, b1); // unpacks and restores the original values\n```\n\nGeneral example for arrays of arbitrary length:\n```C\nint compress_decompress_demo() {\n  size_t k, N = 9999;\n  __m128i * endofbuf;\n  uint32_t * datain = malloc(N * sizeof(uint32_t));\n  uint8_t * buffer;\n  uint32_t * backbuffer = malloc(N * sizeof(uint32_t));\n  uint32_t b;\n  int ret = 0;\n\n  for (k = 0; k \u003c N; ++k){        /* start with k=0, not k=1! */\n    datain[k] = (uint32_t)k;\n  }\n\n  b = maxbits_length(datain, N);\n  buffer = malloc(simdpack_compressedbytes(N,b)); // allocate just enough memory\n  endofbuf = simdpack_length(datain, N, (__m128i *)buffer, b);\n  /* compressed data is stored between buffer and endofbuf using (endofbuf-(__m128i *)buffer)*sizeof(__m128i) bytes */\n  /* would be safe to do : buffer = realloc(buffer,(endofbuf-(__m128i *)buffer)*sizeof(__m128i)); */\n  simdunpack_length((const __m128i *)buffer, N, backbuffer, b);\n\n  for (k = 0; k \u003c N; ++k){\n    if(datain[k] != backbuffer[k]) {\n      printf(\"bug\\n\");\n      ret = -1;\n      break;\n    }\n  }\n  /* release all three buffers on both the success and the failure path */\n  free(datain);\n  free(buffer);\n  free(backbuffer);\n  return ret;\n}\n```\n\n\n3) Frame-of-Reference \n\nWe also have frame-of-reference (FOR) functions (see simdfor.h header). 
They work like the bit packing\nroutines, but do not use differential coding so they allow faster search in some cases, at the expense\nof compression.\n\nSetup\n---------\n\n\nmake\nmake test\n\nand if you are daring:\n\nmake install\n\nGo\n--------\n\nIf you are a Go user, there is a \"go\" folder where you will find a simple demo.\n\nOther libraries\n----------------\n* Fast integer compression in Go: https://github.com/ronanh/intcomp\n* Fast Bitpacking algorithms: Rust port of simdcomp https://github.com/quickwit-oss/bitpacking\n* SIMDCompressionAndIntersection: A C++ library to compress and intersect sorted lists of integers using SIMD instructions https://github.com/lemire/SIMDCompressionAndIntersection\n* The FastPFOR C++ library: Fast integer compression https://github.com/lemire/FastPFor\n* High-performance dictionary coding https://github.com/lemire/dictionary\n* LittleIntPacker: C library to pack and unpack short arrays of integers as fast as possible https://github.com/lemire/LittleIntPacker\n* StreamVByte: Fast integer compression in C using the StreamVByte codec https://github.com/lemire/streamvbyte\n* MaskedVByte: Fast decoder for VByte-compressed integers https://github.com/lemire/MaskedVByte\n* CSharpFastPFOR: A C# integer compression library https://github.com/Genbox/CSharpFastPFOR\n* JavaFastPFOR: A Java integer compression library https://github.com/lemire/JavaFastPFOR\n* Encoding: Integer Compression Libraries for Go https://github.com/zhenjl/encoding\n* FrameOfReference is a C++ library dedicated to frame-of-reference (FOR) compression: https://github.com/lemire/FrameOfReference\n* libvbyte: A fast implementation for varbyte 32bit/64bit integer compression https://github.com/cruppstahl/libvbyte\n* TurboPFor is a C library that offers lots of interesting optimizations. Well worth checking! 
(GPL license) https://github.com/powturbo/TurboPFor\n* Oroch is a C++ library that offers a usable API (MIT license) https://github.com/ademakov/Oroch\n\n\nOther programming languages\n-------------\n\n- [There is a wrapper for Julia](https://github.com/mcovalt/TinyInt.jl).\n- [There is a Rust port](https://github.com/tantivy-search/bitpacking/).\n\nReferences\n------------\n* Daniel Lemire, Nathan Kurz, Christoph Rupp, Stream VByte: Faster Byte-Oriented Integer Compression, Information Processing Letters 130, February 2018, pages 1-6. https://arxiv.org/abs/1709.08990\n* Jianguo Wang, Chunbin Lin, Yannis Papakonstantinou, Steven Swanson, An Experimental Study of Bitmap Compression vs. Inverted List Compression, SIGMOD 2017 http://db.ucsd.edu/wp-content/uploads/2017/03/sidm338-wangA.pdf\n* P. Damme, D. Habich, J. Hildebrandt, W. Lehner, Lightweight Data Compression Algorithms: An Experimental Survey (Experiments and Analyses), EDBT 2017 http://openproceedings.org/2017/conf/edbt/paper-146.pdf\n* P. Damme, D. Habich, J. Hildebrandt, W. Lehner, Insights into the Comparative Evaluation of Lightweight Data Compression Algorithms, EDBT 2017 http://openproceedings.org/2017/conf/edbt/paper-414.pdf\n* Daniel Lemire, Leonid Boytsov, Nathan Kurz, SIMD Compression and the Intersection of Sorted Integers, Software Practice \u0026 Experience 46 (6) 2016. http://arxiv.org/abs/1401.6399\n* Daniel Lemire and Leonid Boytsov, Decoding billions of integers per second through vectorization, Software Practice \u0026 Experience 45 (1), 2015. http://arxiv.org/abs/1209.2137 http://onlinelibrary.wiley.com/doi/10.1002/spe.2203/abstract\n* Jeff Plaisance, Nathan Kurz, Daniel Lemire, Vectorized VByte Decoding, International Symposium on Web Algorithms 2015, 2015. 
http://arxiv.org/abs/1503.07387\n* Wayne Xin Zhao, Xudong Zhang, Daniel Lemire, Dongdong Shan, Jian-Yun Nie, Hongfei Yan, Ji-Rong Wen, A General SIMD-based Approach to Accelerating Compression Algorithms, ACM Transactions on Information Systems 33 (3), 2015. http://arxiv.org/abs/1502.01916\n* T. D. Wu, Bitpacking techniques for indexing genomes: I. Hash tables, Algorithms for Molecular Biology 11 (5), 2016. http://almob.biomedcentral.com/articles/10.1186/s13015-016-0069-5\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Ffast-pack%2Fsimdcomp","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Ffast-pack%2Fsimdcomp","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Ffast-pack%2Fsimdcomp/lists"}