{"id":16572707,"url":"https://github.com/fast-pack/JavaFastPFOR","last_synced_at":"2025-08-09T13:31:00.539Z","repository":{"id":3846940,"uuid":"4931095","full_name":"lemire/JavaFastPFOR","owner":"lemire","description":"A simple integer compression library in Java ","archived":false,"fork":false,"pushed_at":"2024-06-19T14:45:55.000Z","size":3157,"stargazers_count":538,"open_issues_count":6,"forks_count":62,"subscribers_count":40,"default_branch":"master","last_synced_at":"2024-11-27T18:07:32.886Z","etag":null,"topics":["compression","java"],"latest_commit_sha":null,"homepage":null,"language":"Java","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"apache-2.0","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/lemire.png","metadata":{"files":{"readme":"README.md","changelog":"CHANGELOG","contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2012-07-06T21:05:45.000Z","updated_at":"2024-11-09T12:29:11.000Z","dependencies_parsed_at":"2024-01-05T20:45:51.074Z","dependency_job_id":"fb819d95-6f50-4cce-b7ca-670e6bfabc32","html_url":"https://github.com/lemire/JavaFastPFOR","commit_stats":{"total_commits":352,"total_committers":13,"mean_commits":"27.076923076923077","dds":0.3039772727272727,"last_synced_commit":"56f86a6c3a903735aaa410a5a9d826b3df4964d3"},"previous_names":[],"tags_count":28,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/lemire%2FJavaFastPFOR","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/lemire%2FJavaFastPFOR/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/lemire%2FJavaFastPFOR/releases","manifests_url":"https://repos.ecosys
te.ms/api/v1/hosts/GitHub/repositories/lemire%2FJavaFastPFOR/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/lemire","download_url":"https://codeload.github.com/lemire/JavaFastPFOR/tar.gz/refs/heads/master","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":228175097,"owners_count":17880488,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["compression","java"],"created_at":"2024-10-11T21:28:20.088Z","updated_at":"2025-08-09T13:30:54.618Z","avatar_url":"https://github.com/lemire.png","language":"Java","readme":"JavaFastPFOR: A simple integer compression library in Java \n==========================================================\n [![][maven img]][maven] [![][license img]][license] [![docs-badge][]][docs]\n[![Java CI](https://github.com/lemire/JavaFastPFOR/actions/workflows/basic.yml/badge.svg)](https://github.com/lemire/JavaFastPFOR/actions/workflows/basic.yml)\n\n\nLicense\n-------\n\nThis code is released under the\nApache License Version 2.0 http://www.apache.org/licenses/.\n\n\nWhat does this do?\n------------------\n\nIt is a library to compress and uncompress arrays of integers \nvery fast. The assumption is that most (but not all) values in\nyour array use much less than 32 bits, or that the gaps between\nthe integers use much less than 32 bits. Such arrays often come up\nwhen using differential coding in databases and information\nretrieval (e.g., in inverted indexes or column stores).\n\nPlease note that random integers are not compressible, by this\nlibrary or by any other means. 
If you ever had the means of\nsystematically compressing random integers, you could compress\nany data source to nothing, by recursive application of your technique. \n\nThis library can decompress integers at a rate of over 1.2 billion integers per second\n(4.5 GB/s). It is significantly faster than generic codecs (such\nas Snappy, LZ4 and so on) when compressing arrays of integers.\n\nThe library is used in [LinkedIn Pinot](https://github.com/linkedin/pinot), a realtime distributed OLAP datastore.\nPart of this library has been integrated in Parquet (http://parquet.io/).\nA modified version of the library is included in the search engine \nTerrier (http://terrier.org/). This library is used by ClueWeb \nTools (https://github.com/lintool/clueweb). It is also used by [Apache NiFi](https://nifi.apache.org).\n\nThis library inspired a compression scheme used by Apache Lucene and Apache Lucene.NET (e.g., see\nhttp://lucene.apache.org/core/4_6_1/core/org/apache/lucene/util/PForDeltaDocIdSet.html ).\n\nIt is a Java port of the FastPFor C++ library (https://github.com/lemire/FastPFor). \nThere is also a Go port (https://github.com/reducedb/encoding). The C++\nlibrary is used by the zsearch engine (http://victorparmar.github.com/zsearch/)\nas well as in GMAP and GSNAP (http://research-pub.gene.com/gmap/).\n\n\nUsage\n------\n\nReally simple usage:\n\n```java\n        IntegratedIntCompressor iic = new IntegratedIntCompressor();\n        int[] data = ... 
; // to be compressed\n        int[] compressed = iic.compress(data); // compressed array\n        int[] recov = iic.uncompress(compressed); // equal to data\n```\n\nFor more examples, see example.java or the examples folder.\n\nJavaFastPFOR supports compressing and uncompressing data in chunks (e.g., see ``advancedExample`` in [example.java](https://github.com/lemire/JavaFastPFOR/blob/master/example.java)).\n\nSome CODECs (\"integrated codecs\") assume that the integers are\nin sorted order and use differential coding (they compress deltas). \nThey can be found in the package me.lemire.integercompression.differential.\nMost others do not.\n\nThe Java Team at Intel (R) introduced a vectorized implementation of FastPFOR\nbased on the Java Vector API, which showed significant gains over the\nnon-vectorized implementation. For example usage, see\nexamples/vector/Example.java. The feature requires JDK 19+ and is currently for \nadvanced users.\n\nMaven central repository\n------------------------\n\nUsing this code in your own project is easy with Maven: just add\nthe following code to your pom.xml file:\n\n```xml\n    \u003cdependencies\u003e\n         \u003cdependency\u003e\n\t     \u003cgroupId\u003eme.lemire.integercompression\u003c/groupId\u003e\n\t     \u003cartifactId\u003eJavaFastPFOR\u003c/artifactId\u003e\n\t     \u003cversion\u003e[0.2,)\u003c/version\u003e\n         \u003c/dependency\u003e\n     \u003c/dependencies\u003e\n```\n\nNaturally, you should replace \"version\" with the version\nyou want.\n\n\n\nYou can also download JavaFastPFOR from the Maven central repository:\nhttp://repo1.maven.org/maven2/me/lemire/integercompression/JavaFastPFOR/\n\n\nWhy?\n----\n\nWe found no library that implemented state-of-the-art integer coding techniques\nsuch as Binary Packing, NewPFD, OptPFD, Variable Byte, Simple 9 and so on in Java.\nWe wrote one. 
\n\nThread safety \n----\n\nSome codecs are thread-safe while others are not.\nFor this reason, it is best to use one codec per thread.\nThe memory usage of a codec instance is small in any case.\n\nNevertheless, if you want to share codec instances across threads, \nnote that, by convention, a codec can be assumed to be thread-safe\nunless its documentation specifies that it is not.\n\nAuthors\n-------\n\nMain contributors\n* Daniel Lemire, http://lemire.me/en/\n* Muraoka Taro, https://github.com/koron\n\nwith contributions by \n* the Terrier team (Matteo Catena, Craig Macdonald, Saúl Vargas and Iadh Ounis)\n* Di Wu, http://www.facebook.com/diwu1989\n* Stefan Ackermann, https://github.com/Stivo\n* Samit Roy, https://github.com/roysamit\n* Mulugeta Mammo, https://github.com/mulugetam (for VectorFastPFOR)\n\nHow does it compare to the Kamikaze PForDelta library?\n------------------------------------------------------\n\nIn our tests, Kamikaze PForDelta is slower than our implementations. See\nthe benchmarkresults directory for some results. \n\nhttps://github.com/lemire/JavaFastPFOR/blob/master/benchmarkresults/benchmarkresults_icore7_10may2013.txt\n\n\nReference:\n http://sna-projects.com/kamikaze/\n\n\n\nRequirements\n------------\n\nReleases up to 0.1.12 require Java 7 or better.\n\nThe current development versions assume JDK 11 or better.\n\n\n\nHow fast is it?\n---------------\n\nCompile the code and execute `me.lemire.integercompression.benchmarktools.Benchmark`.\n\nSpeed is always reported in millions of integers per second.\n\n\nFor Maven users\n---------------\n\n\n```\nmvn compile\nmvn exec:java\n```\n\nYou may run our examples as follows:\n\n```\nmvn package\njavac -cp target/classes/:. example.java\njava -cp target/classes/:. 
example\n```\n\nFor ant users (legacy, currently untested)\n-------------\n\nIf you use Apache ant, please try this:\n\n    $ ant Benchmark\n\nor:\n\n    $ ant Benchmark -Dbenchmark.target=BenchmarkBitPacking\n\n\nAPI Documentation\n-----------------\n\nhttp://www.javadoc.io/doc/me.lemire.integercompression/JavaFastPFOR/\n\nWant to read more?\n------------------\n\nThis library was a key ingredient in the best paper at ECIR 2014 :\n\nMatteo Catena, Craig Macdonald, Iadh Ounis, On Inverted Index Compression for Search Engine Efficiency,  Lecture Notes in Computer Science 8416 (ECIR 2014), 2014.\nhttp://dx.doi.org/10.1007/978-3-319-06028-6_30\n\nWe wrote several research papers documenting many of the CODECs implemented here:\n\n* Daniel Lemire, Nathan Kurz, Christoph Rupp, Stream VByte: Faster Byte-Oriented Integer Compression, Information Processing Letters (to appear) https://arxiv.org/abs/1709.08990\n* Daniel Lemire, Leonid Boytsov, Nathan Kurz, SIMD Compression and the Intersection of Sorted Integers, Software Practice \u0026 Experience Volume 46, Issue 6, pages 723-749, June 2016 http://arxiv.org/abs/1401.6399\n* Daniel Lemire and Leonid Boytsov, Decoding billions of integers per second through vectorization, Software Practice \u0026 Experience 45 (1), 2015.  http://arxiv.org/abs/1209.2137 http://onlinelibrary.wiley.com/doi/10.1002/spe.2203/abstract\n* Jeff Plaisance, Nathan Kurz, Daniel Lemire, Vectorized VByte Decoding, International Symposium on Web Algorithms 2015, 2015. http://arxiv.org/abs/1503.07387\n* Wayne Xin Zhao, Xudong Zhang, Daniel Lemire, Dongdong Shan, Jian-Yun Nie, Hongfei Yan, Ji-Rong Wen, A General SIMD-based Approach to Accelerating Compression Algorithms, ACM Transactions on Information Systems 33 (3), 2015. http://arxiv.org/abs/1502.01916\n\n\nIkhtear Sharif wrote his M.Sc. thesis on this library:\n\nIkhtear Sharif, Performance Evaluation of Fast Integer Compression Techniques Over Tables, M.Sc. 
thesis, UNB 2013.\nhttps://unbscholar.lib.unb.ca/islandora/object/unbscholar%3A9399/datastream/PDF/view\n\nHe also posted his slides online: http://www.slideshare.net/ikhtearSharif/ikhtear-defense\n\nOther recommended libraries\n-----------------------------\n\n* Fast integer compression in Go: https://github.com/ronanh/intcomp\n* Encoding: Integer Compression Libraries for Go https://github.com/zhenjl/encoding\n* CSharpFastPFOR: A C#  integer compression library  https://github.com/Genbox/CSharpFastPFOR\n* TurboPFor is a C library that offers lots of interesting optimizations and Java wrappers. Well worth checking! (Uses a GPL license.) https://github.com/powturbo/TurboPFor\n\nFunding\n-----------\n\nThis work was supported by NSERC grant number 26143.\n\n\n[maven img]:https://maven-badges.herokuapp.com/maven-central/me.lemire.integercompression/JavaFastPFOR/badge.svg\n[maven]:http://search.maven.org/#search%7Cga%7C1%7Cg%3A%22me.lemire.integercompression%22%20\n\n[license]:LICENSE\n[license img]:https://img.shields.io/badge/License-Apache%202-blue.svg\n\n[docs-badge]:https://img.shields.io/badge/API-docs-blue.svg?style=flat-square\n[docs]:http://www.javadoc.io/doc/me.lemire.integercompression/JavaFastPFOR/\n","funding_links":[],"categories":[],"sub_categories":[],"project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Ffast-pack%2FJavaFastPFOR","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Ffast-pack%2FJavaFastPFOR","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Ffast-pack%2FJavaFastPFOR/lists"}