{"id":18974547,"url":"https://github.com/m7a/bo-lz4-ada","last_synced_at":"2025-10-06T14:13:40.605Z","repository":{"id":164554574,"uuid":"594469597","full_name":"m7a/bo-lz4-ada","owner":"m7a","description":"Ada LZ4 Extractor Library","archived":false,"fork":false,"pushed_at":"2024-04-28T19:30:22.000Z","size":9374,"stargazers_count":0,"open_issues_count":0,"forks_count":0,"subscribers_count":1,"default_branch":"master","last_synced_at":"2025-06-02T13:17:45.697Z","etag":null,"topics":["ada","lz4","lz4-frame"],"latest_commit_sha":null,"homepage":"https://masysma.net/32/lz4_ada.xhtml","language":"Ada","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/m7a.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE.txt","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null,"zenodo":null}},"created_at":"2023-01-28T16:46:06.000Z","updated_at":"2024-04-28T19:30:26.000Z","dependencies_parsed_at":"2024-04-28T20:31:06.042Z","dependency_job_id":"9268ac81-c95e-4f32-90c6-454eba96b833","html_url":"https://github.com/m7a/bo-lz4-ada","commit_stats":null,"previous_names":[],"tags_count":0,"template":false,"template_full_name":null,"purl":"pkg:github/m7a/bo-lz4-ada","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/m7a%2Fbo-lz4-ada","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/m7a%2Fbo-lz4-ada/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/m7a%2Fbo-lz4-ada/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/m7a%2Fbo-lz4-ada/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/m7a","download_u
rl":"https://codeload.github.com/m7a/bo-lz4-ada/tar.gz/refs/heads/master","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/m7a%2Fbo-lz4-ada/sbom","scorecard":null,"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":278621892,"owners_count":26017262,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","status":"online","status_checked_at":"2025-10-06T02:00:05.630Z","response_time":65,"last_error":null,"robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":true,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["ada","lz4","lz4-frame"],"created_at":"2024-11-08T15:15:23.696Z","updated_at":"2025-10-06T14:13:40.577Z","avatar_url":"https://github.com/m7a.png","language":"Ada","funding_links":[],"categories":[],"sub_categories":[],"readme":"---\nsection: 32\nx-masysma-name: lz4_ada\ntitle: LZ4 Decompressor for Ada\ndate: 2022/12/14 20:49:20\nlang: en-US\nauthor: [\"Linux-Fan, Ma_Sys.ma (info@masysma.net)\"]\nkeywords: [\"lz4\", \"decompress\", \"ada\", \"library\"]\nx-masysma-version: 1.0.0\nx-masysma-website: https://masysma.net/32/lz4_ada.xhtml\nx-masysma-repository: https://www.github.com/m7a/bo-lz4-ada\nx-masysma-copyright: \"2022, 2023 Ma_Sys.ma \u003cinfo@masysma.net\u003e\"\n---\nAbstract\n========\n\nThis repository provides an Ada implementation of an LZ4 Decompressor\n(cf. 
\u003chttps://lz4.github.io/lz4/\u003e for general information on LZ4).\n\nThis implementation only supports decompression!\n\n * LZ4 Frame Format is supported according to specification v.1.6.3 (2022/09/12)\n * LZ4 Block Format is supported according to specification (2022/07/31)\n * Legacy Frame Format is supported.\n * Skippable Frames are supported.\n * Provided checksums are verified.\n * Provided length information is verified.\n * Concatenated frames can be decompressed.\n * Big Endian architectures are _UNSUPPORTED_.\n * Dictionaries are _UNSUPPORTED_.\n\nPerformance is slower compared to reference and C implementations: On my system,\nthe library decompresses slower than 1100 MiB/s. The Debian-supplied `unlz4`\nprogram, for instance, achieves over 3000 MiB/s. See section _Performance_.\n\nLicense\n=======\n\nThis library is available under the Expat (aka MIT) License.\nSee `LICENSE.txt` or `lz4ada.ads` for details.\n\nCompiling\n=========\n\nThe following dependencies are required for building:\n\n * Ada compiler (`gnat-10`, `gcc`)\n * Ant build tool (`ant`)\n\n## Compile\n\n\tant\n\n## Run Tests\n\nRunning the tests requires the following standard tools: `bc`, `time`\n\n\tant\n\t./test_run.sh\n\nNote: When running tests like the supplied test suite directly from the console\n(rather than using the script), run `ulimit -s 60000` first to increase the\nstack size!\n\n## Install\n\nIt is advisable to generate a package by means of `ant package`. Alternatively,\ninstall the library directly using the following commands (or similar depending\non your OS):\n\n\tinstall -DsT lib/liblz4ada.so /usr/local/lib/x86_64-linux-gnu\n\tinstall -m 644 -DT lib/lz4ada.ali /usr/local/lib/x86_64-linux-gnu/ada/adalib/lz4\n\tinstall -m 644 -DT lib/lz4ada.ads /usr/local/share/ada/adainclude/lz4\n\nThe following instructions assume that files are below `/usr/lib` rather than\n`/usr/local/lib`. 
If you use the commands above, adapt accordingly.\n\nRepository Structure\n====================\n\nThis repository contains multiple subdirectories for the various components of\nthe library.\n\n~~~\n/bo-lz4-ada\n   │\n   ├── lib/\n   │    │\n   │    ├── lz4ada.adb                *** This is the implementation. ***\n   │    │\n   │    ├── lz4ada.ads                *** Implementation header file. ***\n   │    │\n   │    └── build.xml                 Build instructions\n   │\n   ├── test_suite/                    Test Suite for LZ4 Library\n   │    │\n   │    ├── lz4test.adb               Test Suite Implementation\n   │    │\n   │    └── build.xml                 Build, run and coverage instructions\n   │\n   ├── test_vectors_lz4/              Sample Data for Testing\n   │    │\n   │    ├── ....bin                   Original data\n   │    │\n   │    ├── ....lz4                   Compressed data\n   │    │\n   │    ├── ....err                   Invalid (manipulated) LZ4 data\n   │    │\n   │    └── ....eds                   Expected error messages for .err files\n   │\n   ├── tool_unlz4ada/\n   │    │\n   │    └── unlz4ada.adb              Example of explicitly handling frames.\n   │\n   ├── tool_lz4hdrinfo/\n   │    │\n   │    └── lz4hdrinfo.adb            Debugging tool to decode LZ4 frame header.\n   │\n   ├── tool_unlz4ada_simple/\n   │    │\n   │    └── unlz4ada_simple.adb       Simple usage example for the library API.\n   │\n   ├── tool_xxhash32ada/\n   │    │\n   │    └── xxhash32ada.adb           Auxiliary tool to demonstrate computing\n   │                                  the XXHash32 Hash function.\n   │\n   ├── test_benchmark.sh              Script to invoke a minimal benchmark.\n   │\n   ├── test_run.sh                    Script to test against the test vectors.\n   │\n   ├── README.md                      This file.\n   │\n   ├── LICENSE.txt                    Expat license.\n   │\n   └── build.xml                      Recursive antfile build 
instructions.\n~~~\n\nThe important subdirectory regarding the library is `lib`. If you do not\nneed tests or example programs, it is sufficient to compile and use only the\nfiles from that directory.\n\nSample Program\n==============\n\nUnfortunately, given the interesting property of decompression that a small\ninput can produce a larger output and given that the library is intended to\nachieve decent performance, setting up a minimal example is already nontrivial.\n\nThe following code `unlz4ada_simple.adb` demonstrates a fully-working\nLZ4 decompression using the library. Like the reference implementation, it\ndecompresses any number of concatenated frames. It has a larger memory footprint\nthan the more complex implementation provided in directory `tool_unlz4ada`.\n\n~~~{.ada}\nwith Ada.Text_IO;\nwith Ada.Text_IO.Text_Streams;\nwith Ada.Streams;\nuse  Ada.Streams;\nwith LZ4Ada;\n\nprocedure UnLZ4Ada_Simple is\n\t-- 1.\n\tStdin:  constant access Root_Stream_Type'Class :=\n\t\tAda.Text_IO.Text_Streams.Stream(Ada.Text_IO.Standard_Input);\n\tStdout: constant access Root_Stream_Type'Class :=\n\t\tAda.Text_IO.Text_Streams.Stream(Ada.Text_IO.Standard_Output);\n\n\t-- 2.\n\tBuf_In: Stream_Element_Array(0 .. 4095); -- 4k buffer\n\tBuf_Sz: Stream_Element_Offset;\n\tCtx:    LZ4Ada.Decompressor := LZ4Ada.Init(Buf_Sz);\n\n\tLast:           Stream_Element_Offset := -1;\n\tTotal_Consumed: Stream_Element_Offset := 0;\n\tOutput_Buffer:  Stream_Element_Array(1 .. Buf_Sz);\n\n\tConsumed, Output_First, Output_Last: Stream_Element_Offset;\nbegin\n\t-- 3.\n\tloop\n\t\tif Total_Consumed \u003e Last then\n\t\t\tRead(Stdin.all, Buf_In, Last);\n\t\t\texit when Last \u003c 0;\n\t\t\tTotal_Consumed := 0;\n\t\tend if;\n\t\tCtx.Update(Buf_In(Total_Consumed .. Last), Consumed,\n\t\t\t\tOutput_Buffer, Output_First, Output_Last);\n\t\tWrite(Stdout.all, Output_Buffer(Output_First .. 
Output_Last));\n\t\tTotal_Consumed := Total_Consumed + Consumed;\n\tend loop;\n\t-- 4.\n\tif LZ4Ada.\"=\"(Ctx.Is_End_Of_Frame, LZ4Ada.No) then\n\t\traise Constraint_Error with \"Input ended mid-frame.\";\n\tend if;\nend UnLZ4Ada_Simple;\n~~~\n\nHere is how the sample program works:\n\n 1. `Stdin` and `Stdout` allow accessing the respective streams for binary\n    input/output. `Buf_In` defines an input buffer with an arbitrary size.\n    In this example, a 4 KiB buffer is allocated.\n 2. The example makes use of the library API that allows initialization without\n    supplying any data. This comes at the cost of allocating the buffer large\n    enough to process the largest LZ4 blocks which means that two 8 MiB buffers\n    are needed: One inside the library (as input buffer) and one external\n    as output buffer.\n 3. The data can now be processed in a loop:\n     * If all buffered input data has been processed already\n       (`Total_Consumed \u003e Last`, which also holds initially) then new data is\n       read from `Stdin`.\n     * `Ctx.Update` is invoked with the current chunk of input data to process.\n     * `Write` is called to output any data decompressed by the `Update` call.\n 4. After processing, condition `Ctx.Is_End_Of_Frame = No` is checked because:\n    If this is not the end of frame, processing has ended mid-frame and the\n    output is most likely to be incomplete. An alternative exception that could\n    be raised here instead of `Constraint_Error` is `Data_Corruption` as supplied\n    by the library. 
The variant shown has the advantage of not needing to use\n    any of the custom library types/exceptions except for the `Decompressor`\n    itself.\n\nUsing the installed Library\n===========================\n\nAssuming the library is already installed on your system, you can compile\nand run the sample program from subdirectory `tool_unlz4ada` as follows:\n\n\tgnatmake -o unlz4ada unlz4ada.adb \\\n\t\t-aO/usr/lib/x86_64-linux-gnu/ada/adalib/lz4 \\\n\t\t-aI/usr/share/ada/adainclude/lz4 \\\n\t\t-largs -llz4ada\n\tulimit -s 60000\n\t./unlz4ada \u003c ../test_vectors_lz4/z1.lz4 | xxd\n\nOutput: `00000000: 00                                       .`\n\nUsing the Library without Installation\n======================================\n\nIf the library is not installed on your system, it can be integrated using\nmultiple different approaches.\n\n## Easy Vendoring\n\nThe quickest way to get started is to just include the `lz4ada.ads` and\n`lz4ada.adb` files into the source tree.\n\n\tcp ../lib/lz4ada.ad? .\n\nHere is what the directory structure may look like then:\n\n~~~\n  ...\n   │\n   ├── tool_unlz4ada/\n   │    └── lz4ada.adb\n   │    └── unlz4ada.adb\n   │    └── lz4ada.ads\n   │\n  ...\n~~~\n\nCompilation and invocation then become trivial:\n\n\tgnatmake -o unlz4ada unlz4ada.adb\n\tulimit -s 60000\n\t./unlz4ada \u003c ../test_vectors_lz4/z1.lz4 | xxd\n\nOutput: `00000000: 00                                       .`\n\n## Inclusion from different directory\n\nIt may not be suitable to just copy-over the files. In this case, it is also\npossible to import the compiled library from a different directory. 
Assume\nthat the library is compiled but not installed; the file structure may then\nlook as follows:\n\n~~~\n  ...\n   │\n   ├── lib/\n   │    ├── lz4ada.adb\n   │    ├── lz4ada.ads\n   │    ├── lz4ada.ali\n   │    ├── build.xml\n   │    └── liblz4ada.so\n   │\n   ├── tool_unlz4ada/\n   │    ├── unlz4ada.adb\n   │    └── build.xml\n  ...\n~~~\n\nCompilation and invocation then have to account for the library not being\ninstalled as follows:\n\n\tgnatmake -o unlz4ada unlz4ada.adb -aO../lib -aI../lib -largs -llz4ada\n\tulimit -s 60000\n\tLD_LIBRARY_PATH=$PWD/../lib ./unlz4ada \u003c ../test_vectors_lz4/z1.lz4 | xxd\n\nDecompression API\n=================\n\nThis section describes the decompression API provided by this library. There is\nalso an API to make use of the XXHash32 checksum directly, see `XXHash32 API`\nfurther down. Some of the data types are shared by both APIs and are\ndescribed only here.\n\n## Types\n\n~~~{.ada}\nsubtype U8  is Interfaces.Unsigned_8;\nsubtype U32 is Interfaces.Unsigned_32;\nsubtype U64 is Interfaces.Unsigned_64;\ntype Octets is array (Integer range \u003c\u003e) of U8;\ntype End_Of_Frame is (Yes, No, Maybe);\ntype Flexible_Memory_Reservation is (SZ_64_KiB, SZ_256_KiB, SZ_1_MiB,\n\t\t\t\tSZ_4_MiB, SZ_8_MiB, Use_First, Single_Frame);\nsubtype Memory_Reservation is Flexible_Memory_Reservation range\n\t\t\t\t\t\t\tSZ_64_KiB .. SZ_8_MiB;\nFor_Modern: constant Memory_Reservation := SZ_4_MiB;\nFor_All:    constant Memory_Reservation := SZ_8_MiB;\ntype Decompressor(In_Last: Integer) is tagged limited private;\n~~~\n\n### `U8`, `U32`, `Octets`, `Decompressor`\n\n * `U8`: This type represents a single byte.\n * `U32`: This type represents a 32-bit word.\n * `Octets`: This type represents a byte string.\n   As an alternative to the `Octets` type defined by this library, the standard\n   `Stream_Element_Array` type can be used. 
In this case it is assumed that the\n   stream elements are indeed bytes (this is not required by the standard!).\n * `Decompressor`: This type represents an opaque context of operation.\n\n### `Memory_Reservation`\n\nA memory reservation is used to limit how much stack space the library allocates\nfor processing data. On “large” machines and to be able to process all kinds\nof LZ4 frames the default setting of 8 MiB (`For_All` constant) may be a good\nchoice.\n\nIf you are hitting memory limit issues but want to keep things simple, the\n`For_Modern` constant corresponding to 4 MiB buffers already halves the amount\nof memory needed only by sacrificing compatibility with legacy frames.\n\nFor cases where strict limits are needed, any of the `SZ_` enum values can be\nused as memory reservation. If input data is too large to be processed using the\nallocated buffers, exception `Too_Little_Memory` is raised to clearly indicate\nthis to the API user.\n\n### `Flexible_Memory_Reservation`\n\nWhile a constant value memory limit like `SZ_64_KiB` may be a good choice for\nsome (larger) embedded systems, there are also cases where the memory allocation\nshould follow the size of the input data. For these cases, a\n`Flexible_Memory_Reservation` can be used in conjunction with API function\n`Init_With_Header`. The following flexible memory reservations are available:\n\n * `Use_First`: Allocate memory according to the size of the first frame\n   processed.\n * `Single_Frame`: Like `Use_First` but additionally raises `Data_Corruption`\n   when data from concatenated frames is provided. This is useful in case the\n   API user wants to perform their own multi-frame handling.\n\nAs `Memory_Reservation` is a subtype of `Flexible_Memory_Reservation`, any\nconstant `Memory_Reservation` can also be passed to `Init_With_Header`.\n\n### `End_Of_Frame`\n\nThis type represents the state of processing. 
It is a tristate value\n`Yes/No/Maybe` with the following meaning:\n\n * `Yes` indicates that the frame has ended.\n * `No` indicates that a frame is being processed and the current state cannot\n   be the end yet.\n * `Maybe` indicates a special case: When a legacy frame is processed, the end\n   of data may occur after any full block. `Maybe` indicates such an end of\n   block that could be the end of the stream already. An application using the\n   library should use external knowledge about the end of input data to decide\n   whether this `Maybe` is an actual end of stream or just a temporary state\n   that occurs before the next block begins. If an application intends to parse\n   only modern frames, it could as well raise an exception upon reaching this\n   `Maybe` state.\n\n## Exceptions\n\nThe following UML-like diagram shows an overview of the exceptions provided\nby this library:\n\n~~~\n                   ┌─────────────────────────────────┐\n                   │ (cannot process the given data) │\n                   └──△────────────────────────────△─┘\n                      │                            │\n          ┌───────────┴─────┐                 ┌────┴────────────────────┐\n          │ Data_Corruption │                 │ (library usage related) │\n          └──△───────────△──┘                 └────△───────────────△────┘\n             │           │                         │               │\n┌────────────┴───┐ ┌─────┴─────────┐ ┌─────────────┴────────┐ ┌────┴──────────────┐\n│ Checksum_Error │ │ Not_Supported │ │ Too_Few_Header_Bytes │ │ Too_Little_Memory │\n└────────────────┘ └───────────────┘ └──────────────────────┘ └───────────────────┘\n~~~\n\nThe labels in parentheses serve as additional information and do not correspond\nto exceptions in the library.\n\n### Data Corruption related Exceptions\n\n * `Data_Corruption`:\n   This exception is raised whenever internal assumptions of the LZ4\n   frame or block format are violated. 
It indicates non-LZ4 or corrupted\n   input data. Additionally, this exception is raised when `Single_Frame`\n   operation was requested, but data for a follow-up frame is detected by the\n   library.\n * `Checksum_Error`:\n   This exception is raised whenever an LZ4 checksum does not match.\n   The library does not currently support bypassing the checksum verification.\n * `Not_Supported`:\n   This exception is raised whenever values are observed that the LZ4\n   specification reports as reserved. As such, the values could indicate\n   newer data formats/features being in use. As this need not be\n   corrupted data but could be a valid new extension of the format,\n   a dedicated `Not_Supported` exception is raised in this case.\n\n### Library Usage related Exceptions\n\nThese exceptions occur because certain restrictions (e.g. `Memory_Reservation`)\nare passed to the API in a way that contradicts what the actual supplied data\nrequires. If an application declares reservations in a way that they “should”\nbe OK, then any of these exceptions can be treated like a `Data_Corruption` or\n`Not_Supported`.\n\n * `Too_Few_Header_Bytes`:\n   This exception is only raised by function `Init_With_Header` and reports that\n   the provided header data input is too short to contain the entire LZ4 header.\n   Applications can use this to fall back to the `Init` without header or be\n   changed to provide a larger chunk of initial data to `Init_With_Header`.\n * `Too_Little_Memory`:\n   This exception is raised when an LZ4 frame header is processed and it\n   indicates a maximum block size that is larger than the buffer size provided\n   by the current `Decompressor` context. This means the data may be valid but\n   cannot be processed using the current context. API users encountering this\n   exception should consider using a larger memory reservation in `Init`.\n\n## API Rationale\n\nThis section describes some of the thoughts behind the API design. 
They may help\nAPI users understand the overall idea behind the API better and are less focused\non the API usage.\n\n### `Min_Buffer_Size` Requirement\n\nThe minimum output buffer size is at least a single block size. If the output\nbuffer could be chosen even smaller, internal computation would be\nmuch more complicated since it would be necessary to pause and resume output\nmid-buffer. Allowing the routines to assume that there is enough space for at\nleast _one_ block makes the handling less complicated without impacting\nperformance.\n\n### About `Buffer` and `Num_Consumed`\n\nA small input may lead to a large output. To avoid using unbounded memory\namounts one must limit the output buffer size. The library implementation\nensures this by passing a target buffer as an `in out` parameter rather than a\nreturn value. A supplied input's decompressed size may exceed the output buffer\ncapacity. In order to allow output for the remainder of the input to be\ngenerated, it may become necessary to supply part of the same input again. This\nis achieved by signalling the number of consumed bytes back to the caller\nas `Num_Consumed`.\n\nIn addition to the output data block, the buffer also contains history\ninformation that is used for “backreferences” during decompression. Storing the\nhistory information in the same buffer requires it to be an `in out` parameter\nthat must not be changed between invocations of the `Update` procedure. While\nthis slightly undercuts encapsulation, it can have a notable performance impact:\nAs all data from the history has been an output at some time, it makes sense to\ntry storing it only once, i.e. as output, without keeping a separate copy as\n“history”. As a result, fewer copy operations are necessary, yielding better\noverall performance.\n\n### No `Final`\n\nThere is no need for a “Final” procedure: For non-legacy frames, LZ4 clearly\nindicates when processing has reached the end of the frame. 
For legacy frames,\nthe library reports the end of block as `Maybe` end of frame.\n\nThis makes `Final` sort of an optional check to see if the end of input data\ncorrelates with LZ4's understanding of the end of frame. The recommended way\nto perform this check in applications that include support for legacy frames\nis as follows:\n\n~~~{.ada}\nif LZ4Ada.\"=\"(Ctx.Is_End_Of_Frame, LZ4Ada.No) then\n\traise Constraint_Error with \"Input ended mid-frame.\";\nend if;\n~~~\n\nAs an alternative to `Constraint_Error`, other exception types might be\nsuitable, e.g. the library-supplied `Data_Corruption` is also a sensible choice\nhere.\n\n### Multi-Frame Handling\n\nThe library supports a “simple” usage where it internally detects multiple\nconcatenated frames and processes them in sequence. This is the behaviour of the\ncommand-line `unlz4` tool and probably well-suited for a wide range of\nLZ4 decompression problems.\n\nAs LZ4 frames can contain additional information in “skippable frames”, it\nalso makes sense to provide a means for the application to detect the end of\nframes and possibly process some special variants by itself. To enable this,\nthe `Init_With_Header` API can be supplied a memory reservation `Single_Frame`\nto enable single-frame mode of operation. Then, only one LZ4 frame is processed\nusing the context and data beyond that can be processed differently by the API\nuser or a new context can be allocated for the next frame. 
This mode can also be\nused to limit the memory allocation to what is needed for the specific,\ncurrently processed frame.\n\nAs the use of this API is more complicated (compare the example under\n`tool_unlz4ada`), the more conveniently usable `Init` (without header) API is\nprovided, too.\n\n## Functions and Procedures\n\nA detailed description of the API functions follows after the overview excerpt\nfrom `lz4ada.ads`.\n\n~~~{.ada}\nfunction Init(Min_Buffer_Size:   out    Stream_Element_Offset;\n                Reservation:     in     Memory_Reservation := For_All)\n                return Decompressor;\nfunction Init(Min_Buffer_Size:   out    Integer;\n                Reservation:     in     Memory_Reservation := For_All)\n                return Decompressor;\n\nfunction Init_With_Header(Input: in     Octets;\n                Num_Consumed:    out    Integer;\n                Min_Buffer_Size: out    Integer;\n                Reservation:     in     Flexible_Memory_Reservation\n                                                        := Single_Frame)\n                return Decompressor with Pre =\u003e Input'Length \u003e= 7;\n\nfunction Init_For_Block(Min_Buffer_Size:   out Integer;\n\t\t\tCompressed_Length: in  Integer;\n\t\t\tReservation:       in  Memory_Reservation := For_All)\n\t\treturn Decompressor;\n\nprocedure Update(Ctx:            in out Decompressor;\n                Input:           in     Stream_Element_Array;\n                Num_Consumed:    out    Stream_Element_Offset;\n                Buffer:          in out Stream_Element_Array;\n                Output_First:    out    Stream_Element_Offset;\n                Output_Last:     out    Stream_Element_Offset);\nprocedure Update(Ctx:            in out Decompressor;\n                Input:           in     Octets;\n                Num_Consumed:    out    Integer;\n                Buffer:          in out Octets;\n                Output_First:    out    Integer;\n                Output_Last:     out    
Integer)\n                with Pre =\u003e (Buffer'First = 0);\n\nfunction Is_End_Of_Frame(Ctx: in Decompressor) return End_Of_Frame;\n~~~\n\n### `function Init(Min_Buffer_Size: out; Reservation: in) return Decompressor`\n\nThis function initializes a new Decompressor without having to provide initial\nheader data.\n\nThis Decompressor accepts multiple concatenated frames in sequence as long\nas they fit the `Memory_Reservation`. For the meaning of the different memory\nreservations, check the documentation for the `Memory_Reservation` data type\nabove.\n\nThis is a convenient API for cases where either memory consumption does not\nmatter much (e. g. on Desktop OSes) or where the upper bound for the maximum\nblock size that is going to be used is known in advance.\n\nThis function can be called with either `Integer` as number type or\n`Stream_Element_Offset` to directly allow passing the `Min_Buffer_Size` to\nan array declaration.\n\n### `function Init_With_Header(Input: in; Num_Consumed: out; Min_Buffer_Size: out; Reservation: in)`\n\nThis function creates an LZ4 decompression context. It requires the beginning of\nthe frame to be supplied as `Input` and returns the decompression context.\n\nAdditionally, it outputs as `Num_Consumed` how many of the supplied bytes it\nprocessed and as `Min_Buffer_Size` the suggested buffer size to be used for\n`Update` calls. The bytes consumed by `Init_With_Header` must not be sent to\nsubsequent `Update` calls, i. e. if there is still some data left in the input\nbuffer, then the first `Update` call is expected to take `Input(Num_Consumed .. 
Input'Last)`\nrather than `Input` directly.\n\nThis function can only be called using `Octets` as data type and `Integer` as\nnumber type because it is a low-level API that is expected to be only needed\nin rare cases.\n\nIf too little input data is supplied to process the entire header,\nexception `Too_Few_Header_Bytes` is raised.\n\n### `function Init_For_Block(Min_Buffer_Size: out; Compressed_Length: in; Reservation: in)`\n\nThis function creates an LZ4 decompression context intended to decompress just\na single block of compressed data of compressed length `Compressed_Length`. This\ncan be used to decompress the raw LZ4 block format.\n\nIt is advisable to prefer the frame format over the block format since it allows\nstoring useful metadata and using a single decompressor to decompress data of\narbitrary length.\n\n### `procedure Update(Ctx: in out; Input: in; Num_Consumed: out; Buffer: in out; Output_First: out; Output_Last: out)`\n\nThis procedure can be called to decompress data.\n\n`Ctx` and `Buffer` together form the context that is expected to be provided\neach time data from the same LZ4 stream is to be decompressed.\n\n`Input` must always point to previously unprocessed data. Check the value of\n`Num_Consumed` after each invocation to find out how many of the input bytes\nmust be skipped (e. g. 
by using slice notation) when invoking `Update` again on\nthe same `Input` buffer.\n\n`Buffer` must not be modified between invocations of `Update` since it is used\nto hold “history” information about previously produced output that is integral\nto the decompression process.\n\n`Output_First` marks the index of the first octet of the decompressed data in\nthe `Buffer` (inclusive).\n\n`Output_Last` marks the index of the last octet of the decompressed data in the\noutput buffer (inclusive).\n\nIf no output was generated, then `Output_Last` is smaller than `Output_First`.\n\nThe procedure exists in two variants: One with `Octets` and `Integer` and one\nwith `Stream_Element_Array` and `Stream_Element_Offset` types for ease of\nintegration with the Standard Stream APIs.\n\nWhen used in the LZ4 `Block` mode and given the entire block input data, this\nprocedure is guaranteed to produce the entire output for that block in a single\nrun.\n\n### `function Is_End_Of_Frame(Ctx: in) return End_Of_Frame;`\n\nThis function returns the “end-of-frame” state of the decompressor. See the\n`End_Of_Frame` type's description for the meanings of the values.\n\nApplications are expected to check this value, but depending on the\nimplementation and the formats it intends to support, this check can happen at\ndifferent times:\n\n * One variant is to check the value at the end of processing only\n   (see example code `unlz4ada_simple.adb`). If end of frame is not checked\n   explicitly, it may be reported by a `Data_Corruption` exception in case an\n   implementation which only expects to process a single LZ4 frame is passed\n   data consisting of multiple frames.\n * Alternatively, the value can be checked after each invocation of `Update`\n   for cases where the application intends to decode multiple, consecutive\n   LZ4 frames by itself. An example of this variant can be found in file\n   `tool_unlz4ada/unlz4ada.adb`.\n * Some of the API uses may even work without calling the function at all. 
This\n   is a viable option if the decompressed data's validity is checked externally\n   (e.g. by means of a cryptographic hash function) and there might thus be no\n   need to find out if decompression was completed at the time of decompression\n   already.\n\nXXHash32 API\n============\n\nThe subpackage `LZ4Ada.XXHash32` provides access to the `XXHash32` hash\nfunction which is used internally by LZ4 but might be interesting to be used\nin other contexts, too. Since I expect such use cases to be niche, no API using\nthe `Stream_Element_Array` types is provided here.\n\n~~~{.ada}\npackage XXHash32 is\n\ttype Hasher is tagged limited private;\n\tfunction  Hash(Input: in Octets) return U32;\n\tfunction  Init(Seed: in U32 := 0) return Hasher;\n\tprocedure Reset(Ctx: in out Hasher; Seed: in U32 := 0);\n\tprocedure Update(Ctx: in out Hasher; Input: in Octets);\n\tfunction  Final(Ctx: in Hasher) return U32;\nprivate\n\t-- ...\nend XXHash32;\n~~~\n\nType `Hasher` represents the internal computation context of the hash function,\nsuitable for computing a single `XXHash32`. 
To compute the hash of different\ndata, API users can either create a new `Hasher` context or call `Reset` on an\nexisting one.\n\n## Functions and Procedures\n\nThe hashing API functions closely resemble the standard pattern of\n`Init/Update/Final` with an additional convenience function that can be used\nto perform the entire computation in a single step:\n\n### `function Hash(Input: in) return U32;`\n\nAs a convenient means to compute the hash over input data without having to\ncall any of the `Init/Update/Final` routines, function `Hash` can be used to\ncompute the hash over the given `Input` data in a single step.\n\nIn case the input data is large, it might be better to design using the\n`Init/Update/Final` set of functions since those allow processing arbitrarily\nlong data whereas `Hash` expects all of the input to be present in memory\nat once.\n\n### `function Init(Seed: in) return Hasher;`\n\nThis function creates and returns a `Hasher` instance using the supplied seed\nvalue (default: 0).\n\n### `procedure Reset(Ctx: in out; Seed: in);`\n\nResets a `Hasher` to the same state as a fresh `Init`. 
This makes it possible\nto reinitialize a hasher even though `Hasher` is a limited type and thus cannot\nbe reassigned.\n\n### `procedure Update(Ctx: in out; Input: in);`\n\nUse this procedure to supply the input data to the hash function.\n\n### `function Final(Ctx: in) return U32;`\n\nThis function returns the 32-bit hash of the concatenation of all data supplied\nwith `Update` to the provided `Hasher` context.\n\nIt is possible to call `Update` on the same context again afterwards to append\ninput data and then invoke `Final` again to obtain the hash of the concatenation\nof all preceding plus the newly added input data.\n\nPerformance\n===========\n\nOn my system (Intel Xeon W-2295, inside a test VM), the following decompression\nspeeds are observed when running the `test_benchmark.sh` script:\n\n\t$ ./test_benchmark.sh -h\n\tbenchmark zeroes\n\tBenchmark 1: ./tool_unlz4ada/unlz4ada \u003c /tmp/zeroes.lz4\n\t  Time (mean ± σ):     978.1 ms ±  37.7 ms    [User: 971.0 ms, System: 6.7 ms]\n\t  Range (min … max):   939.5 ms … 1143.0 ms    50 runs\n\t\n\tbenchmark reference zeroes\n\tBenchmark 1: unlz4 \u003c /tmp/zeroes.lz4\n\t  Time (mean ± σ):     739.3 ms ±  23.9 ms    [User: 723.5 ms, System: 15.5 ms]\n\t  Range (min … max):   696.5 ms … 799.2 ms    50 runs\n\t \n\tbenchmark random\n\tBenchmark 1: ./tool_unlz4ada/unlz4ada \u003c /tmp/random.lz4\n\t  Time (mean ± σ):      1.847 s ±  0.038 s    [User: 1.505 s, System: 0.342 s]\n\t  Range (min … max):    1.788 s …  1.961 s    50 runs\n\t \n\tbenchmark reference random\n\tBenchmark 1: unlz4 \u003c /tmp/random.lz4\n\t  Time (mean ± σ):     649.2 ms ±  19.9 ms    [User: 378.6 ms, System: 270.2 ms]\n\t  Range (min … max):   616.1 ms … 701.8 ms    50 runs\n\t \n\tbenchmark text\n\tBenchmark 1: ./tool_unlz4ada/unlz4ada \u003c /tmp/text.lz4\n\t  Time (mean ± σ):      1.863 s ±  0.043 s    [User: 1.523 s, System: 0.339 s]\n\t  Range (min … max):    1.791 s …  1.994 s    50 runs\n\t \n\tbenchmark reference 
text\n\tBenchmark 1: unlz4 \u003c /tmp/text.lz4\n\t  Time (mean ± σ):     644.6 ms ±  20.2 ms    [User: 372.1 ms, System: 272.2 ms]\n\t  Range (min … max):   604.5 ms … 689.2 ms    50 runs\n\t\n\t$ ./test_benchmark.sh\n\tbenchmark zeroes\n\t0+32768 records in\n\t0+32768 records out\n\t2147483648 bytes (2.1 GB, 2.0 GiB) copied, 1.4438 s, 1.5 GB/s\n\t\n\tbenchmark reference zeroes\n\t0+32897 records in\n\t0+32897 records out\n\t2147483648 bytes (2.1 GB, 2.0 GiB) copied, 1.52586 s, 1.4 GB/s\n\t\n\tbenchmark random\n\t0+32768 records in\n\t0+32768 records out\n\t2147483648 bytes (2.1 GB, 2.0 GiB) copied, 3.07091 s, 699 MB/s\n\t\n\tbenchmark reference random\n\t0+65303 records in\n\t0+65303 records out\n\t2147483648 bytes (2.1 GB, 2.0 GiB) copied, 1.02479 s, 2.1 GB/s\n\t\n\tbenchmark text\n\t0+32768 records in\n\t0+32768 records out\n\t2147483648 bytes (2.1 GB, 2.0 GiB) copied, 3.20152 s, 671 MB/s\n\t\n\tbenchmark reference text\n\t0+33049 records in\n\t0+33049 records out\n\t2147483648 bytes (2.1 GB, 2.0 GiB) copied, 0.973975 s, 2.2 GB/s\n\nComputing the throughput from these measurements yields the following results.\nThe `DD` column is derived from the `dd` timings, the `Avg`, `Min` and `Max`\ncolumns from the `hyperfine` timings. `PercOfRef` gives the speed of the Ada\nimplementation relative to the reference: the `Ada` rows hold the ratio computed\nfrom the `hyperfine` means, the `Ref.` rows the ratio computed from the `dd`\ntimings.\n\nCase    Ref/Ada  DD [MiB/s]  Avg [MiB/s]  Min/6σ [MiB/s]  Max/6σ [MiB/s]  PercOfRef [%]\n------  -------  ----------  -----------  --------------  --------------  -------------\nZero    Ada      1418        2094         1701            2724            76\nZero    Ref.     1342        2770         2320            3437            106\nRandom  Ada      667         1109         987             1265            35\nRandom  Ref.     1998        3155         2665            3866            33\nText    Ada      640         1099         966             1276            35\nText    Ref.     2103        3177         2674            3913            30\n\nRaw Computation\n\n\t“Zero” Row\n\t2048/(1.4438);\n\t2048/(0.9781);\n\t2048/(0.9781+6*0.0377);\n\t2048/(0.9781-6*0.0377);\n\t(1/0.9781)/(1/0.7393);\n\t2048/(1.52586);\n\t2048/(0.7393);\n\t2048/(0.7393+6*0.0239);\n\t2048/(0.7393-6*0.0239);\n\t(1/1.4438)/(1/1.52586);\n\t\n\t“Random” Row\n\t2048/(3.07091);\n\t2048/(1.847);\n\t2048/(1.847+6*0.038);\n\t2048/(1.847-6*0.038);\n\t(1/1.847)/(1/0.6492);\n\t2048/(1.02479);\n\t2048/(0.6492);\n\t2048/(0.6492+6*0.0199);\n\t2048/(0.6492-6*0.0199);\n\t(1/3.07091)/(1/1.02479);\n\t\n\t“Text” Row\n\t2048/(3.20152);\n\t2048/(1.863);\n\t2048/(1.863+6*0.043);\n\t2048/(1.863-6*0.043);\n\t(1/1.863)/(1/0.6446);\n\t2048/(0.973975);\n\t2048/(0.6446);\n\t2048/(0.6446+6*0.0202);\n\t2048/(0.6446-6*0.0202);\n\t(1/3.20152)/(1/0.973975);\n\nShort Summary: For the random and text inputs, the Ada implementation attains\nabout one third of the speed of the Debian-supplied `unlz4` command; for the\nzeroes input, the gap is much smaller. In absolute figures, the slower cases\nstill reach around 1000 MiB/s when measured with `hyperfine`, which can be\nexpected to be enough for plenty of use cases. Measuring with `dd` consistently\nyields smaller throughputs compared to the `hyperfine` approach. This could be\nexplained by the fact that `dd` needs to perform a copy whereas `hyperfine` just\ndiscards the extracted output right away.\n\n## Performance vs. Safety -- Use of `pragma Suppress`\n\nDuring optimization, some performance-critical areas of the library turned out\nto be slowed down considerably by compiler-generated length and overflow\nchecks. In order to balance safety and performance, some checks are currently\ndisabled for the `Write_Output` procedure in the library. 
All other checks were\nleft in place.\n\nIf you need maximum safety even at the cost of a substantial performance\npenalty, feel free to comment out the following `pragma` directives in procedure\n`Write_Output` in file `lz4ada.adb`:\n\n~~~{.ada}\npragma Suppress(Length_Check);\npragma Suppress(Overflow_Check);\npragma Suppress(Index_Check);\npragma Suppress(Range_Check);\n~~~\n\nRationale and Usage Recommendation\n==================================\n\nThis library was created out of the need to process data from a Rust program\nthat encodes data in LZ4. Of course, if decoding is enough for your purposes,\nyou may as well use this library in new designs. However, I encourage you to\nconsider a more widely supported compression format such as LZMA instead. That\nway, better-maintained implementations that also cover the compression side\nmight be available in Ada.\n\nLinks to alternative compression/decompression implementations in Ada:\n\n * \u003chttps://unzip-ada.sourceforge.io/\u003e --\n   implements various compressors and decompressors natively in Ada.\n * \u003chttps://packages.debian.org/bullseye/libgnatcoll-lzma2\u003e --\n   an LZMA bindings library that is available in Debian.\n\nChanges\n=======\n\nFeel free to send patches with bugfixes or missing functionality directly to\n\u003cinfo@masysma.net\u003e. Include a note to confirm that you are OK with these\npatches being included under the Expat license and add your preferred copyright\nline to the patch or e-mail.\n\nPlease note that API breaks are only accepted if _very strong reasons_ exist to\nmotivate them.\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fm7a%2Fbo-lz4-ada","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fm7a%2Fbo-lz4-ada","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fm7a%2Fbo-lz4-ada/lists"}