{"id":13726115,"url":"https://github.com/mirage/decompress","last_synced_at":"2025-04-05T04:12:02.370Z","repository":{"id":31167685,"uuid":"34727873","full_name":"mirage/decompress","owner":"mirage","description":"Pure OCaml implementation of Zlib.","archived":false,"fork":false,"pushed_at":"2025-01-08T11:37:40.000Z","size":4946,"stargazers_count":117,"open_issues_count":8,"forks_count":21,"subscribers_count":9,"default_branch":"main","last_synced_at":"2025-03-29T03:08:13.007Z","etag":null,"topics":["compression","decompression","deflate","huffman","inflate","lz77","ocaml","zlib"],"latest_commit_sha":null,"homepage":"","language":"OCaml","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/mirage.png","metadata":{"files":{"readme":"README.md","changelog":"CHANGES.md","contributing":null,"funding":null,"license":"LICENSE.md","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":"AUTHORS.md","dei":null,"publiccode":null,"codemeta":null}},"created_at":"2015-04-28T11:59:08.000Z","updated_at":"2025-03-11T03:35:59.000Z","dependencies_parsed_at":"2024-06-19T02:51:41.625Z","dependency_job_id":"4a1f1047-619c-4b9a-897f-b1ada64c41f1","html_url":"https://github.com/mirage/decompress","commit_stats":null,"previous_names":[],"tags_count":23,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/mirage%2Fdecompress","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/mirage%2Fdecompress/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/mirage%2Fdecompress/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/mirage%2Fdecompress/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitH
ub/owners/mirage","download_url":"https://codeload.github.com/mirage/decompress/tar.gz/refs/heads/main","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":247284951,"owners_count":20913704,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["compression","decompression","deflate","huffman","inflate","lz77","ocaml","zlib"],"created_at":"2024-08-03T01:02:52.974Z","updated_at":"2025-04-05T04:12:02.352Z","avatar_url":"https://github.com/mirage.png","language":"OCaml","readme":"# Decompress - Pure OCaml implementation of decompression algorithms\n\n`decompress` is a library which implements:\n- [RFC1951](https://tools.ietf.org/html/rfc1951)\n- [Zlib](https://zlib.net/)\n- [Gzip](https://tools.ietf.org/html/rfc1952)\n- [LZO](https://en.wikipedia.org/wiki/Lempel%E2%80%93Ziv%E2%80%93Oberhumer)\n\n## The library\n\nThe library is available with:\n```\n$ opam install decompress\n```\n\nIt provides four sub-packages:\n- `decompress.de` to handle RFC1951 streams\n- `decompress.zl` to handle Zlib streams\n- `decompress.gz` to handle Gzip streams\n- `decompress.lzo` to handle LZO contents\n\nEach sub-package provides 3 sub-modules:\n- `Inf` to inflate/decompress a stream\n- `Def` to deflate/compress a stream\n- `Higher` as an easy entry point to use the stream\n\n## How to use it\n\n### The binary\n\nThe distribution provides a simple binary which is able to compress/uncompress\nanything:\n```sh\n$ decompress -fgzip --deflate \u003c my_document.txt \u003e my_document.gzip\n$ decompress -fgzip \u003c my_document.gzip \u003e my_document.out\n$ diff 
my_document.txt my_document.out\n```\n\nIt supports Gzip, Zlib and DEFLATE compression. It can do\nLZO compression too.\n\n### Link issue\n\n`decompress` uses [`checkseum`][checkseum] to compute the CRC of streams.\n`checkseum` provides 2 implementations:\n- a C implementation to be fast\n- an OCaml implementation to be usable with `js_of_ocaml` (or, at least,\n  require only the _caml runtime_)\n\nWhen building an OCaml executable, the user must choose which\nimplementation of `checkseum` to link. Compiling an executable with\n`decompress.zl` looks like:\n```\n$ ocamlfind opt -linkpkg -package checkseum.c,decompress.zl main.ml\n```\n\nOtherwise, the end-user gets a linking error (see\n[#47](https://github.com/mirage/decompress/issues/47)).\n\n#### With `dune`\n\n`checkseum` uses a mechanism integrated into `dune` which solves the link\nissue. It provides a way to silently choose the default implementation of\n`checkseum`: `checkseum.c`.\n\nThis way (and only with `dune`), an executable using `decompress.zl` is declared as:\n```\n(executable\n (name main)\n (libraries decompress.zl))\n```\n\nOf course, the user can still choose which implementation to use:\n```\n(executable\n (name main)\n (libraries checkseum.ocaml decompress.zl))\n```\n\n### The API\n\n`decompress` gives the user full control of:\n- the input/output loop\n- the allocation\n\n#### Input / Output\n\nInflation/deflation is non-blocking and does not require\nany _syscalls_ (as is usual for a MirageOS project). 
The user can decide how to get\nthe input and how to store the output.\n\nA typical _loop_ (which can fit into `lwt` or `async`) of `decompress.zl` is:\n```ocaml\nlet rec go decoder = match Zl.Inf.decode decoder with\n  | `Await decoder -\u003e\n    let len = input itmp 0 (Bigstringaf.length itmp) in\n    go (Zl.Inf.src decoder itmp 0 len)\n  | `Flush decoder -\u003e\n    let len = Bigstringaf.length otmp - Zl.Inf.dst_rem decoder in\n    output stdout otmp 0 len ;\n    go (Zl.Inf.flush decoder)\n  | `Malformed err -\u003e invalid_arg err\n  | `End decoder -\u003e\n    let len = Bigstringaf.length otmp - Zl.Inf.dst_rem decoder in\n    output stdout otmp 0 len in\ngo decoder\n```\n\n#### Allocation\n\nThe process does not allocate large objects itself; it requires them\nat initialisation. Such objects can be re-used by another\ninflation/deflation process; of course, these processes cannot use the same\nobjects at the same time.\n\n```ocaml\nval decompress : window:De.window -\u003e in_channel -\u003e out_channel -\u003e unit\n\nlet w0 = De.make_windows ~bits:15\n\n(* Safe use of decompress *)\nlet () =\n  decompress ~window:w0 stdin stdout ;\n  decompress ~window:w0 (open_in \"file.z\") (open_out \"file\")\n\n(* Unsafe use of decompress,\n   the second process must use another pre-allocated window. 
*)\nlet () =\n  Lwt_main.run @@\n    Lwt.join [ (decompress ~window:w0 stdin stdout |\u003e Lwt.return)\n             ; (decompress ~window:w0 (open_in \"file.z\") (open_out \"file\")\n\t       |\u003e Lwt.return) ]\n```\n\nThis re-use applies to:\n- the input buffer given to the encoder/decoder with `src`\n- the output buffer given to the encoder/decoder\n- the window given to the encoder/decoder\n- the shared-queue used by the compression algorithm and the encoder\n\n### Example\n\nAn example is available in [bin/decompress.ml][decompress.ml] where you can see how\nto use `decompress.zl` and `decompress.de`.\n\n### Higher interface\n\n`decompress` also provides a _higher_ interface, close to what `camlzip`\nprovides, to help newcomers use `decompress`:\n```ocaml\nval compress :\n     refill:(bigstring -\u003e int)\n  -\u003e flush:(bigstring -\u003e int -\u003e unit)\n  -\u003e unit\nval uncompress :\n     refill:(bigstring -\u003e int)\n  -\u003e flush:(bigstring -\u003e int -\u003e unit)\n  -\u003e unit\n```\n\n### Benchmark\n\n`decompress` has an _inflation_ benchmark to check whether an update has\nperformance implications. The process tries to _inflate_ a stream and stops after N\nsecond(s) (default is 30). The benchmark requires `libzlib-dev`, `cmdliner` and\n`bos` to be able to compile `zpipe` and the executable to produce the CSV file.\nTo build the benchmark:\n\n```sh\n$ dune build --profile benchmark bench/output.csv\n```\n\nOn Linux machines, `/dev/urandom` will generate the random input for piping to\nzpipe. To run the benchmark:\n```sh\n$ cat /dev/urandom | ./_build/default/bench/zpipe \\\n  | ./_build/default/bench/bench.exe 2\u003e /dev/null\n```\n\nThe output file is a CSV file which can be processed by _plotting_ software. It\nrecords input bytes, output bytes and memory usage each second. 
You can\nshow results with `gnuplot`:\n```sh\n$ gnuplot -p -e \\\n  'set datafile separator \",\";\n   set key autotitle columnhead;\n   plot \"_build/default/bench/output.csv\" using 1:2 with lines,\n        \"\" using 1:3 with lines'\n$ gnuplot -p -e \\\n  'set datafile separator \",\";\n   set key autotitle columnhead;\n   plot \"_build/default/bench/output.csv\" using 1:4 with lines'\n```\n\nThe second graph shows that inflation does not allocate while it\nprocesses. It also confirms that, at another layer, `decompress` does not leak\nmemory.\n\n## Build Requirements\n\n * OCaml \u003e= 4.07.0\n * `dune` to build the project\n * `base-bytes` meta-package\n * `checkseum`\n * `optint`\n\n[checkseum]: https://github.com/mirage/checkseum\n[decompress.ml]: ./bin/decompress.ml\n","funding_links":[],"categories":["Algorithms and Data Structures","OCaml"],"sub_categories":[],"project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fmirage%2Fdecompress","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fmirage%2Fdecompress","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fmirage%2Fdecompress/lists"}