{"id":16355503,"url":"https://github.com/waveform80/compression","last_synced_at":"2026-06-24T02:31:35.405Z","repository":{"id":142509220,"uuid":"436415243","full_name":"waveform80/compression","owner":"waveform80","description":"A bit of compression bikeshedding","archived":false,"fork":false,"pushed_at":"2022-03-09T13:25:22.000Z","size":2731,"stargazers_count":0,"open_issues_count":0,"forks_count":0,"subscribers_count":1,"default_branch":"master","last_synced_at":"2026-03-14T10:40:41.785Z","etag":null,"topics":[],"latest_commit_sha":null,"homepage":null,"language":"Jupyter Notebook","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":null,"status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/waveform80.png","metadata":{"files":{"readme":"README.rst","changelog":null,"contributing":null,"funding":null,"license":null,"code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2021-12-08T22:52:06.000Z","updated_at":"2022-01-10T16:42:29.000Z","dependencies_parsed_at":null,"dependency_job_id":"40ed10d5-c838-4ea4-a912-4fd01b2823b4","html_url":"https://github.com/waveform80/compression","commit_stats":null,"previous_names":[],"tags_count":0,"template":false,"template_full_name":null,"purl":"pkg:github/waveform80/compression","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/waveform80%2Fcompression","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/waveform80%2Fcompression/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/waveform80%2Fcompression/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/waveform80%2Fcompression/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/waveform80","downlo
ad_url":"https://codeload.github.com/waveform80/compression/tar.gz/refs/heads/master","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/waveform80%2Fcompression/sbom","scorecard":null,"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":286080680,"owners_count":34714992,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2026-05-26T15:22:16.424Z","status":"online","status_checked_at":"2026-06-24T02:00:07.484Z","response_time":106,"last_error":null,"robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":true,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":[],"created_at":"2024-10-11T01:41:01.115Z","updated_at":"2026-06-24T02:31:35.388Z","avatar_url":"https://github.com/waveform80.png","language":"Jupyter Notebook","funding_links":[],"categories":[],"sub_categories":[],"readme":"====================\nCompression Analysis\n====================\n\nAn analysis of various compressors with a variety of options across several\narchitectures and machine sizes.\n\n\nRequirements\n============\n\nThe following should be sufficient to install the pre-requisites for reading\n(and playing with) the analysis::\n\n    $ sudo apt install python3-pip python3-matplotlib python3-docutils jupyter-notebook\n    $ pip3 install --user ipympl\n    $ jupyter notebook\n\nIn the browser window that opens, select ``analysis.ipynb`` and then select\n\"Cell\" and \"Run All\" from the menu. 
I'd recommend skipping over the code\nsections unless you're particularly interested in the queries themselves; the\nprose and the results are the important bits.\n\n\nData Gathering\n==============\n\nIf you wish to gather additional data for more platforms, you will need the\nfollowing packages installed:\n\n* python3\n* lz4\n* xz-utils\n* gzip\n* pigz\n* zstd\n* lzip\n* plzip\n* lbzip2\n\nWe are particularly interested in the compression of an initramfs CPIO archive:\nthe compression ratio achieved, the time taken, and the maximum resident memory\nused. The current default compression scheme in Ubuntu is zstd with -19, which\nis not only extremely slow (even on large machines like an AMD Ryzen) but also\nuses enough memory to cause OOM crashes on smaller machines (e.g. a Pi Zero 2\nor 3A+, which only have 512MB of RAM).\n\nThe ``gather.py`` script was used to measure the aforementioned parameters. The\ntypical method of execution (on a fully updated Jammy image) was to extract the\ncurrent ``initrd.img`` archive from the boot partition, and run the\n``gather.py`` script with a suitable machine label.\n\nExtracting the ``initrd.img`` is relatively trivial on *most* platforms, but\non ``amd64`` some care must be taken as there's typically an (uncompressed)\nearly initrd with processor microcode at the start. The following method is\nrecommended::\n\n    $ git clone https://github.com/waveform80/compression\n    $ cd compression\n    $ unmkinitramfs /boot/initrd.img-$(uname -r) initrd/\n    $ pushd initrd; find | cpio -o -H newc \u003e ../initrd.cpio; popd\n    $ rm -fr initrd/\n    $ ./gather.py initrd.cpio --machine \"My Machine with 16GB RAM\"\n\nProvide some appropriate description with the ``--machine`` switch. 
Before the\nrun begins, the script also checks that all compressors to be tested are\nexecutable and will prompt you to install any that are missing (you may need to\ninstall ``lz4``, ``lzip``, ``pigz``, and ``plzip`` as they are not currently\nseeded).\n\nThe script is sufficiently intelligent not to re-run tests that already exist\nin the database for the specified machine label. This helps when dealing with\nthe smaller machines, which had a tendency to crash entirely when pushed to\ntheir limits.\n\n\nDatabase Structure\n==================\n\nThe script creates (or updates) the ``compression.db`` SQLite database, which\nhas the following schema:\n\n\ntests\n-----\n\nThis table stores the list of all combinations of compressors,\ncompressor-specific options, and compression levels to test. Example: ``('xz',\n'-e', '-6')``.\n\n+--------------+------+---------------------------------------+\n| Name         | Type | Description                           |\n+==============+======+=======================================+\n| *compressor* | TEXT | The name of the compressor            |\n+--------------+------+---------------------------------------+\n| *options*    | TEXT | The options to execute the compressor |\n|              |      | with (if any)                         |\n+--------------+------+---------------------------------------+\n| *level*      | TEXT | The compression level to use          |\n+--------------+------+---------------------------------------+\n\nViews that derive from this table are **compressors** (which simply lists\ndistinct *compressor* values), and **options** (which lists distinct\n*compressor* and *options* combinations).\n\n\nresults\n-------\n\nThis is the \"main\" table, storing the results of all compression runs. It is\nkeyed by the machine's label, architecture, the compressor being tested, and\nits command line options. 
The non-key attributes track the success of the\noperation(s), the time they took, the maximum resident memory used, and the\ncompression ratio achieved.\n\n+-----------------+--------------+-------------------------------------------+\n| Name            | Type         | Description                               |\n+=================+==============+===========================================+\n| *machine*       | TEXT         | The label provided by ``--machine`` on    |\n|                 |              | the command line                          |\n+-----------------+--------------+-------------------------------------------+\n| *arch*          | TEXT         | The ``dpkg`` architecture of the machine  |\n+-----------------+--------------+-------------------------------------------+\n| *compressor*    | TEXT         | The name of the compressor                |\n+-----------------+--------------+-------------------------------------------+\n| *options*       | TEXT         | The options to execute the compressor     |\n|                 |              | with (if any)                             |\n+-----------------+--------------+-------------------------------------------+\n| *level*         | TEXT         | The compression level to use              |\n+-----------------+--------------+-------------------------------------------+\n| succeeded       | INTEGER      | 1 if the compression run succeeded, and 0 |\n|                 |              | if it failed                              |\n+-----------------+--------------+-------------------------------------------+\n| comp_duration   | NUMERIC(8,2) | The number of seconds compression took    |\n|                 |              | (wall clock time)                         |\n+-----------------+--------------+-------------------------------------------+\n| comp_max_mem    | INTEGER      | The maximum resident memory during        |\n|                 |              | compression, in bytes                     
|\n+-----------------+--------------+-------------------------------------------+\n| decomp_duration | NUMERIC(8,2) | The number of seconds decompression took  |\n|                 |              | (wall clock time)                         |\n+-----------------+--------------+-------------------------------------------+\n| decomp_max_mem  | INTEGER      | The maximum resident memory during        |\n|                 |              | decompression, in bytes                   |\n+-----------------+--------------+-------------------------------------------+\n| input_size      | INTEGER      | The size of the input file provided       |\n+-----------------+--------------+-------------------------------------------+\n| output_size     | INTEGER      | The size of the compressed output         |\n+-----------------+--------------+-------------------------------------------+\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fwaveform80%2Fcompression","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fwaveform80%2Fcompression","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fwaveform80%2Fcompression/lists"}