{"id":22018697,"url":"https://github.com/pasqu4le/huffer","last_synced_at":"2025-09-01T21:39:17.716Z","repository":{"id":110827274,"uuid":"120478649","full_name":"pasqu4le/huffer","owner":"pasqu4le","description":"A Haskell compressor based on canonical Huffman codes","archived":false,"fork":false,"pushed_at":"2018-02-09T13:21:59.000Z","size":1067,"stargazers_count":6,"open_issues_count":0,"forks_count":0,"subscribers_count":1,"default_branch":"master","last_synced_at":"2025-05-07T03:34:29.555Z","etag":null,"topics":["cli-app","compression","haskell","huffman-coding"],"latest_commit_sha":null,"homepage":null,"language":"Haskell","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"bsd-3-clause","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/pasqu4le.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null,"zenodo":null}},"created_at":"2018-02-06T15:26:03.000Z","updated_at":"2023-03-24T03:01:44.000Z","dependencies_parsed_at":null,"dependency_job_id":"44166e37-f1aa-4f50-a8b3-347defd58d1b","html_url":"https://github.com/pasqu4le/huffer","commit_stats":null,"previous_names":[],"tags_count":0,"template":false,"template_full_name":null,"purl":"pkg:github/pasqu4le/huffer","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/pasqu4le%2Fhuffer","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/pasqu4le%2Fhuffer/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/pasqu4le%2Fhuffer/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/pasqu4le%2Fhuffer/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitH
ub/owners/pasqu4le","download_url":"https://codeload.github.com/pasqu4le/huffer/tar.gz/refs/heads/master","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/pasqu4le%2Fhuffer/sbom","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":268655110,"owners_count":24285128,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","status":"online","status_checked_at":"2025-08-04T02:00:09.867Z","response_time":79,"last_error":null,"robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":true,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["cli-app","compression","haskell","huffman-coding"],"created_at":"2024-11-30T05:13:21.803Z","updated_at":"2025-08-04T05:38:02.902Z","avatar_url":"https://github.com/pasqu4le.png","language":"Haskell","readme":"# Huffer\n\nHuffer is a small compressor and decompressor that uses [canonical Huffman codes](https://en.wikipedia.org/wiki/Canonical_Huffman_code).\n\nI wrote this as a \"toy\" project to explore and learn more about the Haskell programming language.\n\nCompiling and running\n------------------------\nYou can use [cabal](https://www.haskell.org/cabal/) to compile and run huffer.\n\nCompile and run with:\n```\n$ cabal run\n```\nJust compile with:\n```\n$ cabal configure\n$ cabal build\n```\nInstall with:\n```\n$ cabal install\n```\nCommand Line Arguments\n----------------------\nYou can see the command line arguments by running `huffer help`, that will tell you:\n\n```\nrun with: huffer action [inputs] (to output)\n\naction can be 'encode', 
'decode' or 'content'\nyou have to specify at least one input file (or folder) to encode\nyou can specify only an input file to decode or list content of\nyou can specify an output file for encoding (or folder for decoding)\n  if you don't, huffer will use 'output.huf' for encoding ('.' for decoding)\n```\nFor example, running `huffer encode movies/ clips/ to vids.huf` compresses every file in the _movies_ and _clips_ directories (including all of their subdirectories) and stores the results in a single file called _vids.huf_.\n\nYou can then run `huffer content vids.huf` to list the files that _vids.huf_ contains, or run `huffer decode vids.huf to media` to decompress every file contained in _vids.huf_ into the _media_ directory.\n\nHow files are compressed\n------------------------\nFor each file to compress, huffer (naively) reads the whole file and counts the frequency of every word, where a word is a single byte.\n\nFor each file it then calculates the Huffman code, makes it canonical, and finally reads the file a second time to compress and write it. Files are handled one after another.\n\nMainly because of this double reading, Huffer is __not very fast__, but because it uses __lazy bytestrings__ it can compress files of any size in almost constant memory.\n\nHow compressed files are stored\n-------------------------------\nHuffer stores all the files it compresses in a single archive file that starts with a __header__, structured like this:\n\n| Bytes         | Description                               |\n|:-------------:| ----------------------------------------- |\n| 1             | Body type: defines how the body is stored |\n| 2             | Number of entries: __n__                  |\n\nFollowed by __n__ entries (one per file), each structured like this:\n\n| Bytes         | Description                                               |\n|:-------------:| --------------------------------------------------------- |\n| 4             | The size (in bytes) the file had 
originally               |\n| 4             | The size (in bytes) the file has after compression: __m__ |\n| 2             | The length of the file path: __l__                        |\n| __l__         | The string containing the path of the file                |\n\nThe header is followed by the __body__, which contains __m__ bytes of compressed data for each of the __n__ files listed in the header, in the same order.\n\n\u003e NOTE: At the moment the Body Type byte is always set to 0 and is otherwise ignored, because there is only one body implementation (others may follow if and when I keep playing with this).\n\nThe body of each compressed file starts with 256 bytes, one for each possible word in alphabetical order, each holding the __number of bits__ of that word's code (see the [wiki page](https://en.wikipedia.org/wiki/Canonical_Huffman_code#Encoding_the_codebook) for a better explanation), followed by the actual compressed data.\n","funding_links":[],"categories":[],"sub_categories":[],"project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fpasqu4le%2Fhuffer","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fpasqu4le%2Fhuffer","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fpasqu4le%2Fhuffer/lists"}