{"id":35269004,"url":"https://github.com/ekcbw/cxx-huffman-simple","last_synced_at":"2026-05-21T02:07:38.162Z","repository":{"id":275920655,"uuid":"899931626","full_name":"ekcbw/cxx-huffman-simple","owner":"ekcbw","description":"A lightweight C++ implementation of Huffman encoding compression. 一个简单的C++霍夫曼编码压缩实现。","archived":false,"fork":false,"pushed_at":"2026-01-03T07:22:52.000Z","size":71,"stargazers_count":1,"open_issues_count":0,"forks_count":0,"subscribers_count":1,"default_branch":"main","last_synced_at":"2026-01-08T10:38:12.950Z","etag":null,"topics":["c-plus-plus","compression","huffman-coding","huffman-compression-algorithm","huffman-compressor","huffman-decoder","huffman-encoder"],"latest_commit_sha":null,"homepage":"","language":"C++","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/ekcbw.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null,"zenodo":null,"notice":null,"maintainers":null,"copyright":null,"agents":null,"dco":null,"cla":null}},"created_at":"2024-12-07T12:11:41.000Z","updated_at":"2026-01-03T07:22:56.000Z","dependencies_parsed_at":null,"dependency_job_id":"c17a6f71-db71-426b-b9f9-1707688ecf91","html_url":"https://github.com/ekcbw/cxx-huffman-simple","commit_stats":null,"previous_names":["qfcy/cxx-huffman-simple","ekcbw/cxx-huffman-simple"],"tags_count":0,"template":false,"template_full_name":null,"purl":"pkg:github/ekcbw/cxx-huffman-simple","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/ekcbw%2Fcxx-huffman-simple","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/ekcbw%2Fcxx-huff
man-simple/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/ekcbw%2Fcxx-huffman-simple/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/ekcbw%2Fcxx-huffman-simple/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/ekcbw","download_url":"https://codeload.github.com/ekcbw/cxx-huffman-simple/tar.gz/refs/heads/main","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/ekcbw%2Fcxx-huffman-simple/sbom","scorecard":null,"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":286080680,"owners_count":33285048,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2026-05-20T15:12:43.734Z","status":"online","status_checked_at":"2026-05-21T02:00:07.181Z","response_time":62,"last_error":null,"robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":true,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["c-plus-plus","compression","huffman-coding","huffman-compression-algorithm","huffman-compressor","huffman-decoder","huffman-encoder"],"created_at":"2025-12-30T11:49:49.071Z","updated_at":"2026-05-21T02:07:38.157Z","avatar_url":"https://github.com/ekcbw.png","language":"C++","funding_links":[],"categories":[],"sub_categories":[],"readme":"**The English introduction is placed below the Chinese version.**\n\n一个简单的C++霍夫曼编码实现，将输入二进制文件数据压缩成`.compressed`格式，并支持解压。  \n\n#### 命令行\n编译：`g++ huffman.cpp -o bin\\huffman -O2 -s -Wall`。  \n压缩：`huffman file.txt`，程序会生成`file.txt.compressed`作为压缩的结果。  \n解压：`huffman 
file.txt.compressed`，程序会生成`file.txt`。如果`file.txt`已存在，则会提示替换。  \n\n#### 文件结构与格式\n\n文件头部分：  \n- 解压后数据大小`n` (8字节无符号整数)  \n- 霍夫曼树位图占用的字节数`m` (2字节`ushort`)  \n- 霍夫曼树位图，用二进制位表示 (`m`字节，0表示内部节点，1表示叶子结点，详见下文)  \n- 霍夫曼树每个节点代表的字符信息，越前面的字符位于树的越浅层，出现的频率越高 （`m`字节，例如：`\u003c空格\u003eeonsa\\nritpd)(;clfug*\u003c\u003e+bz/,_m ...`）\n\n文件体部分：  \n- 霍夫曼编码的原始文件数据 (任意大小，但解压后的数据需要达到n字节，否则被视为文件截断)  \n\n#### 霍夫曼树位图的存储\n\n这个项目优化了霍夫曼树位图的存储空间，也就是遇到叶子结点之后不再往下存储。  \n传统的存储（a[2\\*n+1]和a[2\\*n+2]表示a[n]的子节点）：  \n```\n0\n10\n0011\n```\n如果霍夫曼树有十几层深，霍夫曼树的大小就会达到KB级别，这是不可接受的。  \n由于叶子结点1下面不可能存在新的叶子结点，这里的实现省略了多余的底部内部节点：  \n```\n0\n10\n11\n```\n第三层的11连接到第二层的0。这里省略了两个00，优化了空间。  \n\n\nA lightweight C++ Huffman encoding implementation that compresses input binary file data into the `.compressed` format and supports decompression.\n\n#### Command Line\n- **Compiling**: `g++ huffman.cpp -o huffman -O2 -s -Wall`.  \n- **Compression**: `huffman file.txt`  \n  The program generates `file.txt.compressed` as the compressed result.  \n- **Decompression**: `huffman file.txt.compressed`  \n  The program generates `file.txt`. If `file.txt` already exists, the program will prompt for replacement.\n\n#### File Structure and Format\n\n**File Header**:  \n- Size of decompressed data `n` (8-byte unsigned long long)  \n- Size of the Huffman tree bitmap in bytes `m` (2-byte unsigned short)  \n- Huffman tree bitmap represented in binary bits (`m` bytes, where `0` represents an internal node and `1` represents a leaf node, see details below)  \n- Character information for each node in the Huffman tree. 
Characters appearing earlier in the list are closer to the root of the tree and have higher frequencies (`m` bytes, e.g., `\u003cspace\u003eeonsa\\nritpd)(;clfug*\u003c\u003e+bz/,_m ...`)  \n\n**File Body**:  \n- Huffman-encoded original file data (arbitrary size, but the decompressed data must match `n` bytes; otherwise, the file is considered truncated)\n\n#### Storage of the Huffman Tree Bitmap\n\nThis project optimizes the storage space for the Huffman tree bitmap by storing nothing below a leaf node once one is encountered.  \n\n**Traditional Storage** (where `a[2*n+1]` and `a[2*n+2]` are the child nodes of `a[n]`):  \n```\n0\n10\n0011\n```\nWith this layout, a Huffman tree more than about 12 levels deep needs a bitmap of several kilobytes, which is unacceptable.  \n\n**Optimized Storage**:  \nSince no new leaf nodes can exist below a leaf node (`1`), the code omits the redundant placeholder entries beneath it:  \n```\n0\n10\n11\n```\nIn this example, the third level's `11` connects to the second level's `0`. The two placeholder bits `00` below the second level's leaf `1` are omitted, saving space.  \n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fekcbw%2Fcxx-huffman-simple","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fekcbw%2Fcxx-huffman-simple","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fekcbw%2Fcxx-huffman-simple/lists"}