{"id":27774309,"url":"https://github.com/cnlohr/vpxcoding","last_synced_at":"2025-08-16T16:09:42.295Z","repository":{"id":269807110,"uuid":"907588806","full_name":"cnlohr/vpxcoding","owner":"cnlohr","description":"Zero-dependency single-file C header for VPX coding, a form of Arithmetic coding.","archived":false,"fork":false,"pushed_at":"2025-08-12T07:38:05.000Z","size":83,"stargazers_count":18,"open_issues_count":1,"forks_count":0,"subscribers_count":1,"default_branch":"master","last_synced_at":"2025-08-12T09:27:04.096Z","etag":null,"topics":["compression","compression-algorithm","compression-implementations","compression-library"],"latest_commit_sha":null,"homepage":"","language":"C","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"bsd-3-clause","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/cnlohr.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null,"zenodo":null}},"created_at":"2024-12-24T00:05:53.000Z","updated_at":"2025-08-12T07:38:08.000Z","dependencies_parsed_at":"2025-04-30T02:18:23.003Z","dependency_job_id":"39285fe6-3db0-4ce3-92ed-933009bcefd2","html_url":"https://github.com/cnlohr/vpxcoding","commit_stats":null,"previous_names":["cnlohr/vpxcoding"],"tags_count":0,"template":false,"template_full_name":null,"purl":"pkg:github/cnlohr/vpxcoding","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/cnlohr%2Fvpxcoding","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/cnlohr%2Fvpxcoding/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/cnlohr%2Fvpxcoding/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitH
ub/repositories/cnlohr%2Fvpxcoding/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/cnlohr","download_url":"https://codeload.github.com/cnlohr/vpxcoding/tar.gz/refs/heads/master","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/cnlohr%2Fvpxcoding/sbom","scorecard":null,"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":270735299,"owners_count":24636338,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","status":"online","status_checked_at":"2025-08-16T02:00:11.002Z","response_time":91,"last_error":null,"robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":true,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["compression","compression-algorithm","compression-implementations","compression-library"],"created_at":"2025-04-30T02:18:17.403Z","updated_at":"2025-08-16T16:09:42.283Z","avatar_url":"https://github.com/cnlohr.png","language":"C","funding_links":[],"categories":[],"sub_categories":[],"readme":"# vpxcoding single-file-header C library\n\n**WIP Note** - This offshoot has not been battle-hardened, and is subject to change.  Also, hopefully in time there will be more complete, practical examples.\n\nSingle file header form of the range coder from [libvpx](https://github.com/webmproject/libvpx) (From the video codec VP8/VP9) as a general purpose compression/decompression of bitstreams algorithm.  
[Range Coding](https://en.wikipedia.org/wiki/Range_coding) is a type of [Arithmetic Coding](https://en.wikipedia.org/wiki/Arithmetic_coding), able to offer even better compression than [Huffman Coding](https://en.wikipedia.org/wiki/Huffman_coding), which is provably optimal only among codes that spend a whole number of bits per symbol, because it can represent symbols using fractional numbers of bits.\n\nThe idea of this coding is, given:\n\n1. A bitstream\n2. Knowledge about how likely the next bit is to be a 0 or 1 (written as a probability from 0..255)\n\nYou can optimally code an output bitstream, with compression better than Huffman trees, by using arithmetic coding.  Please note this is **not** a replacement for something like lz77, zstd, zlib, etc.  But it **is** a replacement for Huffman coding.  This ONLY covers optimal symbol expression.  If there is data similarity, that must be compressed by another algorithm.  In general, you will want to get rid of whatever redundancy you can before applying this compression technique, i.e. you can't just use this to compress text.  If you are looking for something for that, you may want to consider my [heatshrink single-file-header](https://github.com/cnlohr/heatshrink-sfh).\n\nIt's also reasonably fast. Not great, but not bad.\n\n```\nInput  Len: 16777216 bytes\nOutput Len: 14104056 bytes\nRelative Size: 84.07 %\nMatching 16777216 bytes\nEncode Time:  375.116ms (42.653 MBytes/s)\nDecode Time: 537.677ms (29.758 MBytes/s)\n```\n(on an AMD Ryzen 7 5800X, GCC 11.4.0, -O2) \n\nAlso, the code is very small, about 768 bytes each for reading and writing when compiled. 
(below, using -Os) x64.\n```\n.rodata\t0100 (256 bytes)  vpx_norm           // Table used for both encode and decode\n\n.text\t003f (63 bytes)   vpx_start_encode\n.text\t00e4 (228 bytes)  vpx_write\n.text\t005e (94 bytes)   vpx_stop_encode\n\n.text\t0073 (115 bytes)  vpx_read\n.text\t00fa (250 bytes)  vpx_reader_fill\n.text\t003f (35 bytes)   vpx_reader_find_end\n.text\t0066 (102 bytes)  vpx_reader_init\n.text\t0016 (22 bytes)   vpx_reader_has_error\n```\n\nIf you are on a platform that supports `__builtin_clz`, then you may want to define `VPXCODING_NOTABLE`, as that will replace the table lookup with a `clz` and `andi` operation, which may be faster and use less cache/RAM.  If you are on a RAM-constrained system, you may want to do this as well, but see the note in the header file about the manually unwound log2.\n\nIn my tests, depending on the application, this seems to be able to save between 1% and 5% over Huffman trees.  But, notably, there are situations where you can use this to much greater effect and simplicity than Huffman trees (but not all situations).\n\n## Example\n\nIt's very simple: if you have a bitstream you want to encode, you can write something like:\n\n```c\n#include \u003cstdint.h\u003e // uint8_t\n#include \u003cstdlib.h\u003e // malloc\n#define VPXCODING_WRITER\n#include \"vpxcoding.h\"\n\nint main()\n{\n\tvpx_writer w;\n\tuint8_t * bufferO = malloc(2048);\n\tvpx_start_encode( \u0026w, bufferO, 2048);\n\tvpx_write(\u0026w, 1, 10 );\n\tvpx_write(\u0026w, 0, 250);\n\tvpx_write(\u0026w, 1, 128);\n\tvpx_stop_encode(\u0026w);\n\n\t// w.pos contains the # of bytes written into bufferO.\n}\n```\n\nOr to read, \n```c\n#define VPXCODING_READER\n#include \"vpxcoding.h\"\n\nint main()\n{\n\t// Assume a uint8_t* bufferI, and its length, bufferIlen\n\tvpx_reader reader;\n\tint ret = vpx_reader_init(\u0026reader, bufferI, bufferIlen, 0, 0 );\n\n\tint first =  vpx_read(\u0026reader, 10);  // 1\n\tint second = vpx_read(\u0026reader, 250); // 0\n\tint third =  vpx_read(\u0026reader, 128); // 1\n}\n```\n\n\nIt appears the optimal bit selection 
is:\n\n```c\n\t// Higher numbers mean 0 is more likely, 0.5 = equally possible 1 and 0.\n\tfloat probability01 = (chance of next bit being 0);\n\tint probability = probability01 * 257 - 0.5;\n\n\t// Must bound.  Probability is only a uint8_t\n\tif( probability \u003e 255 ) probability = 255;\n\tif( probability \u003c 0 ) probability = 0;\n```\n\nNOTE: This was found empirically.  It may not be correct or as-designed.\n\n\n## Overall Properties\n\n![Optimal Compression Ratio](https://github.com/user-attachments/assets/02b9d48f-497c-4633-87b8-42a0e345aeaa)\n\n![Overall](https://github.com/user-attachments/assets/55d98d1d-9fc9-4bb2-a436-16dd0fbc603d)\n\n![Edges](https://github.com/user-attachments/assets/c18f296a-d2af-4d7d-84a3-ef145f01a66c)\n\n![Optimal](https://github.com/user-attachments/assets/d2315457-68a6-460e-aaa2-73ba25c0b0aa)\n\n\n## Special Thanks\n\n\nWhoever the original author actually was.  At the very least thank you for making sure it's properly licensed.\n\n[@danielrh](https://github.com/danielrh) for alerting me to this, and helping me understand its application outside of libvpx.\n * https://github.com/danielrh/arithmetic_coding_tutorial \n * https://github.com/danielrh/losslessh264\n\n * This video is a GREAT introduction to Huffman, Arithmetic Coding, and ANS [Better than Huffman](https://www.youtube.com/watch?v=RFWJM8JMXBs)\n\nThese YouTube videos were really good, but don't explain this specific implementation.\n * [(IC 5.1) Arithmetic coding - introduction](https://www.youtube.com/watch?v=ouYV3rBtrTI)\n * [(IC 5.2) Arithmetic coding - Example #1](https://www.youtube.com/watch?v=7vfqhoJVwuc)\n * [(IC 5.3) Arithmetic coding - Example #2](https://www.youtube.com/watch?v=CXCWQy9N2ag)\n * [(IC 5.4) Why the interval needs to be completely contained](https://www.youtube.com/watch?v=jHS8-rmEo5k)\n * [(IC 5.5) Rescaling operations for arithmetic coding](https://www.youtube.com/watch?v=t8_198HHSfI)\n * This continues on for 13? 
episodes?\n\n\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fcnlohr%2Fvpxcoding","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fcnlohr%2Fvpxcoding","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fcnlohr%2Fvpxcoding/lists"}