{"id":21229579,"url":"https://github.com/kosarev/escapeless","last_synced_at":"2025-10-20T01:04:51.216Z","repository":{"id":57426916,"uuid":"189721084","full_name":"kosarev/escapeless","owner":"kosarev","description":"Efficient binary encoding for large alphabets","archived":false,"fork":false,"pushed_at":"2020-10-17T09:01:55.000Z","size":28,"stargazers_count":25,"open_issues_count":5,"forks_count":3,"subscribers_count":3,"default_branch":"master","last_synced_at":"2024-10-16T09:13:02.548Z","etag":null,"topics":["ascii85","base122","base16","base32","base64","binary-encoding","uuencoding","yenc","z85"],"latest_commit_sha":null,"homepage":"","language":"C","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/kosarev.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null}},"created_at":"2019-06-01T10:43:39.000Z","updated_at":"2024-09-30T11:52:58.000Z","dependencies_parsed_at":"2022-09-19T06:00:42.860Z","dependency_job_id":null,"html_url":"https://github.com/kosarev/escapeless","commit_stats":null,"previous_names":[],"tags_count":0,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/kosarev%2Fescapeless","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/kosarev%2Fescapeless/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/kosarev%2Fescapeless/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/kosarev%2Fescapeless/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/kosarev","download_url":"https://codeload.github.com/kosarev/escapeless/tar.gz/refs/heads/master","host":{"name":
"GitHub","url":"https://github.com","kind":"github","repositories_count":225646198,"owners_count":17501837,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["ascii85","base122","base16","base32","base64","binary-encoding","uuencoding","yenc","z85"],"created_at":"2024-11-20T23:28:27.251Z","updated_at":"2025-10-20T01:04:51.127Z","avatar_url":"https://github.com/kosarev.png","language":"C","funding_links":[],"categories":[],"sub_categories":[],"readme":"# escapeless\nEfficient binary encoding for large alphabets.\n\n[![Build Status](https://travis-ci.org/kosarev/escapeless.svg?branch=master)](https://travis-ci.org/kosarev/escapeless)\n\n### Features\n\n* Low fixed-size overhead.\n* Compression-friendly output.\n* Arbitrary alphabets.\n* Fast and simple algorithm.\n* Does not involve heavy-weight arithmetic.\n\n\n### Comparison chart\n\n| Encoding         | Alphabet Size | Overhead |\n| ---------------- | ------------- | -------- |\n| escapeless255    | 255           |     0.4% |\n| escapeless254    | 254           |     0.8% |\n| escapeless253    | 253           |     1.2% |\n| [yEnc](http://www.yenc.org/yenc-draft.1.3.txt)          | 252 | 1.6%*, 0-100% |\n| escapeless252    | 252           |     1.6% |\n| escapeless251    | 251           |     2.0% |\n| escapeless250    | 250           |     2.4% |\n| [B-News](http://b-news.sourceforge.net/)                | 224 | 2.5% |\n| escapeless240    | 240           |     6.7% |\n| escapeless230    | 230           |    11.4% |\n| escapeless225    | 225           |    13.8% |\n| [Base122](http://blog.kevinalbs.com/base122)            | 
122 | 14.3% |\n| [basE91](http://base91.sourceforge.net/)                |  91 | 22%*, 14-23% |\n| [Base94](https://gist.github.com/iso2022jp/4054241)     |  94 | 22.2% |\n| [Ascii85](https://en.wikipedia.org/wiki/Ascii85)        |  85 | 25.0% |\n| [Z85](https://rfc.zeromq.org/spec:32/Z85/)              |  85 | 25.0% |\n| [Base64](https://en.wikipedia.org/wiki/Base64)          |  64 | 33.3% |\n| [uuencode](https://en.wikipedia.org/wiki/Uuencoding)    |  64 | 33.3% |\n| [Base58](https://en.wikipedia.org/wiki/Base58)          |  58 | 36.6% |\n| [Base36 / 64-bit](https://en.wikipedia.org/wiki/Base36) |  36 | 59.2%*, 0-62.5% |\n| [Base32](https://en.wikipedia.org/wiki/Base32)          |  32 | 60.0% |\n| [Base36 / 32-bit](https://en.wikipedia.org/wiki/Base36) |  36 | 62.0%*, 0-75% |\n| [Base16](https://en.wikipedia.org/wiki/Base16)          |  16 | 100.0% |\n\n(*) On uniform distribution of input octets.\n\n\n### Building and testing\n\n```shell\n$ git clone git@github.com:kosarev/escapeless.git\n$ cd c\n$ make\n$ make test\n```\n\n\n### Basic idea\n\nGiven a source alphabet of size S and a target alphabet of size\nN \u003c S, break the sequence of input characters into blocks so that\nthe number of characters in each block does not exceed N − 1.\n\nSince a block can contain at most N − 1 different characters and\nthe target alphabet contains N characters, it is known that all\nthose used characters can be mapped to the target alphabet and at\nleast one extra character of the target alphabet will remain\nunmapped.\nFor example:\n```\n A B C D E F G H I J K L    12  Characters of the source alphabet (S)\n A   C D E     H I   K L     8  Characters of the target alphabet (N)\n   x       x x     x         4  Characters missing in the target alphabet (takeouts)\n   | | | |     | | |         7  Characters used in the block\n .         . .       . .     
5  Characters not used in the block\n```\n\nHere, one possible mapping is:\n```\n B -\u003e A\n J -\u003e K\n```\nwith `L` left unmapped and all other characters of the target\nalphabet mapped to themselves.\n\nThe purpose of that unmapped character is to give takeouts that\ndo not occur in the block, like `F` and `G` in the example, a\ncharacter of the target alphabet to map to that does not\nrepresent any character of the source alphabet in that block.\nTaking that into account, here's how a complete mapping would\nlook:\n```\n B -\u003e A\n F -\u003e L\n G -\u003e L\n J -\u003e K\n```\n\nOnce the mapping is determined, we can output the encoded block\nwith takeout characters in it replaced with members of the target\nalphabet.\nTo let a decoder know the mapping, we also prepend each encoded\nblock with the sequence of characters the takeouts are mapped to.\nThe decoder is assumed to be given the same set of takeout\ncharacters, specified in the same order.\n\n\n### Overhead formula\n\nFor a source alphabet of size S, a target alphabet of size N and\na block of N − 1 characters, the size of the encoded block is:\n```\n encoded_block_size = takeouts_map_size + block_size =\n                      (S - N) + (N - 1) =\n                      S - 1\n```\n\nThe overhead is thus:\n```\n overhead = (encoded_block_size - block_size) / block_size =\n            ((S - 1) - (N - 1)) / (N - 1) =\n            (S - 1 - N + 1) / (N - 1) =\n            (S - N) / (N - 1)\n```\n\nFor example, with S = 256 and N = 255 the overhead is 1 / 254,\nor about 0.4%.\n\n\n### Encoding algorithm\n\n1. Break the input message into blocks so that no block contains\n   more than N − 1 characters, where N is the size of the target\n   alphabet.\n   Process every block separately as specified below.\n\n2. Map every takeout character to a character of the target\n   alphabet that is not used in the block.\n   Takeouts that occur in the block shall map to distinct\n   characters.\n   All takeouts not used in the block shall map to the same\n   character, different from those.\n\n3. 
Replace takeout characters of the block using that map.\n\n4. Output the map followed by the rewritten block.\n\n\n### Decoding algorithm\n\n1. Read the takeouts map and the encoded block.\n\n2. Using the map, restore the takeouts in the block.\n\n3. Output the decoded block.\n\n\n### The idea explained in greater detail\n\n[Escapeless, Restartable, Binary Encoding](http://chilliant.blogspot.com/2020/01/escapeless-restartable-binary-encoding.html)\n\nThanks, Ian!\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fkosarev%2Fescapeless","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fkosarev%2Fescapeless","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fkosarev%2Fescapeless/lists"}