{"id":16340849,"url":"https://github.com/casey/datacode","last_synced_at":"2025-03-23T00:32:26.339Z","repository":{"id":231030873,"uuid":"780705624","full_name":"casey/datacode","owner":"casey","description":"visually compact binary data","archived":false,"fork":false,"pushed_at":"2024-04-02T04:20:38.000Z","size":10,"stargazers_count":11,"open_issues_count":0,"forks_count":2,"subscribers_count":4,"default_branch":"master","last_synced_at":"2025-03-18T15:49:02.763Z","etag":null,"topics":["binary-encoding","text-encoding","unicode"],"latest_commit_sha":null,"homepage":"","language":"Rust","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"cc0-1.0","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/casey.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null}},"created_at":"2024-04-02T02:15:31.000Z","updated_at":"2024-12-30T22:29:27.000Z","dependencies_parsed_at":"2024-04-02T04:32:16.399Z","dependency_job_id":null,"html_url":"https://github.com/casey/datacode","commit_stats":null,"previous_names":["casey/datacode"],"tags_count":0,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/casey%2Fdatacode","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/casey%2Fdatacode/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/casey%2Fdatacode/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/casey%2Fdatacode/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/casey","download_url":"https://codeload.github.com/casey/datacode/tar.gz/refs/heads/master","host":{"name":"GitHub","url":"htt
ps://github.com","kind":"github","repositories_count":245040235,"owners_count":20551297,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["binary-encoding","text-encoding","unicode"],"created_at":"2024-10-10T23:58:02.748Z","updated_at":"2025-03-23T00:32:25.938Z","avatar_url":"https://github.com/casey.png","language":"Rust","funding_links":[],"categories":[],"sub_categories":[],"readme":"datacode\n========\n\nDatacode is a proposal for a visually compact encoding of binary data in plain\ntext.\n\nIt proposes allocating the 65,536 Unicode plane four code points,\nU+40000–4FFFF, to visual representations of all possible 16-bit values, and\ncode points U+1FF00–1FFFF to visual representations of all possible 8-bit\nvalues.\n\nThese two character ranges would allow visually compact plain-text\nrepresentations of binary data: code points U+40000–4FFFF encode the leading\npairs of bytes, with an optional U+1FF00–1FFFF code point for the last byte if\nthe number of bytes is odd.\n\nmotivation\n----------\n\nText representations of binary data are ubiquitous, with a great variety of\ndifferent representations commonly in use. 
Uses include cryptographic hashes,\npublic keys, web content IDs such as YouTube video IDs, and many more.\n\nExamples of binary-to-text encoding schemes include:\n\n- Hexadecimal, which uses characters in the set `[0-9a-f]`, each encoding four\n  bits.\n\n- Base64, which uses one of several 64-character sets to encode six bits per\n  character. `[0-9a-zA-Z+/]` is a common choice of characters, but many\n  variations exist. Since each character encodes six bits, a padding character,\n  commonly `=`, is sometimes used to indicate that the final bits should be\n  discarded when decoding.\n\n- bech32, primarily used to encode Bitcoin addresses, which implements a BCH\n  code over the characters `[qpzry9x8gf2tvdw0s3jn54khce6mua7l]`, with each\n  character encoding five bits.\n\nHowever, no character set dedicated to representing binary data exists.\n\nproposal\n--------\n\nCode points U+40000–4FFFF are allocated to visual representations of all 65,536\npossible two-byte values, and are called \"paircodes\". Code points\nU+1FF00–1FFFF are allocated to visual representations of all 256 one-byte\nvalues, and are called \"bytecodes\".\n\nAn N-byte sequence can then be represented with N / 2 (rounded down) paircode\ncharacters, followed by a single bytecode if N is odd.\n\nBinary data encoded as datacode uses 75% fewer characters than hexadecimal, and\n62.5% fewer characters than Base64.\n\nA 32-byte hash encoded as hex:\n\n```\n4a5e1e4baab89f3a32518a88c31bc87f618f76673e2cc77ab2127b7afdeda33b\n```\n\nVersus Base64:\n\n```\nSl4eS6q4nzoyUYqIwxvIf2GPdmc+LMd6shJ7ev3tozs=\n```\n\nVersus datacode, using the `❑` character as a placeholder:\n\n```\n❑❑❑❑❑❑❑❑❑❑❑❑❑❑❑❑\n```\n\nrendering\n---------\n\nEach paircode is represented as a four-by-four grid of cells, with an empty\ncell representing the binary digit 0, and a filled cell representing the binary\ndigit 1:\n\n```\n┏━━━━┓\n┃....┃\n┃....┃\n┃....┃\n┃....┃\n┗━━━━┛\n```\n\nEach column of a paircode grid represents 4 bits, with columns arranged left to\nright 
from least significant to most significant, and bits within columns\narranged top to bottom from least significant to most significant.\n\nEach bytecode is represented similarly as a two-by-four grid of cells:\n\n```\n┏━━┓\n┃..┃\n┃..┃\n┃..┃\n┃..┃\n┗━━┛\n```\n\nThe three bytes encoded in hex as `4a5e1e`:\n\n```\n┏━━━━┓┏━━┓\n┃..█.┃┃█.┃\n┃.█.█┃┃.█┃\n┃█.██┃┃.█┃\n┃.█.█┃┃.█┃\n┗━━━━┛┗━━┛\n```\n\nFilled cells are represented as solid squares, and empty cells as small dots.\nThis prevents empty cells from rendering as blanks.\n\nFonts for datacode characters are easy to produce, since all characters can be\ngenerated programmatically. Presumably, not all fonts would include datacode\ncharacters, and rendering would be handled by a specialized fallback font for\nthe code point ranges.\n\nTo allow for easier visual comparison of datacode strings, each column can be\nrendered in one of sixteen colors, or each column pair in one of 256 colors,\ndepending on its value.\n\nIt may be advantageous to instead represent paircodes as two-by-two grids of\nsmall hexadecimal digits, and bytecodes as one-by-two grids of hexadecimal\ndigits, to allow datacode values to be read and transliterated to hexadecimal\nby humans.\n\napplications\n------------\n\nDatacode can be used to compactly represent binary data in applications\nwhich expose binary data in text to the end user. It is not appropriate for use\nin non-user-facing applications, since the actual encoding of datacode as UTF-8\nis larger than the raw binary and at least as large as the equivalent hex\nencoding: each datacode code point occupies four bytes in UTF-8.\n\nremarks\n-------\n\nA substantial drawback of datacode is that it cannot easily be typed by a\nless than highly motivated user. 
However, in nearly all applications using text\nrepresentations of binary data, those representations are not intended to be\ntyped, and are much more likely to simply be present in URLs and other places\nwhere the primary form of interaction is to copy and paste the text.\n\nDatacode's more compact encoding of binary data allows it to be more easily\ncopied and pasted, and fit more compactly within other text.\n\nAdditionally, text interfaces could provide for special handling of datacode\nsequences, for example, making a single click anywhere within the sequence\nselect the entire contiguous sequence of datacode characters.\n\nDatacode could also improve accessibility, by clearly delineating\nhuman-readable text from text representations of binary data.\n\nencoding\n--------\n\nEncoding and decoding are straightforward. A [Rust implementation](src/lib.rs)\nis provided.\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fcasey%2Fdatacode","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fcasey%2Fdatacode","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fcasey%2Fdatacode/lists"}