Ecosyste.ms: Awesome
An open API service indexing awesome lists of open source software.
https://github.com/airsequel/double-x-encoding
Encoding scheme to encode any Unicode string with only [0-9a-zA-Z_]. Similar to URL percent-encoding. Especially useful for GraphQL ID generation.
decoding elm encoding encoding-scheme graphql haskell id javascript typescript
Last synced: 3 months ago
- Host: GitHub
- URL: https://github.com/airsequel/double-x-encoding
- Owner: Airsequel
- License: isc
- Created: 2022-11-18T12:23:54.000Z (about 2 years ago)
- Default Branch: main
- Last Pushed: 2024-04-15T20:32:43.000Z (9 months ago)
- Last Synced: 2024-10-12T10:22:52.422Z (3 months ago)
- Topics: decoding, elm, encoding, encoding-scheme, graphql, haskell, id, javascript, typescript
- Language: Elm
- Homepage: https://buttondown.email/Airsequel/archive/announcing-double-x-encoding-encode-any-utf-8/
- Size: 102 KB
- Stars: 4
- Watchers: 2
- Forks: 0
- Open Issues: 2
Metadata Files:
- Readme: readme.md
- License: license.txt
README
# Double X Encoding
Encoding scheme to encode any Unicode string
with only characters from `[0-9a-zA-Z_]`.
It is similar to URL percent-encoding
and especially useful for GraphQL ID generation.

Constraints for the encoding scheme:

1. Common IDs like `file_format`, `fileFormat`, `FileFormat`,
`FILE_FORMAT`, `__file_format__`, … must not be altered
1. Support all Unicode characters
1. Characters of the ASCII range must lead to shorter encodings
1. Optional support for encoding leading digits (like in `1_file_format`)
to fulfill constraints of some ID schemes (e.g. GraphQL's).

## Examples

Input | Output
------|-------
`camelCaseId` | `camelCaseId`
`snake_case_id` | `snake_case_id`
`__Schema` | `__Schema`
`doxxing` | `doxxing`
`DOXXING` | `DOXXXXXXING`
`id with spaces` | `idXX0withXX0spaces`
`id-with.special$chars!` | `idXXDwithXXEspecialXX4charsXX1`
`id_with_ümläutß` | `id_with_XXaaapmmlXXaaaoeutXXaaanp`
`Emoji: 😅` | `EmojiXXGXX0XXbpgaf`
`Multi Byte Emoji: 👨‍🦲` | `MultiXX0ByteXX0EmojiXXGXX0XXbpegiXXacaanXXbpjlc`
`\u{100000}` | `XXYbaaaaa`
`\u{10ffff}` | `XXYbapppp`

With encoding of leading digit and double underscore activated
(necessary for GraphQL ID generation):

Input | Output
------|-------
`1FileFormat` | `XXZ1FileFormat`
`__index__` | `XXRXXRindexXXRXXR`

## Explanation

The encoding scheme is based on the following rules:
1. All characters in `[0-9A-Za-z_]` except for `XX` are encoded as is
1. `XX` is encoded as `XXXXXX`
1. All other printable characters inside the ASCII range
are encoded as a sequence of 3 characters: `XX[0-9A-W]`
1. All other Unicode code points up to `U+fffff` (e.g. emojis)
are encoded as a sequence of 7 characters:
`XX[a-p]{5}`, where the 5 characters are the hexadecimal representation
with an alternative hex alphabet ranging from
`a` to `p` instead of `0` to `f`.
1. All Unicode code points in the Supplementary Private Use Area-B
(`U+100000` to `U+10ffff`) are encoded as a sequence of 9 characters:
`XXY[a-p]{6}`

If the optional leading digit encoding is enabled,
a leading digit is encoded as `XXZ[0-9]`.

If the optional double underscore encoding is enabled,
double underscores are encoded as `XXRXXR`.

## Installation

- Haskell: [Via Hackage](https://hackage.haskell.org/package/double-x-encoding)
- Other languages: \
The code is not yet available via common package managers.
Please copy the code into your project for the time being.
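
To make the rules from the Explanation section concrete, here is a minimal TypeScript sketch of the encoder. It is an illustration only and not the code from the repository. The names `doubleXEncodeSketch`, `EncodeOptions`, `encodeLeadingDigit`, and `encodeDoubleUnderscore` are hypothetical. The mapping of printable ASCII characters to `0-9A-W` is an assumption inferred from the examples (the 33 non-alphanumeric printable ASCII characters in code point order). Scanning `XX` and `__` left to right in pairs is also an assumption, since the rules do not say how runs of three or more are handled.

```typescript
// Hypothetical sketch of the rules above; not the published implementation.

// Printable ASCII characters outside [0-9A-Za-z], in code point order.
// ASSUMPTION: the n-th character maps to the n-th symbol of 0-9A-W.
// This agrees with the README examples (" " -> 0, "!" -> 1, "$" -> 4,
// "-" -> D, "." -> E, ":" -> G, "_" -> R).
const SPECIALS = " !\"#$%&'()*+,-./:;<=>?@[\\]^_`{|}~";
const SYMBOLS = "0123456789ABCDEFGHIJKLMNOPQRSTUVW";

// Alternative hex alphabet: the digits 0..f are written as a..p.
const HEX = "abcdefghijklmnop";

interface EncodeOptions {
  encodeLeadingDigit?: boolean;
  encodeDoubleUnderscore?: boolean;
}

// Write `codePoint` as `width` hex digits in the a..p alphabet.
function hexWord(codePoint: number, width: number): string {
  let out = "";
  for (let shift = (width - 1) * 4; shift >= 0; shift -= 4) {
    out += HEX.charAt((codePoint >> shift) & 0xf);
  }
  return out;
}

function doubleXEncodeSketch(input: string, options: EncodeOptions = {}): string {
  let out = "";
  for (let i = 0; i < input.length; i++) {
    const ch = input.charAt(i);
    const next = input.charAt(i + 1);
    if (ch === "X" && next === "X") {
      out += "XXXXXX"; // rule 2: escape a literal "XX"
      i++;
    } else if (options.encodeDoubleUnderscore && ch === "_" && next === "_") {
      out += "XXRXXR"; // optional: each underscore of the pair becomes XXR
      i++;
    } else if (options.encodeLeadingDigit && i === 0 && /[0-9]/.test(ch)) {
      out += "XXZ" + ch; // optional: mark a leading digit
    } else if (/[0-9A-Za-z_]/.test(ch)) {
      out += ch; // rule 1: keep as is
    } else if (SPECIALS.indexOf(ch) >= 0) {
      out += "XX" + SYMBOLS.charAt(SPECIALS.indexOf(ch)); // rule 3
    } else {
      // Rules 4 and 5: first combine a UTF-16 surrogate pair, if present,
      // into a single code point.
      let codePoint = input.charCodeAt(i);
      const low = input.charCodeAt(i + 1);
      const isHigh = codePoint >= 0xd800 && codePoint <= 0xdbff;
      if (isHigh && low >= 0xdc00 && low <= 0xdfff) {
        codePoint = 0x10000 + ((codePoint - 0xd800) << 10) + (low - 0xdc00);
        i++;
      }
      out += codePoint <= 0xfffff
        ? "XX" + hexWord(codePoint, 5)
        : "XXY" + hexWord(codePoint, 6);
    }
  }
  return out;
}
```

For example, `doubleXEncodeSketch("id with spaces")` yields `idXX0withXX0spaces`, and `doubleXEncodeSketch("1FileFormat", { encodeLeadingDigit: true })` yields `XXZ1FileFormat`, matching the tables in the Examples section.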