# Zanaptak.BinaryToTextEncoding

[![GitHub](https://img.shields.io/badge/-github-gray?logo=github)](https://github.com/zanaptak/BinaryToTextEncoding) [![NuGet](https://img.shields.io/nuget/v/Zanaptak.BinaryToTextEncoding?logo=nuget)](https://www.nuget.org/packages/Zanaptak.BinaryToTextEncoding)

A binary-to-text encoder/decoder library for [.NET](https://dotnet.microsoft.com/) and [Fable](https://fable.io/). Provides base 16, base 32, base 46, base 64, and base 91 codecs. Supports custom character sets.

## Output example

Example of a random 16-byte array (same size as a GUID) encoded in each base:

- Base 16: `3A319D0D6BA340E8CFFA6E8F65236B71`
- Base 32: `HIYZ2DLLUNAORT72N2HWKI3LOE`
- Base 46: `G7YXHjqTF4THH7KYYxCBr4sM`
- Base 64: `OjGdDWujQOjP+m6PZSNrcQ`
- Base 91: `7M515sme(-[9YfN?/LIf`

## Encoded bits per character

The base values in this library have been chosen because they can encode an integral number of bits as either 1 or 2 characters, making the conversion relatively efficient since groups of bits can be directly converted using lookup arrays.

- Base 16: 4 bits per character
- Base 32: 5 bits per character
- Base 46: 5.5 bits per character (11 bits per 2 characters)
- Base 64: 6 bits per character
- Base 91: 6.5 bits per character (13 bits per 2 characters)
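
The figures above follow from simple arithmetic: a group of characters can carry `n` whole bits when 2^n does not exceed the number of distinct character combinations. The following is a minimal sketch of that check. `GroupBits` and `BitsPerGroup` are hypothetical names used only for illustration and are not part of this library.

```cs
public static class GroupBits
{
    // Returns the largest n such that 2^n <= radix^chars, i.e. the number of
    // whole bits that a group of `chars` characters drawn from a
    // `radix`-symbol character set can represent.
    public static int BitsPerGroup(int radix, int chars)
    {
        long capacity = 1;
        for (int i = 0; i < chars; i++) capacity *= radix;

        int bits = 0;
        while ((1L << (bits + 1)) <= capacity) bits++;
        return bits;
    }
}
```

For example, `BitsPerGroup(46, 1)` is 5 but `BitsPerGroup(46, 2)` is 11 (46 × 46 = 2116 ≥ 2048), which is where the 5.5 bits per character comes from. Likewise `BitsPerGroup(91, 2)` is 13 (91 × 91 = 8281 ≥ 8192).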

## Usage

Add the [NuGet package](https://www.nuget.org/packages/Zanaptak.BinaryToTextEncoding) to your project:
```
dotnet add package Zanaptak.BinaryToTextEncoding
```

### C#
```cs
using Zanaptak.BinaryToTextEncoding;

// Default codec
var originalBytes = new byte[] { 1, 2, 3 };
var encodedString = Base32.Default.Encode(originalBytes);
var decodedBytes = Base32.Default.Decode(encodedString);

// Custom character set
var customBase32 = new Base32("BCDFHJKMNPQRSTXZbcdfhjkmnpqrstxz");
var customOriginalBytes = new byte[] { 4, 5, 6 };
var customEncodedString = customBase32.Encode(customOriginalBytes);
var customDecodedBytes = customBase32.Decode(customEncodedString);

// Wrap output
var randomBytes = new byte[100];
new System.Random(12345).NextBytes(randomBytes);
Console.WriteLine(Base91.Default.Encode(randomBytes, 48));
// Output:
// r]g^oP{ZKd1>}lC{C*P){O96SL8z%0TW,4BfEof}%!b@a#:6
// nN…
```

Base46 | Description | Characters
:--- | :--- | :---
… | … | …<br>`XYZbcdfghjkmnpqrstvwxyz`
LettersCharacterSet | Excludes numbers and some confusable letters, ASCII-sortable | `ABCDEFGHJKMNPQRSTUVWXYZ`<br>`abcdefghjkmnpqrstuvwxyz`

Base64 | Description | Characters
:--- | :--- | :---
StandardCharacterSet | (Default) RFC 4648 section 4 | `ABCDEFGHIJKLMNOPQRSTUVWXYZabcdef`<br>`ghijklmnopqrstuvwxyz0123456789+/`
UrlSafeCharacterSet | RFC 4648 section 5 | `ABCDEFGHIJKLMNOPQRSTUVWXYZabcdef`<br>`ghijklmnopqrstuvwxyz0123456789-_`
UnixCryptCharacterSet | Unix crypt password hashes, ASCII-sortable | `./0123456789ABCDEFGHIJKLMNOPQRST`<br>`UVWXYZabcdefghijklmnopqrstuvwxyz`

Base91 | Description | Characters
:--- | :--- | :---
SortableQuotableCharacterSet | (Default) Excludes `"` `'` `\` characters, ASCII-sortable | ```!#$%&()*+,-./0123456789:;<=>?@A```<br>```BCDEFGHIJKLMNOPQRSTUVWXYZ[]^_`a```<br>```bcdefghijklmnopqrstuvwxyz{\|}~```

Base91Legacy | Description | Characters
:--- | :--- | :---
LegacyCharacterSet | (Default) Original 'basE91' character set | ```ABCDEFGHIJKLMNOPQRSTUVWXYZabcdef```<br>```ghijklmnopqrstuvwxyz0123456789!#```<br>```$%&()*+,./:;<=>?@[]^_`{\|}~"```

## Legacy 'basE91' compatibility

This library provides two base 91 implementations: `Base91` and `Base91Legacy`. They are not compatible; the encoded output of one cannot be decoded by the other.

The main `Base91` algorithm works like the other `BaseXX` algorithms in the library. It uses constant-width encoding (each 2-character pair encodes exactly 13 bits) in big-endian order (most-significant character first, representing the most-significant bits of the most-significant byte). The default character set is in ASCII order to preserve the sort order of the input, and excludes the characters `"`, `'`, and `\` so that encoded strings are easier to quote in programming languages.
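
The constant-width scheme described above can be sketched roughly as follows. This is an illustration, not the library's implementation: `Base91Sketch` is a hypothetical name, the mapping of a 13-bit group to two characters via division and remainder by 91 is an assumption, and so is the handling of trailing bits (left-aligned, zero-padded, and always written as a full pair). The library itself may treat a short tail differently.

```cs
using System.Text;

public static class Base91Sketch
{
    // Default Base91 character set as listed in the character set table
    // (91 characters in ASCII order).
    const string Alphabet =
        "!#$%&()*+,-./0123456789:;<=>?@A" +
        "BCDEFGHIJKLMNOPQRSTUVWXYZ[]^_`a" +
        "bcdefghijklmnopqrstuvwxyz{|}~";

    public static string Encode(byte[] input)
    {
        var sb = new StringBuilder();
        int acc = 0;      // bit accumulator; newest bits enter on the right
        int accBits = 0;  // number of unconsumed bits currently in acc

        foreach (byte b in input)
        {
            acc = (acc << 8) | b;
            accBits += 8;
            while (accBits >= 13)
            {
                // Take the 13 most-significant unconsumed bits (big-endian).
                int group = (acc >> (accBits - 13)) & 0x1FFF;
                accBits -= 13;
                acc &= (1 << accBits) - 1; // drop the bits just consumed
                AppendPair(sb, group);
            }
        }

        if (accBits > 0)
        {
            // Assumption: left-align the tail and zero-pad it to 13 bits.
            AppendPair(sb, (acc << (13 - accBits)) & 0x1FFF);
        }
        return sb.ToString();
    }

    // Assumption: most-significant character first, as quotient then remainder.
    static void AppendPair(StringBuilder sb, int group)
    {
        sb.Append(Alphabet[group / 91]);
        sb.Append(Alphabet[group % 91]);
    }
}
```

Under these assumptions, encoding the 16 bytes from the output example (`3A319D0D6BA340E8CFFA6E8F65236B71`) yields `7M515sme(-[9YfN?/LIf`, which matches the Base 91 string shown there.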

`Base91Legacy` is based on the previously existing [basE91](http://base91.sourceforge.net/) algorithm. It encodes with a variable-width mechanism (some 2-character pairs can encode 14 bits instead of 13) which can result in slightly smaller encoded strings. Each two-character pair in the output is swapped compared to the main algorithm (least-significant char of the pair first), so sorting by string is not meaningful regardless of character set. Its default character set includes the `"` character, making it inconvenient to use in some programming languages and data formats such as JSON.
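
For comparison, the variable-width mechanism of the original basE91 algorithm can be sketched as below. This follows the published basE91 encoding procedure; whether `Base91Legacy` matches it in every detail is an assumption, and `BasE91Sketch` is a hypothetical name used only for illustration.

```cs
using System.Text;

public static class BasE91Sketch
{
    // Original basE91 character set, as listed in the Base91Legacy table.
    const string Alphabet =
        "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdef" +
        "ghijklmnopqrstuvwxyz0123456789!#" +
        "$%&()*+,./:;<=>?@[]^_`{|}~\"";

    public static string Encode(byte[] input)
    {
        var sb = new StringBuilder();
        uint queue = 0; // pending bits; the oldest bits sit in the low positions
        int bits = 0;   // number of pending bits in queue

        foreach (byte b in input)
        {
            queue |= (uint)b << bits;
            bits += 8;
            if (bits > 13)
            {
                uint v = queue & 0x1FFF;      // try a 13-bit group first
                if (v > 88)
                {
                    queue >>= 13;
                    bits -= 13;
                }
                else
                {
                    // A small 13-bit value leaves room for a 14th bit,
                    // because 88 + 8192 is still below 91 * 91.
                    v = queue & 0x3FFF;
                    queue >>= 14;
                    bits -= 14;
                }
                // Least-significant character of the pair is written first.
                sb.Append(Alphabet[(int)(v % 91)]);
                sb.Append(Alphabet[(int)(v / 91)]);
            }
        }

        if (bits > 0)
        {
            sb.Append(Alphabet[(int)(queue % 91)]);
            if (bits > 7 || queue > 90) sb.Append(Alphabet[(int)(queue / 91)]);
        }
        return sb.ToString();
    }
}
```

With this sketch, the ASCII bytes of `test` encode to `fPNKd`. The remainder-then-quotient order in each pair is the swap mentioned above, and it is why sorting the encoded strings does not reflect the order of the inputs.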

## Benchmarks

See the [benchmark project](https://github.com/zanaptak/BinaryToTextEncoding/tree/main/benchmark).