https://github.com/hkupty/debris
Deterministic Binary Serialization
- Host: GitHub
- URL: https://github.com/hkupty/debris
- Owner: hkupty
- License: apache-2.0
- Created: 2020-03-24T00:05:53.000Z (about 5 years ago)
- Default Branch: master
- Last Pushed: 2021-11-11T10:50:45.000Z (over 3 years ago)
- Last Synced: 2025-04-09T22:11:46.760Z (about 2 months ago)
- Language: Clojure
- Size: 10.7 KB
- Stars: 10
- Watchers: 2
- Forks: 0
- Open Issues: 0
Metadata Files:
- Readme: README.md
- Funding: .github/FUNDING.yml
- License: LICENSE
README
# debris
---
Debris is an alpha-state serialization format that does *not* focus on speed or size.
Its focus is on being deterministic, simple and predictable.
It should write binary data in a way that:
- No setup is required;
- Any implementation generates the same result;
- Unordered collections are serialized in a sorted manner;
- Data is never lost, but the data types can be widened on de-serialization.

In addition to the usual scenarios for serializing data, these properties allow any data serialized in debris to:
- Produce reliable hashes or signatures;
- Diff against other serialized chunks for updates.

## Layout
Every data type in debris is serialized according to the same schema:
```
| 1 byte || 4 bytes || ... |
[type header][size in bytes][payload]
```

This means that:
- The top-level collection can hold up to 4 GiB of data, since the size field is 4 bytes;
- Collections can hold heterogeneous data;
- Strings are serialized in UTF-8;
- Numbers are serialized as their Decimal representation.

The types are:

| Header Prefix | Data type | Notes |
|---|---|---|
| `0x00` | Byte/Bytes | |
| `0x01` | Boolean | |
| `0x02` | Number | Serialized as Decimal |
| `0x03` | Text | Serialized as UTF-8 |
| `0x10` | Unordered Sequence | Sorted before serialized |
| `0x11` | Unordered Map | Sorted by key before serialized |
| `0x20` | Ordered Sequence | |
| `0x21` | Ordered Map | |

To sort `0x10` and `0x11`, the header is compared first and then the payload:
```clojure
;; Using commas to improve readability
(debris/serialize {true 10 false 20})
;; 1: [true 10], 2: [false 20]
;; 1: [[0x01,0x01,0x01] [0x02,0x02,0x31,0x30]] 2: [[0x01,0x01,0x00] [0x02,0x02,0x32,0x30]]
;;
;; Final serialization:
;; |---map---|-------------false 20-------------|-------------true 10--------------|
;; [0x11,0x0E,0x01,0x01,0x00,0x02,0x02,0x32,0x30,0x01,0x01,0x01,0x02,0x02,0x31,0x30]
```
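
The layout and sorting rules can be sketched in a few lines of Clojure. This is a minimal illustration rather than the library's implementation: `size-bytes`, `frame`, `utf8`, `compare-bytes`, `compare-frames` and `encode` are hypothetical names and not part of the debris API. The sketch writes the size as the 4 bytes stated in the layout (the example above shows each size as a single byte), and it assumes big-endian byte order and lexicographic payload comparison, neither of which is specified here. Only booleans, integers, text and unordered maps are covered.

```clojure
;; Hypothetical helpers: a sketch of the layout, not the debris API.

(defn size-bytes
  "Encodes n as 4 bytes. Big-endian order is an assumption."
  [n]
  (mapv #(bit-and (bit-shift-right n %) 0xFF) [24 16 8 0]))

(defn frame
  "Builds [type header][size in bytes][payload] as a vector of ints 0-255."
  [header payload]
  (into [header] (concat (size-bytes (count payload)) payload)))

(defn utf8
  "Returns the UTF-8 bytes of s as unsigned ints."
  [s]
  (mapv #(bit-and % 0xFF) (.getBytes ^String s "UTF-8")))

(defn compare-bytes
  "Lexicographic comparison (an assumption); a shorter prefix sorts first."
  [a b]
  (or (first (remove zero? (map compare a b)))
      (compare (count a) (count b))))

(defn compare-frames
  "Compares the type header first, then the payload, skipping the size field."
  [a b]
  (let [c (compare (first a) (first b))]
    (if (zero? c)
      (compare-bytes (subvec a 5) (subvec b 5))
      c)))

(defn encode
  "Serializes a small subset of values following the type table above."
  [x]
  (cond
    (boolean? x) (frame 0x01 [(if x 1 0)])
    (integer? x) (frame 0x02 (utf8 (str x)))   ; decimal digits as text
    (string? x)  (frame 0x03 (utf8 x))
    (map? x)     (frame 0x11
                        (->> x
                             (map (fn [[k v]] [(encode k) (encode v)]))
                             (sort-by first compare-frames) ; sort by key
                             (mapcat #(apply concat %))
                             vec))
    :else (throw (ex-info "Type not covered by this sketch" {:value x}))))

(encode {true 10 false 20})
;; => [17 0 0 0 26  1 0 0 0 1 0  2 0 0 0 2 50 48  1 0 0 0 1 1  2 0 0 0 2 49 48]
```

Because entries are sorted on the encoded key, `(encode {false 20 true 10})` yields the same bytes, which is what makes hashes and signatures over the output reproducible.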