Ecosyste.ms: Awesome
An open API service indexing awesome lists of open source software.
https://github.com/gregros/preszr
A lightweight library for encoding complex objects so they can be serialized.
https://github.com/gregros/preszr
json serialization serializer
Last synced: about 2 months ago
JSON representation
A lightweight library for encoding complex objects so they can be serialized.
- Host: GitHub
- URL: https://github.com/gregros/preszr
- Owner: GregRos
- License: mit
- Created: 2021-06-29T13:33:02.000Z (over 3 years ago)
- Default Branch: master
- Last Pushed: 2024-09-04T20:11:08.000Z (4 months ago)
- Last Synced: 2024-11-03T22:31:21.589Z (2 months ago)
- Topics: json, serialization, serializer
- Language: TypeScript
- Homepage:
- Size: 3.23 MB
- Stars: 3
- Watchers: 3
- Forks: 0
- Open Issues: 0
-
Metadata Files:
- Readme: README.md
- License: LICENSE.md
- Support: docs/supported.md
Awesome Lists containing this project
README
# preszr
[![Node.js CI](https://github.com/GregRos/preszr/actions/workflows/main.yaml/badge.svg)](https://github.com/GregRos/preszr/actions/workflows/main.yaml)
[![Coverage Status](https://coveralls.io/repos/github/GregRos/preszr/badge.svg?branch=master)](https://coveralls.io/github/GregRos/preszr?branch=master)
[![npm](https://img.shields.io/npm/v/preszr)](https://www.npmjs.com/package/preszr)`preszr` is a schema-less _pre-serialization_ JavaScript library that lets you shove arbitrary JavaScript objects through the pipes.
Here's how you use it:
1. Encode any value with `preszr`. You get a flat, JSON-legal value that's usually an array or sometimes a primitive.
2. You can take it and _serialize_ it with [`JSON.stringify`](https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Global_Objects/JSON/stringify) or anything else that can serialize JS objects, like [BSON](https://www.mongodb.com/basics/bson).
3. You send it via the network or maybe save it to file.
4. At the other end (in time or space), you first _deserialize_ the data using `JSON.parse` or whatever you used.
5. Then you _decode_ it using `preszr`.
6. Now you have your thing back, with all its references and prototypes and everything (if any).`preszr` uses a strict, well-defined, and extensible format that can encode any JavaScript object.
## Features
🔗 Preserves references and prototypes!
🐐 Supports all built-in data types and values as of 2023!1
🛠️ Super easy to encode custom types, with several layers of customization!
🌍 Written in vanilla JS to work in all environments.
## Usage
`preszr` exports three things:
```javascript
import { encode, decode, Preszr } from "preszr"
````encode` will encode your thing into a _preszr message_:
```javascript
const yourThing = {
a: new Uint16Array([1, 2, 3]),
b: undefined,
c: []
}
const encoded = encode({
a: new Uint16Array([1, 2, 3])
})// Serializing the message
const serialized = JSON.stringify(encoded)// Sending it
websocket.send(serialized)
````decode` will do the opposite:
```javascript
websocket.on("message", event => {
const data = event.data// Deserializing
const deserialized = JSON.parse(data)// Decoding
const decoded = decode(deserialized)// Got your thing back!
expect(decoded).toEqual(yourThing)
})
```The default functions will work for most objects - for example:
- `Date`
- `ArrayBuffer`
- `BigInt64Array` (if exists)Or, more generally, any platform-independent, built-in object that's not [explicitly unsupported](docs/supported.md).
What about other object types, though? Let's take a real-world example.
```javascript
class UhhNumber {
constructor(_value) {
this._value = value
}plus(other) {
return new UhhNumber(this._value + other._value)
}valueOf() {
return this._value
}get value() {
return this._Value
}
}
```Say you wanted `preszr` to encode one of those. You just create a new `Preszr` instance and give it a config object like this:
```javascript
// 'new' is actually optional.
const prz = new Preszr({
encodes: [UhhNumber]
})
```And that's it. `myInstance` can now encode an `UhhNumber`! Here is an example:
```javascript
const recoded = prz.decode(prz.encode(new UhhNumber(5)))expect(recoded).toBeInstanceOf(UhhNumber)
expect(recoded.value).toBe(5)
expect(recoded.plus(recoded)).toEqual(new UhhNumber(10))
```The `Preszr` object is immutable and its configuration can't be modified later.
## Installing
```bash
npm install preszr
```Or:
```bash
yarn add preszr
```# Library versioning
`preszr` follows [semver](https://semver.org/), and changes in the format of a _preszr message_ will always increment the major version.
To ensure it doesn't decode data incorrectly, `preszr` injects its major version into non-trivial _preszr messages_. `preszr` will use that version number to determine whether it can decode the message. Right now, `preszr` will error unless that number is the same, but in the future, it might have some fallback.
This is one of the features that allow you to safely write preszr messages to disk.
# Customization
To encode objects, `preszr` uses objects called _encodings_. Here is what they look like:
```javascript
{
// The thing this encoding is for.
// A constructor or prototype.
encodes: UhhNumber// A name. Can usually be inferred.
name: "NumberOrSomething"// A version. Defaults to 1. We'll talk about these later.
version: 1,// Encoding logic.
encodes(/* LATER */) { /* LATER */ },// Decoding logic.
decoder: {
create(/* LATER */) { /* LATER */ },
init(/* LATER */) { /* LATER */ }
}
}
```When encoding, `preszr` will use the encoding of the nearest prototype of an object that it knows, possibly down to `Object.prototype` if it doesn't find anything else. You can't have two encodings for the same prototype unless they're versioned (but we'll talk about versioning later). The same is true for encodings with the same name.
These objects go into the `encodes` property of the `Preszr` configuration object. The order doesn't matter. When you put a constructor in there, like in the Usage section, preszr will generate a complete encoding behind the scenes, inferring or using defaults for everything except the `encodes` property (which is required).
In some cases, like if `preszr` fails to infer the constructor name or there is a collision, you'll need to supply a basic encoding object. It can just include the `name` and `encodes` properties, though:
```typescript
const NamelessClass = (class {});// throws PreszrError(config/spec/proto/no-name):
// Couldn't get the prototype's name. Add a 'name' property
let preszr = Preszr({
encodes: [NamelessClass]
});// Doesn't throw
preszr = Preszr({
encodes: [{
encodes: NamelessClass
name: "NamelessClass"
}]
})
```The default encoding logic is pretty dumb, but it will work for most normal objects. Here is what it does:
- When encoding, it will copy the object key by key1 and recursively encode each value.
- When decoding, it will create a new object with the right prototype and then copy the input key by key1, decoding its values.1 Inherited or non-enumerable keys are not copied. Symbol keys are copied though, if they are enumerable.
Some objects can't be encoded like this. For example, `Map` and `Set` use hidden internal data. Some of your objects might too. They also might depend on variables bound to closures or on the phase of the moon.
For those objects, you'll need to write custom encoding logic. That's the `encodes` function and the `decoder` object. While you can have neither and just use the default behavior if you specify one of them, you have to also specify the other.
## Encoding
Encoding is done by a single function that looks like this:
```javascript
function encodes(uhhNumber, ctx) {
return uhhNumber.value
}
```This function takes the input (here called `uhhNumber`) and returns a representation of it that consists of only structural data and JSON-legal values. The representation can be anything you choose - an object, a number, a string, and so on. So the following are all valid representations:
- `"abcd"`
- `null`
- `1500`
- `{ value: 10}`
- `[1]`
- `[[[[[10]]]]]`The following are not:
- `new Uint8Array()`
- `function () {console.log("hello world")}`
- [`100n`](https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Global_Objects/BigInt#description)
- `undefined`
- [`document.all`](https://developer.mozilla.org/en-US/docs/Web/API/Document/all)Preszr won't check your results, though (for performance reasons), and if you return anything weird, it can lead to _undefined behavior_.
Each `encode` function is supposed to only ever encode a single prototype. When it stumbles on some internal value (that of a property, an element of a collection, or something else), it can give control back so `preszr` can handle encoding it. It does that using the `ctx.encode` method.
`ctx.encode` differs from the previous functions called `encode` we discussed. It's only for encoding the internals of an object - it doesn't return a _preszr message_ or anything like that.
Instead, it will always return a JSON-legal primitive that you just need to plug in the right place - either the value itself, an encoded string, or a reference (which is also a string).
At any rate, it makes writing an encoding for something extremely simple. Even for a complex object, you just need to figure out how to represent the structure of its internal values.
Let's look at the encoding of [Set](https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Global_Objects/Set) as an example:
```javascript
function encode(set, ctx) {
const result = []
for (const item of set) {
result.push(
// We don't need to worry about encoding the elements,
// and just let preszr handle it.
ctx.encode(item)
)
}
return result
}
```It's usually that simple.
`ctx.encode` can recurse - if it needs to encode a value of the type you're encoding down the line, it will call your `encode` function again.
Calls to `encode` are generally hard to predict, so it's important to make sure your `encode` function doesn't cause side effects or have an internal state.
## Decoding
Decoding is a two-step process:
1. You first **CREATE** the object based on its _preszr representation_, without decoding any internal data.
2. Then you **INIT** it by decoding the internal data and putting it where it belongs.A _decoder object_ is just an object with those functions. However, both are optional.
```javascript
const decoder = {
create(input) {},
init(input, ctx) {}
}
```### CREATE
```typescript
// CREATE stage for Set:
function create(encodedInput) {
return new Set()
}
```During this stage, you need to return an instance with the correct prototype.
However, you can't decode any internal data that was encoded using `ctx.encode`, since other objects might not have had their **CREATE** step execute, so there is nothing `ctx.encode` can return.
The result of this stage will be an empty, uninitialized object. It'll have fields that are `undefined`, methods that don't work, etc.
On the other hand, if you don't need to decode internal data (e.g. the object can't have references to other objects), you don't need the next stage at all, and just this function will be enough. One example is `ArrayBuffer`, which is just encoded as a base64 string.
```javascript
const decoder = {
create(base64) {
return base64ToArrayBuffer(base64)
}
}
```**If you don't have a `create` method, the object will be created using [`Object.create`](https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Global_Objects/Object/create) and left blank. This will often be enough.**
### INIT
```javascript
// INIT stage for Set:
function init(target, encodedInput, ctx) {
// We chose our `input` to be an array.
for (const element of encodedInput) {
target.push(
ctx.decode(element);
)
}
}
```During this stage, we initialize the object that was created in the **CREATE** stage (which we get through the `target` parameter). This time, we can use the `ctx` we get to decode internal data, much in the same way we did when we were encoding.
`ctx.decode` lets us decode an encoded value. Its input is the output of the `ctx.encode` function we used while encoding. The result will be the proper, decoded form of what you gave it. It will resolve references, decode encoded strings, or (if was JSON-legal in the first place) just return the value is it is.
Unlike `ctx.encode`, `ctx.decode` is not recursive. Objects you get from it will have had their **CREATE** stage execute, but not their **INIT** stage, so they might have properties that are `undefined ` and methods that don't work.
**Your `init` function should not return anything and its return value will be ignored.**
**If you don't have an `init` function, this stage will do nothing.**
### Empty decoders
You can have an empty decoder object, `{}`. An empty decoder object will create the object using `Object.create` and not initialize anything. I don't know why you'd want to do that though. Maybe if you're just testing the encoding part.
## Symbols
`preszr` can also deal with [JavaScript symbols](https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Global_Objects/Symbol), both as values and as object keys. Just like with prototypes, `preszr` knows about all the [built-in symbols](https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Global_Objects/Symbol#static_properties) (though most are not relevant to data), but you'll need to tell it about your custom symbols so it can reproduce them.
To do that, you can just put them in the `encodes` array of the `Preszr` configuration:
```javascript
const mySecretSymbol = Symbol("A secret symbol")
const przWithSymbols = new Preszr({
encodes: [mySecretSymbol]
})
```This creates a _symbol encoding_, a different type of encoding than we talked about earlier (which was actually a _prototype encoding_). Symbol encodings don't have the features of prototype encodings. Here is what a complete symbol encoding looks like:
```javascript
const przWithSymbols = new Preszr({
encodes: [{
encodes: mySecretSymbol,
name: "MySecretSymbol"
}]
)
const symbolEncoding =
```Like with prototype encodings, `encodes` is required but `name` can be omitted - that's what happens if you just provide the symbol. In that case, the name is inferred from the symbol description. If the symbol doesn't have a description or if there is a collision, you'll need to provide the property after all.
Symbol encodings go into the same `encodes` array as other encodings.
If `preszr` encounters a symbol it doesn't recognize, it won't ignore it or error. It will instead replace all of its appearances with a stand-in symbol. This is similar to what it does with [unsupported values](docs/supported.md). The symbol's description will be similar to:
```
preszr unrecognized ${description}
```# Versioning
`preszr`'s internal versioning system for custom encodings has been designed to handle two specific use-cases:
1. Reading legacy data, such as data that was written to disk before a change in your objects was made.
2. Overriding built-in encodings, such as for `Set` and the like.But first, let's look at how `preszr` manages these versions in general.
### Versions
`preszr` identifies each encoding with an encoding key. For prototype encodings, this key will be a combination of the _name_ and _version_ of the encoding:
```javascript
;`${ENCODING_NAME}.v${ENCODING_VERSION}`
```The rule is that two encodings with the same _key_ can't exist on a `Preszr` object.
- For built-in encodings, the version is always `0`. Instead of using this system, **any change in a built-in encoding will cause a change in the library’s major version.**
- For user-defined encodings, the version defaults to `1`, but can be any positive, [safe](https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Global_Objects/Number/isSafeInteger#:~:text=A%20safe%20integer%20is%20an,fit%20the%20IEEE%2D754%20representation.) integer.Another rule is that two encodings with the same name must encode the same prototype.
When `preszr` receives a value to encode, it will encode it using the encoding with the highest version, noting that version down in the _preszr message_. It won't do this when decoding, though. Instead, `preszr` will try to find a decoder that matches the encoding key exactly. If it can't find, it will throw an error.
This follows the [robustness principle](https://en.wikipedia.org/wiki/Robustness_principle):
> Be conservative in what you send, be liberal in what you accept
It means that you'll be able to read legacy data, but never write it, and updating it is as easy as decoding and then re-encoding it.
When you version an encoding, there are a few more restrictions you have to follow:
- If you have a versioned encoding, each instance must have the `version` property - it will never be inferred. This makes it clear that your encoding is versioned.
- You must also specify the `name` property. This is because a change in your object can cause the inferred name to change.Versioned encodings still go into the `encodes` list of encodings, in any order, as separate objects. Grouping them together doesn't do anything.
### How a version change would work
When you want to make a change to how your object is encoded while still being able to read old data, you need to do the following:
1. Create a new version of your encoding. Your logic can be as similar or different as you want.
2. If, after the modification, your object needs new or different data, modify each previous version of the encoding you want to support so that it returns a compatible object.## Overriding a built-in encoding
Versions work the same way for built-in encodings, except that you're not allowed to set their `name` property when you define an override - they're identified uniquely by their prototype, and it would just be confusing.
Built-in encodings will always have the version `0`, so you can start your versions from `1`, but `version` is still required.