https://github.com/jsuereth/sauerkraut
A reimagined scala-pickling in the Scala 3 world
- Host: GitHub
- URL: https://github.com/jsuereth/sauerkraut
- Owner: jsuereth
- License: apache-2.0
- Created: 2020-02-13T12:00:53.000Z (almost 5 years ago)
- Default Branch: main
- Last Pushed: 2023-05-20T15:14:22.000Z (over 1 year ago)
- Last Synced: 2024-10-14T17:17:48.092Z (3 months ago)
- Topics: picklers, scala, serialization-framework
- Language: Scala
- Size: 323 KB
- Stars: 73
- Watchers: 9
- Forks: 2
- Open Issues: 0
Metadata Files:
- Readme: README.md
- Contributing: CONTRIBUTING.md
- License: LICENSE.md
- Code of conduct: CODE_OF_CONDUCT.md
# Sauerkraut
The library for those cabbage lovers out there who want
to send data over the wire.

A revitalization of [Pickling](http://github.com/scala/pickling) in the
[Scala 3](https://dotty.epfl.ch/docs/index.html) world.

## Usage
When defining over-the-wire messages, do this:
```scala
import sauerkraut.core.{Buildable,Writer,given}
case class MyMessage(field: String, data: Int)
  derives Buildable, Writer
```

Then, when you need to serialize, pick a format and go:
```scala
import java.io.StringWriter
import sauerkraut.format.json.{Json,given}
import sauerkraut.{pickle,read,write}

val out = StringWriter()
pickle(Json).to(out).write(MyMessage("test", 1))
println(out.toString())

val msg = pickle(Json).from(out.toString()).read[MyMessage]
```

# Current Formats
Here's a feature matrix for each format:
| Format | Reader | Writer | All Types | Evolution Friendly | Notes                            |
| ------ | ------ | ------ | --------- | ------------------ | -------------------------------- |
| Json   | Yes    | Yes    | Yes       | Yes                | Uses Jawn for parsing            |
| Protos | Yes    | Yes    | Yes       | Yes                | Evolution-friendly binary format |
| NBT    | Yes    | Yes    | Yes       |                    | For the kids.                    |
| XML    | Yes    | Yes    | Yes       |                    | Inefficient prototype.           |
| Pretty | No     | Yes    | No        |                    | For pretty-printing strings      |

See [Compliance](compliance/README.md) for more details on what this means.
## Json
Everyone's favorite non-YAML web data transfer format! This uses Jawn under the covers for parsing, but
can write Json without any dependencies.

Example:
```scala
import sauerkraut.{pickle,read,write}
import sauerkraut.core.{Buildable,Writer, given}
import sauerkraut.format.json.Json

case class MyWebData(value: Int, someStuff: Array[String])
  derives Buildable, Writer

def read(in: java.io.InputStream): MyWebData =
  pickle(Json).from(in).read[MyWebData]
def write(out: java.io.OutputStream): Unit =
  pickle(Json).to(out).write(MyWebData(1214, Array("this", "is", "a", "test")))
```

sbt build:
```scala
libraryDependencies += "com.jsuereth.sauerkraut" %% "json" % ""
```

See [json project](json/README.md) for more information.
## Protos
A new encoding for protocol buffers within Scala! This supports a subset of all possible protocol buffer messages
but allows full definition of the message format within your Scala code.

Example:
```scala
import sauerkraut.{pickle,write,read, Field}
import sauerkraut.core.{Writer, Buildable, given}
import sauerkraut.format.pb.{Proto,given}

case class MyMessageData(value: Int @Field(3), someStuff: Array[String] @Field(2))
  derives Writer, Buildable

def write(out: java.io.OutputStream): Unit =
  pickle(Proto).to(out).write(MyMessageData(1214, Array("this", "is", "a", "test")))
```

This example serializes to the equivalent of the following protocol buffer message:
```proto
message MyMessageData {
  int32 value = 3;
  repeated string someStuff = 2;
}
```

sbt build:
```scala
libraryDependencies += "com.jsuereth.sauerkraut" %% "pb" % ""
```

See [pb project](pb/README.md) for more information.
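To make the `@Field` numbers in the Protos example concrete, here is a minimal, self-contained sketch of how the standard protocol buffer wire format encodes a message with `int32 value = 3` and `repeated string someStuff = 2`. This is not sauerkraut's implementation: `ProtoWireSketch` and its methods are hypothetical names used only for illustration, and the sketch writes fields in declaration order, which may differ from the order the library emits.

```scala
import java.io.ByteArrayOutputStream
import java.nio.charset.StandardCharsets

// Illustration of the standard protobuf wire encoding; NOT sauerkraut code.
object ProtoWireSketch:
  /** Writes a base-128 varint, least-significant 7-bit group first. */
  def writeVarint(out: ByteArrayOutputStream, value: Long): Unit =
    var v = value
    while v >>> 7 != 0L do
      out.write(((v & 0x7F) | 0x80).toInt) // set the continuation bit
      v = v >>> 7
    out.write(v.toInt)

  /** A field key is (fieldNumber << 3) | wireType. */
  def writeKey(out: ByteArrayOutputStream, fieldNumber: Int, wireType: Int): Unit =
    writeVarint(out, (fieldNumber.toLong << 3) | wireType)

  /** int32 fields use wire type 0 (varint); negatives are sign-extended to 64 bits. */
  def writeInt32(out: ByteArrayOutputStream, fieldNumber: Int, value: Int): Unit =
    writeKey(out, fieldNumber, 0)
    writeVarint(out, value.toLong)

  /** string fields use wire type 2 (length-delimited UTF-8 bytes). */
  def writeString(out: ByteArrayOutputStream, fieldNumber: Int, value: String): Unit =
    val bytes = value.getBytes(StandardCharsets.UTF_8)
    writeKey(out, fieldNumber, 2)
    writeVarint(out, bytes.length.toLong)
    out.write(bytes, 0, bytes.length)

  /** Encodes the equivalent of MyMessageData(value, someStuff). */
  def encode(value: Int, someStuff: Seq[String]): Array[Byte] =
    val out = new ByteArrayOutputStream
    writeInt32(out, 3, value)                       // int32 value = 3
    someStuff.foreach(s => writeString(out, 2, s))  // repeated string someStuff = 2
    out.toByteArray
```

With this encoding, the value `1214` in field 3 becomes the three bytes `0x18 0xBE 0x09`: the key `(3 << 3) | 0`, followed by the two-byte varint for 1214. Because readers identify fields by number rather than by name, renaming a field in Scala does not change the bytes on the wire.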
# NBT
Named-Binary-Tags, a format popularized by Minecraft.

Example:
```scala
import sauerkraut.{pickle,read,write}
import sauerkraut.core.{Buildable,Writer, given}
import sauerkraut.format.nbt.Nbt

case class MyGameData(value: Int, someStuff: Array[String])
  derives Buildable, Writer

def read(in: java.io.InputStream): MyGameData =
  pickle(Nbt).from(in).read[MyGameData]
def write(out: java.io.OutputStream): Unit =
  pickle(Nbt).to(out).write(MyGameData(1214, Array("this", "is", "a", "test")))
```

sbt build:
```scala
libraryDependencies += "com.jsuereth.sauerkraut" %% "nbt" % ""
```

See [nbt project](nbt/README.md) for more information.
# XML
Everyone's favorite markup language for data transfer!

Example:
```scala
import sauerkraut.{pickle,read,write}
import sauerkraut.core.{Buildable,Writer, given}
import sauerkraut.format.xml.{Xml, given}

case class MySlowWebData(value: Int, someStuff: Array[String])
  derives Buildable, Writer

def read(in: java.io.InputStream): MySlowWebData =
  pickle(Xml).from(in).read[MySlowWebData]
def write(out: java.io.Writer): Unit =
  pickle(Xml).to(out).write(MySlowWebData(1214, Array("this", "is", "a", "test")))
```

sbt build:
```scala
libraryDependencies += "com.jsuereth.sauerkraut" %% "xml" % ""
```

See [xml project](xml/README.md) for more information.
# Pretty
A format that is solely used to pretty-print object contents to strings. This does not have
a `PickleReader`, only a `PickleWriter`.

Example:
```scala
import sauerkraut._, sauerkraut.core.{Writer,given}
case class MyAwesomeData(theBest: Int, theCoolest: String) derives Writer

scala> MyAwesomeData(1, "The Greatest").prettyPrint
val res0: String = Struct(rs$line$2.MyAwesomeData) {
  theBest: 1
  theCoolest: The Greatest
}
```

# Design
We split serialization into three layers:

1. The `source` layer. It is expected these are some kind of stream.
2. The `Format` layer. This is responsible for reading a raw source and converting into
   the component types used in the `Shape` layer. See `PickleReader` and `PickleWriter`.
3. The `Shape` layer. This is responsible for turning Primitives, Structs, Choices and Collections
   into component types.

It's the circle of data:

```
Source => format => shape => memory => shape => format => Destination
[PickleData] => PickleReader => Builder[T] => T => Writer[T] => PickleWriter => [PickleData]
```

This, hopefully, means we can reuse a lot of logic between various formats with little loss of efficiency.
*Note: This library is not measuring performance yet.*
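The circle of data can be illustrated with a toy, self-contained model of the layers. Every name below (`ToyPickleWriter`, `ToyWriter`, `ToyBuilder`, `ToyBuildable`, `ToyFormat`, `Point`) is hypothetical and far simpler than the real `sauerkraut.core` and `sauerkraut.format` types; the sketch only assumes the push-based flow described above, with a made-up `name=value;name=value` text format and a `String` source.

```scala
// Toy model of the layers; hypothetical names, NOT the sauerkraut API.

// Format layer, write side: accepts pushed fields and places them into a pickle.
trait ToyPickleWriter:
  def field(name: String, value: String): Unit

// Shape layer, write side: translates a value of T into calls on the format layer.
trait ToyWriter[T]:
  def write(value: T, out: ToyPickleWriter): Unit

// Shape layer, read side: the format layer pushes fields in, then asks for the result.
trait ToyBuilder[T]:
  def field(name: String, value: String): Unit
  def result: T

trait ToyBuildable[T]:
  def newBuilder: ToyBuilder[T]

// A made-up format: "name=value;name=value" over a StringBuilder / String source.
object ToyFormat:
  def writer(sink: StringBuilder): ToyPickleWriter =
    new ToyPickleWriter:
      def field(name: String, value: String): Unit =
        if sink.nonEmpty then sink.append(';')
        sink.append(name).append('=').append(value)

  def read[T](source: String)(using b: ToyBuildable[T]): T =
    val builder = b.newBuilder
    source.split(';').filter(_.indexOf('=') > 0).foreach { entry =>
      val idx = entry.indexOf('=')
      builder.field(entry.substring(0, idx), entry.substring(idx + 1))
    }
    builder.result

final case class Point(x: Int, label: String)

object Point:
  given ToyWriter[Point] with
    def write(value: Point, out: ToyPickleWriter): Unit =
      out.field("x", value.x.toString)
      out.field("label", value.label)

  given ToyBuildable[Point] with
    def newBuilder: ToyBuilder[Point] =
      new ToyBuilder[Point]:
        private var x = 0
        private var label = ""
        def field(name: String, value: String): Unit =
          name match
            case "x"     => x = value.toInt
            case "label" => label = value
            case _       => () // unknown fields are ignored
        def result: Point = Point(x, label)

// memory => shape => format => destination
def toyPickle[T](value: T)(using w: ToyWriter[T]): String =
  val sink = new StringBuilder
  w.write(value, ToyFormat.writer(sink))
  sink.toString
```

In this model, `toyPickle(Point(3, "origin"))` produces `x=3;label=origin`, and `ToyFormat.read[Point]` turns that string back into an equal `Point`. The reader never inspects the Scala type directly: it only pushes named values into a builder, which is the property that lets different formats share the same shape-layer code.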
### Shape layer
The Shape layer is responsible for extracting Scala types into known shapes that can be used for
serialization. These shapes, currently, are `Collection`, `Structure` and `Primitive`. Custom
shapes can be created in terms of these three shapes.

The Shape layer defines these three classes:

- `sauerkraut.core.Writer[T]`:
  Can translate a value into `write*` calls of Primitive, Structure or Collection.
- `sauerkraut.core.Builder[T]`:
  Can accept an incoming stream of collections/structures/primitives and build a value of T from them.
- `sauerkraut.core.Buildable[T]`:
  Can provide a `Builder[T]` when asked.

### Format layer
The format layer is responsible for mapping sauerkraut shapes (`Collection`, `Structure`, `Primitive`, `Choice`) into
the underlying format. Not all shapes in sauerkraut will map exactly to underlying formats, and so each
format may need to adjust/tweak incoming data as appropriate.

The format layer has these primary classes:

- `sauerkraut.format.PickleReader`: Can load data and push it into a Builder of type T
- `sauerkraut.format.PickleWriter`: Accepts pushed structures/collections/primitives and places them into a Pickle

### Source Layer
The `source` layer is allowed to be any type that a format wishes to support. Inputs and outputs are
provided to the API via these two classes:

- `sauerkraut.format.PickleReaderSupport[Input, Format]`:
  A given instance of this type allows a `PickleReader` to be constructed from a type of input.
- `sauerkraut.format.PickleWriterSupport[Output,Format]`:
  A given instance of this type allows a `PickleWriter` to be constructed from a type of output.

This layer is designed to support any type of input and output, not just an in-memory store (like a Json Ast) or
a streaming input. Formats can define what types of input/output (or execution environment) they allow.

## Writing a new format.
New formats are expected to provide the "format" + "source" layer implementations they require.
TODO - a bit more here.
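As a rough illustration of what providing the "format" + "source" layers could look like, here is a minimal, self-contained sketch. It does not use the real `PickleWriter` or `PickleWriterSupport` traits, whose actual signatures are not shown in this README; `SketchPickleWriter`, `SketchWriterSupport`, `Csv`, `CsvLineWriter` and `writerTo` are all hypothetical names. The only idea taken from the design above is that a given instance keyed on an (output type, format) pair decides how a writer is constructed for that output.

```scala
import java.io.{ByteArrayOutputStream, OutputStream, StringWriter, Writer}
import java.nio.charset.StandardCharsets
import scala.collection.mutable.ArrayBuffer

// Hypothetical stand-ins; NOT the real sauerkraut.format traits.
trait SketchPickleWriter:
  def putField(name: String, value: String): Unit
  def flush(): Unit

trait SketchWriterSupport[Output, Format]:
  def writerFor(format: Format, output: Output): SketchPickleWriter

// "Format" layer: a marker object plus the encoding logic (one CSV line per value).
object Csv

final class CsvLineWriter(emit: String => Unit) extends SketchPickleWriter:
  private val cells = ArrayBuffer.empty[String]
  // Field names are dropped in this toy format; only values are kept, in push order.
  def putField(name: String, value: String): Unit = cells += value
  def flush(): Unit = emit(cells.mkString(",") + "\n")

// "Source" layer: one given per output type the format chooses to support.
given csvToWriter: SketchWriterSupport[Writer, Csv.type] with
  def writerFor(format: Csv.type, output: Writer): SketchPickleWriter =
    CsvLineWriter(line => output.write(line))

given csvToStream: SketchWriterSupport[OutputStream, Csv.type] with
  def writerFor(format: Csv.type, output: OutputStream): SketchPickleWriter =
    CsvLineWriter(line => output.write(line.getBytes(StandardCharsets.UTF_8)))

// Entry point: compiles only if a given exists for this (output, format) pair.
def writerTo[O, F](format: F, output: O)(using support: SketchWriterSupport[O, F]): SketchPickleWriter =
  support.writerFor(format, output)
```

Usage under these assumptions: `writerTo(Csv, out: Writer)` resolves `csvToWriter`, while `writerTo(Csv, out: OutputStream)` resolves `csvToStream`. The sketch keeps `SketchWriterSupport` invariant for simplicity, so the output must be typed as exactly `Writer` or `OutputStream`; an unsupported output type is rejected at compile time because no given is found.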
# Differences from Scala Pickling
There are a few major differences from the old [scala pickling project](http://github.com/scala/pickling).
- The core library is built for 100% static code generation. While we think that dynamic (i.e. runtime-reflection-based)
  pickling could be built using this library, it is a non-goal.
- Users are expected to rely on typeclass derivation to generate Reader/Writers, rather than using macros.
- The supported types that can be pickled are limited to those supported by typeclass derivation or those that
  can have hand-written `Writer[_]`/`Builder[_]` instances.
- Readers are no longer driven by the Scala type. Instead we use a new `Buildable[A]`/`Builder[A]` design
  to allow each `PickleReader` to push values into a `Builder[A]` that will then construct the Scala class.
- There have been no runtime performance optimisations around codegen. Those will come as we test the
  limits of Scala 3 / Dotty.
- Format implementations are separate libraries.
- The `PickleWriter` contract has been split into several types to avoid misuse. This places a heavier amount
  of lambdas in play, but may be offset by optimisations in modern versions of Scala/JVM.
- The name is more German.

# Benchmarking
Benchmarking is still being built out, and is pending the final design on Choice/Sum-Types within the Format/Shape layer.
You can see benchmark results via: `benchmarks/jmh:run -rf csv`.
Latest status/analysis can be found in the [benchmarks directory](benchmarks/latest-results.md).
## Benchmarking TODOs
- [X] Basic comparison of all formats
- [X] Size-of-Pickle measurement
- [ ] Well-thought out dataset for reading/writing
- [X] Isolated read vs. write testing
- [ ] Comparison against other frameworks.
  - [X] Protos vs. protocol buffer java implementation
  - [ ] Json Reading vs. raw JAWN to AST (measure overhead)
  - [ ] Jackson
  - [X] Kryo
  - [ ] Thrift
  - [ ] Circe
  - [X] uPickle
- [ ] Automatic well-formatted graph dump in Markdown of results.

# Thanks
Thanks to everyone who contributed to the original pickling library for inspiration, with a few callouts.
- Heather Miller + Philipp Haller for the original idea, innovation and motivation for Scala.
- Havoc Pennington + Eugene Yokota for helping define what's important when pickling a protocol and evolving that protocol.