https://github.com/hebirobotics/quickbuffers
Java Protobuf implementation suitable for real-time environments
java protobuf protocol-buffers
- Host: GitHub
- URL: https://github.com/hebirobotics/quickbuffers
- Owner: HebiRobotics
- License: apache-2.0
- Created: 2019-08-05T16:50:48.000Z (over 6 years ago)
- Default Branch: main
- Last Pushed: 2024-02-02T11:26:45.000Z (almost 2 years ago)
- Last Synced: 2024-05-10T22:07:50.246Z (over 1 year ago)
- Topics: java, protobuf, protocol-buffers
- Language: Java
- Homepage:
- Size: 2.16 MB
- Stars: 111
- Watchers: 9
- Forks: 10
- Open Issues: 1
Metadata Files:
- Readme: README.md
- License: LICENSE
# QuickBuffers - Fast Protocol Buffers without Allocations
QuickBuffers is a Java implementation of [Google's Protocol Buffers](https://developers.google.com/protocol-buffers/) that has been developed for low latency use cases in zero-allocation environments. It has no external dependencies, and the API follows Protobuf-Java where feasible to simplify migration.
The main highlights are:
* **Allocation-free** in steady state. All parts of the API are mutable and reusable.
* **No reflection**. GraalVM native images and R8/ProGuard obfuscation ([config](#proguard-configuration)) are supported out of the box.
* **Faster** encoding and decoding [speed](./benchmarks)
* **Smaller** code size than protobuf-javalite
* **Built-in JSON** marshalling compliant with the [proto3 mapping](https://developers.google.com/protocol-buffers/docs/proto3#json)
* **Improved order** for optimized [sequential memory access](order.md)
* **Optional accessors** as an opt-in feature (Java 8)
QuickBuffers passes all [proto2 conformance tests](./conformance) and is compatible with all Java versions from 6 through 20 as well as Android. Proto3 messages can be generated and are wire compatible, but the proto3-specific behavioral differences have not been implemented so far due to some [proto3 design decisions](proto3.md) that have kept us from using it. Current limitations include:
* [Services](https://developers.google.com/protocol-buffers/docs/proto#services) are not implemented
* [Extensions](https://developers.google.com/protocol-buffers/docs/proto#extensions) are embedded directly into the extended message, so support is limited to generation time.
* The [proto files](https://github.com/protocolbuffers/protobuf/tree/main/src/google/protobuf) for [well-known proto3 types](https://protobuf.dev/reference/protobuf/google.protobuf/) such as `timestamp.proto` or `duration.proto` need to be included manually. Their special-cased JSON representations are not implemented.
* Unsigned integer types are JSON encoded as signed integer numbers
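The last limitation presumably follows from Java having no unsigned primitive types: a `uint32` value above `Integer.MAX_VALUE` is held in a signed `int`, so it shows up as a negative number. The sketch below uses only the JDK (Java 8 or higher) and no QuickBuffers classes. The class and method names are made up for illustration, and it only shows how a consumer could recover the unsigned value on its side.

```java
// Stand-alone JDK sketch (Java 8+); no QuickBuffers classes are involved.
final class UnsignedDemo {

    // Reinterprets the 32 bits of a signed int as an unsigned value.
    static long uint32ToLong(int raw) {
        return Integer.toUnsignedLong(raw);
    }

    // Formats the 64 bits of a signed long as an unsigned decimal string.
    static String uint64ToString(long raw) {
        return Long.toUnsignedString(raw);
    }

    public static void main(String[] args) {
        int raw = (int) 4_000_000_000L; // bit pattern of the uint32 value 4000000000
        System.out.println(raw);                 // signed view of the same bits
        System.out.println(uint32ToLong(raw));   // unsigned view
        System.out.println(uint64ToString(-1L)); // largest uint64 value
    }
}
```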
## Getting started
In order to use QuickBuffers you need to generate messages and add the corresponding runtime dependency. The runtime can be found at the Maven coordinates below.
```xml
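<properties>
    <quickbuf.version>1.4</quickbuf.version>
    <quickbuf.options>indent=4,allocation=lazy,extensions=embedded</quickbuf.options>
</properties>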
```
```XML
<dependency>
    <groupId>us.hebi.quickbuf</groupId>
    <artifactId>quickbuf-runtime</artifactId>
    <version>${quickbuf.version}</version>
</dependency>
```
The message generator `protoc-gen-quickbuf` is set up as a plugin for the protocol buffers compiler `protoc`. You can install one of the [pre-built packages](https://hebirobotics.github.io/QuickBuffers/download.html) and run:
```sh
protoc-quickbuf --quickbuf_out=${options}:${outputDir} ${protoFiles}
```
or use a [protoc-gen-quickbuf-${version}-${arch}.exe](https://repo1.maven.org/maven2/us/hebi/quickbuf/protoc-gen-quickbuf/) plugin binary with an absolute `pluginPath`:
```sh
protoc --plugin=protoc-gen-quickbuf=${exePath} --quickbuf_out=${options}:${outputDir} ${protoFiles}
```
or build messages in Maven using the [protoc-jar-maven-plugin](https://github.com/os72/protoc-jar-maven-plugin):
```xml
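<plugin>
    <groupId>com.github.os72</groupId>
    <artifactId>protoc-jar-maven-plugin</artifactId>
    <version>3.11.4</version>
    <executions><execution>
        <phase>generate-sources</phase>
        <goals><goal>run</goal></goals>
        <configuration>
            <protocVersion>3.21.12</protocVersion>
            <outputTargets><outputTarget>
                <type>quickbuf</type>
                <pluginArtifact>us.hebi.quickbuf:protoc-gen-quickbuf:${quickbuf.version}</pluginArtifact>
                <outputOptions>${quickbuf.options}</outputOptions>
            </outputTarget></outputTargets>
        </configuration>
    </execution></executions>
</plugin>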
```
The generator features several options that can be supplied as a comma-separated list. The default values are marked bold.
| Option | Value | Description |
|:-------------------------|:---------------------------|:------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|
| **indent** | **2**, 4, 8, tab | sets the indentation in generated files |
| **replace_package** | (pattern)=replacement | replaces the Java package of the generated messages to avoid name collisions with messages generated by `--java_out`. |
| **input_order** | **quickbuf**, number, none | improves decoding performance when parsing messages that were serialized in a known order. `number` matches protobuf-java, and `none` disables this optimization (not recommended). |
| **output_order** | **quickbuf**, number | `number` matches protobuf-java serialization to pass conformance tests that require binary equivalence (not recommended). |
| **store_unknown_fields** | **false**, true | generates code to retain unknown fields that were encountered during parsing. This allows messages to be routed without losing information, even if the schema is not fully known. Unknown fields are stored in binary form and are ignored in equality checks. |
| **enforce_has_checks** | **false**, true | throws an exception when accessing fields that were not set |
| **allocation** | **eager**, lazy, lazymsg | changes the allocation strategy for nested types. `eager` allocates up-front and results in fewer runtime-allocations, but it may be wasteful and prohibits recursive type declarations. `lazy` waits until the field is actually needed. `lazymsg` acts lazy for nested messages, and eager for everything else. |
| **extensions** | **disabled**, embedded | `embedded` adds extensions from within a single protoc call directly to the extended message. This requires extensions to be known at generation time. Some plugins may do a separate request per file, so it may require an import to combine multiple files. |
| **java8_optional** | **false**, true | creates `tryGet` methods that are short for `return hasField() ? Optional.of(getField()) : Optional.empty()`. Requires a runtime with Java 8 or higher. |
| **gen_descriptors** | **false**, true | creates `descriptor` information for integrating with reflection in existing tools. |
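To make the `enforce_has_checks` and `java8_optional` rows more concrete, here is a hand-written sketch of the accessor behavior they describe. `SampleMessage` and its `value` field are hypothetical stand-ins rather than generator output, and the exception type and `Optional` flavor used by the real generated code may differ.

```java
import java.util.Optional;

// Hand-written stand-in for a generated message; NOT actual generator output.
final class SampleMessage {
    private int value;
    private boolean hasValue;

    SampleMessage setValue(int value) {
        this.value = value;
        this.hasValue = true; // setters mark the field as present
        return this;
    }

    boolean hasValue() {
        return hasValue;
    }

    // Behavior described for enforce_has_checks=true: reading an unset field throws.
    // The exception type is an assumption made for this sketch.
    int getValue() {
        if (!hasValue) {
            throw new IllegalStateException("Field 'value' is not set");
        }
        return value;
    }

    // Behavior described for java8_optional=true: wraps the has check in an Optional.
    Optional<Integer> tryGetValue() {
        return hasValue ? Optional.of(value) : Optional.empty();
    }
}
```

With this shape, callers can write `msg.tryGetValue().orElse(0)` instead of branching on `hasValue()` themselves.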
## Reading and writing messages
We tried to keep the public API as close to Google's `protobuf-java` as possible, so most use cases should require very few changes. The Java-related file options are all supported and behave the same way.
```protobuf
// .proto definition
message RootMessage {
    optional string text = 1;
    optional NestedMessage nested_message = 2;
    repeated Person people_list = 3;
}
message NestedMessage {
    optional double value = 1;
}
message Person {
    optional uint32 id = 1;
    optional string name = 2;
}
```
The main difference is that there are no extra builder classes and that all message contents are mutable. The `getMutable()` accessors set the has flag and provide access to the nested references.
```Java
// Use fluent-style to set values
RootMessage msg = RootMessage.newInstance()
    .setText("Hello World");

// Use getMutable() to set nested messages
msg.getMutableNestedMessage()
    .setValue(1.0);

// Write repeated values into the internally allocated list
RepeatedMessage<Person> people = msg.getMutablePeopleList().reserve(4);
for (int i = 0; i < 4; i++) {
    Person person = people.next()
        .setId(i)
        .setName("person " + i);
}
```
Messages can be read from a `ProtoSource` and written to a `ProtoSink`. `newInstance` instantiates optimized implementations for accessing contiguous blocks of memory such as `byte[]` and `ByteBuffer`. Reads and writes do not modify the `ByteBuffer` state, so positions and limits need to be set manually if needed.
```Java
// Convenience wrappers
byte[] buffer = msg.toByteArray();
RootMessage result = RootMessage.parseFrom(buffer);
assertEquals(result, msg);
```
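The note about `ByteBuffer` state can be illustrated with plain NIO. The sketch below is not QuickBuffers code. It mimics a writer that uses absolute access, which leaves the buffer's position and limit untouched, so the caller has to set the readable window itself. `BufferStateDemo` and `writeAbsolute` are names invented for this example.

```java
import java.nio.ByteBuffer;

// Plain NIO sketch of a writer that uses absolute access; NOT QuickBuffers code.
final class BufferStateDemo {

    // Writes the payload with absolute puts, so position and limit stay untouched.
    static int writeAbsolute(ByteBuffer buffer, byte[] payload) {
        int start = buffer.position();
        for (int i = 0; i < payload.length; i++) {
            buffer.put(start + i, payload[i]);
        }
        return payload.length;
    }

    public static void main(String[] args) {
        ByteBuffer buffer = ByteBuffer.allocate(16);
        int written = writeAbsolute(buffer, new byte[]{1, 2, 3});

        // The buffer state is unchanged, so the caller sets the readable window manually
        buffer.limit(buffer.position() + written);
        System.out.println(buffer.remaining());
    }
}
```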
The internal state can be reset with the `setInput` and `setOutput` methods. `ProtoMessage::getSerializedSize` sets an internally cached size, so it should always be called before serialization if there were any changes.
```Java
// Reusable objects
byte[] buffer = new byte[512];
ProtoSink sink = ProtoSink.newArraySink();
ProtoSource source = ProtoSource.newArraySource();

// Stream messages
for (int i = 0; i < 100; i++) {
    int length = msg.getSerializedSize();
    msg.writeTo(sink.setOutput(buffer, 0, length));
    result.clearQuick().mergeFrom(source.setInput(buffer, 0, length));
}
```
There are also (non-optimized) convenience wrappers for `InputStream`, `OutputStream`, and `ByteBuffer`.
```Java
ProtoSink.newInstance(new ByteArrayOutputStream());
ProtoSource.newInstance(new ByteArrayInputStream(bytes));
```
Keep in mind that mutability comes at the cost of thread-safety, so contents should be cloned with `ProtoMessage::clone` or copied with `ProtoMessage::copyFrom` before being passed to another thread.
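The copy-before-hand-off rule can be sketched without the library. `Reading` below is a hand-written stand-in for a mutable, reusable message, and its `copyFrom` only mirrors the idea of `ProtoMessage::copyFrom`. All names in this example are hypothetical.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Hand-written stand-in for a mutable, reusable message; NOT a QuickBuffers class.
final class Reading {
    private double value;

    Reading setValue(double value) { this.value = value; return this; }
    double getValue() { return value; }

    // Copies the contents of another instance into this one.
    Reading copyFrom(Reading other) { this.value = other.value; return this; }
}

final class HandOffDemo {

    // Hands a private snapshot to the worker so the producer can keep reusing its instance.
    static double handOff(Reading reused, ExecutorService worker) {
        Reading snapshot = new Reading().copyFrom(reused); // copy before crossing threads
        Callable<Double> task = snapshot::getValue;
        Future<Double> pending = worker.submit(task);
        reused.setValue(-1); // the producer immediately overwrites its own instance
        try {
            return pending.get();
        } catch (InterruptedException | ExecutionException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        ExecutorService worker = Executors.newSingleThreadExecutor();
        try {
            System.out.println(handOff(new Reading().setValue(42.0), worker));
        } finally {
            worker.shutdown();
        }
    }
}
```

The worker only ever sees the snapshot, so the later `setValue(-1)` on the reused instance cannot race with it.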
**Direct Source/Sink**
Depending on platform support for `sun.misc.Unsafe`, the `DirectSource` and `DirectSink` implementations allow working with off-heap memory. This is intended for reducing unnecessary memory copies when working with direct NIO buffers. Besides not needing to copy data, there is no performance benefit compared to working with heap arrays.
```Java
// Write to direct buffer
ByteBuffer directBuffer = ByteBuffer.allocateDirect(msg.getSerializedSize());
ProtoSink directSink = ProtoSink.newDirectSink();
msg.writeTo(directSink.setOutput(directBuffer));
directBuffer.limit(directSink.getTotalBytesWritten());
// Read from direct buffer
ProtoSource directSource = ProtoSource.newDirectSource();
RootMessage result = RootMessage.parseFrom(directSource.setInput(directBuffer));
assertEquals(msg, result);
```
**JSON Source/Sink**
ProtoMessages also support reading from and writing to JSON as specified in the [proto3 mapping](https://developers.google.com/protocol-buffers/docs/proto3#json).
```Java
// Set some contents
RootMessage msg = RootMessage.newInstance();
msg.setText("👍 QuickBuffers \uD83D\uDC4D");
msg.getMutablePeopleList().next()
    .setId(0)
    .setName("First Name");
msg.getMutablePeopleList().next()
    .setId(1)
    .setName("Last Name");

// Print as prettified json
System.out.println(msg);
```
The default `toString` method for all messages returns prettified JSON. The above prints:
```text
{
  "text": "👍 QuickBuffers 👍",
  "peopleList": [{
    "id": 0,
    "name": "First Name"
  }, {
    "id": 1,
    "name": "Last Name"
  }]
}
```
More fine-grained control is exposed via the `JsonSink` and `JsonSource` interfaces.
```Java
// json options
JsonSink sink = JsonSink.newInstance()
    .setPrettyPrinting(false)
    .setWriteEnumsAsInts(false)
    .setPreserveProtoFieldNames(false);

// use ProtoMessage::writeTo or JsonSink::writeMessage to serialize the contents
msg.writeTo(sink.clear());
RepeatedByte bytes = sink.getBytes();

// use ProtoMessage::parseFrom or JsonSource::parseMessage to parse the contents
JsonMessage result = JsonSource.newInstance(bytes)
    .setIgnoreUnknownFields(true)
    .parseMessage(JsonMessage.getFactory());
```
Parts can be combined to convert an incoming protobuf stream to outgoing JSON and vice versa:
```java
msg.clearQuick()
    .mergeFrom(protoSource.setInput(input))
    .writeTo(jsonSink.clear());
```
The default implementation encodes the minimal representation accepted by the protobuf spec, i.e., floating point numbers do not append a trailing zero, and long integers are encoded without quotes. Alternative implementations based on GSON and Jackson can be found in the `quickbuf-compat` artifact.
Note that the built-in JsonSink has been optimized quite a bit, but the JsonSource is very barebones due to a lack of an internal use case for JSON decoding.
## Building from source
The project can be built with `mvn package` using JDK 8 through JDK 20.
`mvn clean package --projects protoc-gen-quickbuf,quickbuf-runtime -am` omits building the benchmarks.
Note that the `package` goal is always required; `mvn clean test` is not sufficient. This limitation comes from the plugin mechanism of `protoc`, which exchanges information with plugins via protobuf messages on `stdin` and `stdout`. Using `stdin` makes it comparatively easy to get schema information, but it also makes it quite difficult to set up unit tests and debug plugins during development. To enable standard tests, the `protoc-gen-request` module contains a tiny protoc plugin that stores the raw request from `stdin` in a file that can be loaded during testing and development of the actual generator plugin. This makes the `protoc-gen-quickbuf` module depend on the packaged output of the `protoc-gen-request` module.
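The idea behind a request-dumping plugin can be sketched with the JDK alone. This is not the source of the `protoc-gen-request` module. The class name, the `dump` helper, and the `request.bin` default file name are assumptions made for illustration.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.nio.file.StandardCopyOption;

// Sketch of a request-dumping protoc plugin; NOT the actual protoc-gen-request source.
final class RequestDumpPlugin {

    // Copies the raw, still-encoded request bytes into a file and returns the byte count.
    static long dump(InputStream request, Path target) {
        try {
            return Files.copy(request, target, StandardCopyOption.REPLACE_EXISTING);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        // protoc pipes the serialized request into the plugin's standard input.
        // "request.bin" is a hypothetical default file name.
        Path target = Paths.get(args.length > 0 ? args[0] : "request.bin");
        dump(System.in, target);
        // Nothing is written to standard output here. This sketch assumes that
        // protoc reads an empty output as an empty response without generated files.
    }
}
```

A test can then feed the stored bytes to the generator through a `ByteArrayInputStream` instead of launching `protoc`.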
## Detailed accessors for different types
All nested object types such as message or repeated fields have `getField()` and `getMutableField()` accessors. Both return the same internal storage object, but `getField()` should be considered read-only. Once a field is cleared, it should also no longer be modified.
### Primitive fields
All primitive values generate the same accessors and behavior as Protobuf-Java's `Builder` classes.
```proto
// .proto
message SimpleMessage {
    optional int32 primitive_value = 1;
}
```
```Java
// simplified generated code
public final class SimpleMessage {
    public SimpleMessage setPrimitiveValue(int value);
    public SimpleMessage clearPrimitiveValue();
    public boolean hasPrimitiveValue();
    public int getPrimitiveValue();

    private int primitiveValue;
}
```
### Message fields
Nested message types are allocated internally. The recommended way to set nested message content is by accessing the internal store with `getMutableNestedMessage()`. Setting content using `setNestedMessage(NestedMessage.newInstance())` copies the data, but does not change the internal reference.
```proto
// .proto
message NestedMessage {
    optional int32 primitive_value = 1;
}
message RootMessage {
    optional NestedMessage nested_message = 1;
}
```
```Java
// simplified generated code
public final class RootMessage {
    public RootMessage setNestedMessage(NestedMessage value); // copies contents to internal message
    public RootMessage clearNestedMessage(); // clears has bit as well as the backing object
    public boolean hasNestedMessage();
    public NestedMessage getNestedMessage(); // internal message -> treat as read-only
    public NestedMessage getMutableNestedMessage(); // internal message -> may be modified until has state is cleared

    private final NestedMessage nestedMessage = NestedMessage.newInstance();
}
```
```Java
RootMessage msg = RootMessage.newInstance();

// (1) setting nested values via 'set' (does a data copy!)
msg.setNestedMessage(NestedMessage.newInstance().setPrimitiveValue(0));

// (2) modify the internal store directly (recommended)
msg.getMutableNestedMessage().setPrimitiveValue(0);
```
### String fields
`String` types are internally stored as `Utf8String` that are lazily parsed and can be set with `CharSequence`. Since Java `String` objects are immutable, there are additional access methods to allow for decoding characters into a reusable `StringBuilder` instance, as well as for using a custom `Utf8Decoder` that can implement interning.
```proto
// .proto
message SimpleMessage {
    optional string optional_string = 2;
}
```
```Java
// simplified generated code
public final class SimpleMessage {
    public SimpleMessage setOptionalString(Utf8String value);
    public SimpleMessage setOptionalString(CharSequence value);
    public SimpleMessage clearOptionalString(); // sets length = 0
    public boolean hasOptionalString();
    public String getOptionalString(); // lazily converted string
    public Utf8String getOptionalStringBytes(); // internal representation -> treat as read-only
    public Utf8String getMutableOptionalStringBytes(); // internal representation -> may be modified until has state is cleared

    private final Utf8String optionalString = Utf8String.newEmptyInstance();
}
```
```Java
// Get characters
SimpleMessage msg = SimpleMessage.newInstance().setOptionalString("my-text");
StringBuilder chars = new StringBuilder();
msg.getOptionalStringBytes().getChars(chars); // chars now contains "my-text"
```
### Repeated fields
Repeated scalar fields work mostly the same as String fields, but the internal `array()` can be accessed directly if needed. Repeated messages and object types provide a `next()` method that adds one element and provides a mutable reference to it.
```proto
// .proto
message SimpleMessage {
    repeated double repeated_double = 42;
}
```
```Java
// simplified generated code
public final class SimpleMessage {
    public SimpleMessage addRepeatedDouble(double value); // adds one value
    public SimpleMessage addAllRepeatedDouble(double... values); // adds N values
    public SimpleMessage clearRepeatedDouble(); // sets length = 0
    public boolean hasRepeatedDouble();
    public RepeatedDouble getRepeatedDouble(); // internal store -> treat as read-only
    public RepeatedDouble getMutableRepeatedDouble(); // internal store -> may be modified

    private final RepeatedDouble repeatedDouble = RepeatedDouble.newEmptyInstance();
}
```
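The `next()` pattern for repeated messages is what keeps steady-state use free of allocations: clearing only resets the length, and previously created elements are handed out again. The sketch below is a hand-written illustration of that idea and not the library's `RepeatedMessage` implementation. `Point` and `ReusablePoints` are hypothetical names.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical element type used only for this sketch.
final class Point {
    double x;

    Point clear() { x = 0; return this; }
}

// Hand-written sketch of a reusable repeated store; NOT the QuickBuffers RepeatedMessage class.
final class ReusablePoints {
    private final List<Point> pool = new ArrayList<>();
    private int length = 0;

    // Adds one element and returns a mutable reference to it, reusing old instances.
    Point next() {
        if (length == pool.size()) {
            pool.add(new Point()); // allocates only while the store is still growing
        }
        return pool.get(length++).clear();
    }

    // Resets the length but keeps the allocated elements for later reuse.
    ReusablePoints clear() {
        length = 0;
        return this;
    }

    int length() { return length; }

    Point get(int index) {
        if (index < 0 || index >= length) {
            throw new IndexOutOfBoundsException("index " + index + ", length " + length);
        }
        return pool.get(index);
    }
}
```

After a warm-up pass that reaches the maximum element count, repeated `clear()` and `next()` cycles no longer create new objects.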
## Proguard configuration
No reflection is used, so none of the fields need to be preserved or special-cased. However, ProGuard may warn about missing methods when obfuscating against an older runtime. This is related to an intentional workaround, so the warnings can be disabled by adding the line below to the `proguard.conf` file. R8 should automatically pick it up from the bundled [config file](./runtime/src/main/resources/META-INF/proguard/quickbuf-runtime.pro).
```text
-dontwarn us.hebi.quickbuf.JdkMethods
```
## Acknowledgements
Many internals and large parts of the generated API are based on [Protobuf-Java](https://github.com/protocolbuffers/protobuf). The encoding of floating point numbers during JSON serialization is based on [Schubfach](https://github.com/c4f7fcce9cb06515/Schubfach/) [[Giu2020](https://drive.google.com/open?id=1luHhyQF9zKlM8yJ1nebU0OgVYhfC6CBN)]. Many other JSON parts were inspired by [dsl-json](https://github.com/ngs-doo/dsl-json), [jsoniter](https://jsoniter.com/), and [jsoniter-scala](https://github.com/plokhotnyuk/jsoniter-scala).