https://github.com/trumank/ser-hex
Serialization tracing and visualization tools
- Host: GitHub
- URL: https://github.com/trumank/ser-hex
- Owner: trumank
- License: mit
- Created: 2024-01-25T01:54:51.000Z (about 2 years ago)
- Default Branch: master
- Last Pushed: 2025-06-12T04:03:05.000Z (10 months ago)
- Last Synced: 2025-06-12T05:23:28.459Z (10 months ago)
- Topics: instrumentation, reverse-engineering, serialization, tracing
- Language: Rust
- Homepage:
- Size: 331 KB
- Stars: 2
- Watchers: 2
- Forks: 0
- Open Issues: 0
Metadata Files:
- Readme: README.md
- License: LICENSE
# ser-hex
Serialization tracing and visualization tools.
Attempts to answer the question "where did these bytes come from??" when
examining opaque binary blobs of data.
## ser-hex-tui
```console
cargo run --release --bin ser-hex-tui examples/bson/trace.json
```

## trace format
The trace output contains the binary data and a tree of stream actions (Read/Seek/Span).
```json
{
  "data": "DAAAAEhlbGxvIFdvcmxkIQ==",
  "start_index": 0,
  "root": {
    "Span": {
      "name": "pascal_string",
      "actions": [
        {
          "Span": {
            "name": "length",
            "actions": [
              {
                "Read": 4
              }
            ]
          }
        },
        {
          "Read": 12
        }
      ]
    }
  }
}
```
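To make the tree structure concrete, here is a minimal sketch of how the byte range of each span could be recovered from the example above. The `Action` enum and `span_ranges` function are hypothetical and are not the types used by ser-hex itself. The sketch assumes each `Read` consumes its bytes sequentially starting at `start_index`, and it leaves out `Seek` because its payload is not shown here. The `data` field is written out as already-decoded bytes to avoid a base64 dependency.

```rust
/// Hypothetical, simplified model of the trace tree (Seek omitted).
enum Action {
    Read(usize),
    Span { name: &'static str, actions: Vec<Action> },
}

/// (depth, span name, start offset, end offset)
type Range = (usize, &'static str, usize, usize);

/// Walk the tree depth-first, advancing `pos` on every read, and record
/// the byte range covered by each span.
fn span_ranges(action: &Action, pos: &mut usize, depth: usize, out: &mut Vec<Range>) {
    match action {
        Action::Read(n) => *pos += n,
        Action::Span { name, actions } => {
            let slot = out.len();
            out.push((depth, *name, *pos, *pos));
            for child in actions {
                span_ranges(child, pos, depth + 1, out);
            }
            // The span ends wherever its last child left the cursor.
            out[slot].3 = *pos;
        }
    }
}

/// The tree from the JSON example above, built by hand.
fn example_ranges() -> Vec<Range> {
    let root = Action::Span {
        name: "pascal_string",
        actions: vec![
            Action::Span { name: "length", actions: vec![Action::Read(4)] },
            Action::Read(12),
        ],
    };
    let mut pos = 0; // "start_index" in the trace
    let mut out = Vec::new();
    span_ranges(&root, &mut pos, 0, &mut out);
    out
}

fn main() {
    // "DAAAAEhlbGxvIFdvcmxkIQ==" decoded: a 4-byte little-endian length, then the text.
    let mut data = vec![12u8, 0, 0, 0];
    data.extend_from_slice(b"Hello World!");
    for (depth, name, start, end) in example_ranges() {
        println!("{}{} {}..{} {:?}", "  ".repeat(depth), name, start, end, &data[start..end]);
    }
}
```

For the example trace, this reports `pascal_string` covering bytes 0..16 and the nested `length` span covering bytes 0..4.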
## capturing a trace
There are two methods of capturing trace data from Rust:
### rust tracing instrumentation
Builds spans by listening to `tracing::instrument`'d functions. This results
in accurately nested spans but requires manual annotation of functions.
```rust
use std::io::{Cursor, Read};

let mut input = Cursor::new([1, 2, 3]);
ser_hex::read_incremental("trace.json", &mut input, read)?;

// Each instrumented function shows up as a span in the trace.
#[tracing::instrument(skip_all)]
fn read<R: Read>(input: &mut R) -> std::io::Result<()> {
    input.read_exact(&mut [0; 3])?;
    Ok(())
}
```
### backtrace captures
The second option is to construct spans from backtrace captures. This does not
require sprinkling instrumentation annotations all over the serialization code,
but can result in lower quality trace data. Since backtraces are captured only
on read/seek events, it's impossible to know how far up the stack control flow
went between reads, which can lead to inaccurately reconstructed spans.
In theory, if stack frame push/pop events could be hooked at the hardware
level or via [emulation](https://www.unicorn-engine.org/), the reconstructed
spans could be made accurate, but I have not explored this route yet. Even with
this limitation, the approach provides useful data with little effort.
```rust
use std::io::{Cursor, Read};

let mut input = Cursor::new([1, 2, 3]);
let mut tracer = ser_hex_tracer::TracerReader::new_options(
    &mut input,
    ser_hex_tracer::TracerOptions { skip_frames: 3 }, // number of top level stack frames to omit from trace
);
let res = read(&mut tracer);
tracer.trace().save("trace.json").unwrap();

// No annotation needed: spans are derived from the backtrace at each read.
fn read<R: Read>(input: &mut R) -> std::io::Result<()> {
    input.read_exact(&mut [0; 3])?;
    Ok(())
}
```
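The accuracy limitation described above can be illustrated with a small sketch. This is not the algorithm used by `ser_hex_tracer`. It is a hypothetical reconstruction (`Event`, `Node`, `rebuild`, and `close_to` are made-up names) that assumes spans are rebuilt by comparing each backtrace with the previous one and keeping the shared prefix of frames open.

```rust
/// One read event: the call stack (outermost frame first) and bytes read.
struct Event {
    stack: Vec<&'static str>,
    len: usize,
}

#[derive(Debug, PartialEq)]
enum Node {
    Read(usize),
    Span { name: &'static str, children: Vec<Node> },
}

type Open = Vec<(&'static str, Vec<Node>)>;

/// Close open spans until only `depth` entries remain, attaching each
/// closed span to its parent.
fn close_to(open: &mut Open, depth: usize) {
    while open.len() > depth {
        let (name, children) = open.pop().unwrap();
        open.last_mut().unwrap().1.push(Node::Span { name, children });
    }
}

/// Rebuild a span tree from backtraces alone: frames shared with the
/// previous backtrace stay open, all others are closed and reopened.
fn rebuild(events: &[Event]) -> Vec<Node> {
    // open[0] is a synthetic root; open[i + 1] corresponds to stack frame i.
    let mut open: Open = vec![("root", Vec::new())];
    for ev in events {
        let common = open[1..]
            .iter()
            .zip(&ev.stack)
            .take_while(|(o, f)| o.0 == **f)
            .count();
        close_to(&mut open, common + 1);
        for f in &ev.stack[common..] {
            open.push((*f, Vec::new()));
        }
        open.last_mut().unwrap().1.push(Node::Read(ev.len));
    }
    close_to(&mut open, 1);
    open.pop().unwrap().1
}

fn main() {
    // `parse` calls `read_u32` twice; each call performs one 4-byte read.
    let events = vec![
        Event { stack: vec!["parse", "read_u32"], len: 4 },
        Event { stack: vec!["parse", "read_u32"], len: 4 },
    ];
    println!("{:?}", rebuild(&events));
}
```

Both backtraces are identical, so the two separate `read_u32` calls collapse into a single `read_u32` span containing two reads. Nothing in the captured data shows that control returned to `parse` in between.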
### tracing other streams or non-rust code
It is possible to trace arbitrary native code by hooking the necessary stream
functions and calling the corresponding functions on your `Tracer` object. See
[trace_factorio](examples/trace_factorio) and [trace_drg](examples/trace_drg)
for examples of tracing data streams implemented in C++.