https://github.com/bieli/time-series-data-packer-rs
Time series data packer written in Rust language for data intensive IoT and IIoT projects.
https://github.com/bieli/time-series-data-packer-rs
data-engineering iot measurements optimization packer rust time-series
Last synced: 4 months ago
JSON representation
Time series data packer written in Rust language for data intensive IoT and IIoT projects.
- Host: GitHub
- URL: https://github.com/bieli/time-series-data-packer-rs
- Owner: bieli
- License: mit
- Created: 2025-11-20T22:33:50.000Z (7 months ago)
- Default Branch: main
- Last Pushed: 2025-12-23T14:13:01.000Z (6 months ago)
- Last Synced: 2026-03-04T13:13:43.928Z (4 months ago)
- Topics: data-engineering, iot, measurements, optimization, packer, rust, time-series
- Language: Rust
- Homepage:
- Size: 79.1 KB
- Stars: 0
- Watchers: 0
- Forks: 0
- Open Issues: 0
-
Metadata Files:
- Readme: README.md
- License: LICENSE
Awesome Lists containing this project
README
# time-series-data-packer-rs


[](https://crates.io/crates/time-series-data-packer-rs)
Time series data packer written in Rust language for data intensive IoT and IIoT projects.
## Motivations
In a lot of my IoT projects, I have a pressure on storage size after time series data collection with miliseconds precission.
Yes, I've been using Open Sourece great data warehouses, time-series dedicated databases and engines/databases do a lot of work for me, but one day I decided to find better - my own - way to time series data compressions.
This is an experimental project with saving storage size for time series data connected to specific domains (not random data for sure).
## API definitions
- structs:
- `TSPackStrategyType`
- `TSPackSimilarValuesStrategy` - similar values based simple methodology (repeatition of the same values will be packed - default strategy)
- `TSPackMeanStrategy(values_compression_percent: u8)` - mean value based simple methodoloty (for first iteration). `values_compression_percent` parameter value explanations: if we have in time series values i.e. 100, 100, 102, 98, 100, 99, to pack those sieries of values to 100, we need to set this parameter to `5`. Means, we have avg. from series and we try to find, if values are in -5 to 5 range based on avg. value as a reference on values data window.
- `TSPackXorStrategy`
- Packing: First value stored raw. Each subsequent value is XOR’d with the previous value’s bit pattern. The XOR result is stored as an f64 (lossless since we reinterpret bits).
- Unpacking: First value read raw. Each subsequent XOR is applied to reconstruct the original bits.
- `TSPackAttributes`
- `strategy_types: Vec` - what method of compression we would like to use
- `microseconds_time_window: u64` - time window to apply packing strategies in microseconds resolution (sometimes seconds means we have a 100k + similar measurements, so it's good to define more real limits for bare metal and sensor specific criteria for particular data domains)
- `precision_epsilon: f64` - dedicated precision for packing strategies optimization (this is a realistic limits for real/float numbers, when sometimes you no need high-accuracy, but more minimize storage usage strategy)
- `TSSamples`
- f64 - timestamp in seconds
- f64 - real value
- `TSPackedSamples`
- (f64, f64) - timestamps ranges in seconds
- f64 - i.e using mean values strategy (based on `values_compression_percent` parameter setting for `pack` function - in case of used `TSPackMeanStrategy`)
- `TimeSeriesDataPacker` - object with methods:
- `fn pack(samples: Vec, attributes: TSPackAttributes) -> Vec` - packer functon
- `fn unpack( -> Vec) -> (TSPackAttributes, Vec)` - unpacker functon
### API usage example
```rust
use time_series_data_packer_rs::*;
fn main() -> Result<(), Box> {
let samples: Vec = vec![
(0.0, 100.0),
(0.001, 100.0),
(0.002, 102.0),
(0.003, 98.0),
(0.004, 100.0),
(0.005, 99.0),
];
let attrs = TSPackAttributes {
strategy_types: vec![
TSPackStrategyType::TSPackMeanStrategy { values_compression_percent: 5 },
],
microseconds_time_window: 1_000, // 1 ms
};
let mut packer = TimeSeriesDataPacker::new();
let packed = packer.pack(samples.clone(), attrs)?;
println!("Packed: {:?}", packed);
let (_attrs, original) = packer.unpack();
println!("Original recovered: {:?}", original);
println!("Samples == Original recovered: {:?}", samples == original);
Ok(())
}
```
#### Results from API usage example after run
```bash
$ cargo run
Packed: [((0.0, 0.003), 100.0), ((0.004, 0.005), 99.5)]
Original recovered: [(0.0, 100.0), (0.001, 100.0), (0.002, 102.0), (0.003, 98.0), (0.004, 100.0), (0.005, 99.0)]
Samples == Original recovered: true
```
## TODO list
- [X] CI
- [x] crates package distribution
- [ ] I want to have visual UI, to create signals patterns and use it as input to easy wrinting unit tests
- [ ] prepare great real tests examples from various group of IoT sensors
- [ ] compare random sequences compressions rates with real data from IoT sensors
- [ ] add Python interface based on PyO3 library
- [ ] public Python package in TEST and official python packages for inc. popularity
- [ ] measure resources (RAM, CPU, IO) required to pack and unpack data with diffirent time ranges
- [ ] think about packed data buckets concept by time domain: minutes, hours, daily, weekly, monthly (kind of specific packed measurements partitions, becouse in time-series data analytics in IoT there are a very specific situations, when you need to select all historical data, usually you points of interests are selected by known short time ranges)
- [ ] think about lossless data packing algo.