https://github.com/cboudereau/dataseries
Functions for dataseries / timeseries
- Host: GitHub
- URL: https://github.com/cboudereau/dataseries
- Owner: cboudereau
- License: mit
- Created: 2023-07-21T12:34:58.000Z (over 1 year ago)
- Default Branch: main
- Last Pushed: 2024-09-16T09:57:48.000Z (4 months ago)
- Last Synced: 2024-09-16T11:13:15.625Z (4 months ago)
- Topics: data-series, dataseries, java, rust, time-series, timeseries, union
- Language: Java
- Homepage:
- Size: 197 KB
- Stars: 0
- Watchers: 1
- Forks: 0
- Open Issues: 3
Metadata Files:
- Readme: README.md
- License: LICENSE
# dataseries
[![License:MIT](https://img.shields.io/badge/License-MIT-yellow.svg)](https://opensource.org/licenses/MIT)
Functions supporting data-series and time-series.
## rust
[![build](https://github.com/cboudereau/dataseries/workflows/build-rs/badge.svg?branch=main&event=push)](https://github.com/cboudereau/dataseries/actions/workflows/build-rs.yml?query=event%3Apush+branch%3Amain)
[![codecov](https://codecov.io/gh/cboudereau/dataseries/branch/main/graph/badge.svg?token=UFSTKQG9FY&flag=rust)](https://app.codecov.io/gh/cboudereau/dataseries/tree/main/rust)
[![docs.rs](https://docs.rs/dataseries/badge.svg)](https://docs.rs/dataseries)
[![crates.io](https://img.shields.io/crates/v/dataseries.svg)](https://crates.io/crates/dataseries)
[![crates.io (recent)](https://img.shields.io/crates/dr/dataseries)](https://crates.io/crates/dataseries)
## java
[![build](https://github.com/cboudereau/dataseries/workflows/build-java/badge.svg?branch=main&event=push)](https://github.com/cboudereau/dataseries/actions/workflows/build-java.yml?query=event%3Apush+branch%3Amain)
[![codecov](https://codecov.io/gh/cboudereau/dataseries/branch/main/graph/badge.svg?token=UFSTKQG9FY&flag=java)](https://app.codecov.io/gh/cboudereau/dataseries/tree/main/java)
[![maven central](https://img.shields.io/maven-central/v/io.github.cboudereau.dataseries/dataseries.svg)](https://search.maven.org/artifact/io.github.cboudereau.dataseries/dataseries/)
[![javadoc](https://www.javadoc.io/badge/io.github.cboudereau.dataseries/dataseries.svg)](https://www.javadoc.io/doc/io.github.cboudereau.dataseries/dataseries)
## functions
### union
Continuous union of two time series. Data can be absent on either side (left-only and right-only cases).
```
          1     3     10                 20
Left:     |-----|-----|------------------|-
            130   120          95          160
                           12     15
Right:                     |------|--------
                             105     110
          1     3     10   12     15     20
Expected: |-----|-----|----|------|------|-
          130,∅ 120,∅ 95,∅ 95,105 95,110 160,110
```
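
The diagram above can be reproduced with a small merge over two sorted series. The following is a minimal sketch, not the crate's actual API: `DataPoint` and this `union` signature are hypothetical names introduced for illustration, points are plain `u32` values, and an absent side is modelled as `None` (the ∅ in the diagram).

```rust
/// A datapoint: `value` holds from `point` until the next datapoint.
/// (Hypothetical type for this sketch, not the crate's API.)
#[derive(Debug, Clone, PartialEq)]
struct DataPoint<V> {
    point: u32,
    value: V,
}

/// Union of two series sorted by `point`. Each output datapoint carries the
/// value in effect on each side, or `None` where that side has no data yet.
fn union<L: Clone, R: Clone>(
    left: &[DataPoint<L>],
    right: &[DataPoint<R>],
) -> Vec<DataPoint<(Option<L>, Option<R>)>> {
    let (mut i, mut j) = (0, 0);
    let (mut l, mut r): (Option<L>, Option<R>) = (None, None);
    let mut out = Vec::new();
    while i < left.len() || j < right.len() {
        // The next boundary is the smallest unread point on either side.
        let point = match (left.get(i), right.get(j)) {
            (Some(a), Some(b)) => a.point.min(b.point),
            (Some(a), None) => a.point,
            (None, Some(b)) => b.point,
            (None, None) => break,
        };
        // Advance every side that has a datapoint exactly at this boundary.
        if let Some(a) = left.get(i).filter(|a| a.point == point) {
            l = Some(a.value.clone());
            i += 1;
        }
        if let Some(b) = right.get(j).filter(|b| b.point == point) {
            r = Some(b.value.clone());
            j += 1;
        }
        out.push(DataPoint { point, value: (l.clone(), r.clone()) });
    }
    out
}
```

Called with the `Left` and `Right` series from the diagram, this sketch yields the six datapoints of the `Expected` row, from `130,∅` at point 1 to `160,110` at point 20.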
## eventual consistency and conflict resolution for data-series
The ```crdt``` example shows [conflict-free replicated data type](https://en.wikipedia.org/wiki/Conflict-free_replicated_data_type) resolution built on the data-series ```union``` function and the ``VersionedValue`` type, which resolves conflicts with a timestamp (or any value forming a [partially ordered set](https://en.wikipedia.org/wiki/Partially_ordered_set)). It is provided for rust and java.
## trade-offs
### interval representation
```
       Half-open interval                 Data-series (time-series/gauge)
  (n value and 2n points/delta)                  (n+1 datapoints)
                                   \
         1    3    5                \         1    3    5         +∞
         [----[----[                 \        |----|----|------------
          100  120                   /         100  120       ∅
                                    /
                                   /
```
||pros|cons|
|-|-|-|
|half-open interval|+same read and write model|-illegal states can be represented<br>-requires a global secondary index to support range queries|
|data-series/time-series|+nosql and TSDB friendly<br>+fewer illegal states<br>+compatibility with time/data-series functions and visualization<br>+compact format (requires only n+1 datapoints instead of 2n)<br>+partitioning is trivial (only one dimension)<br>+updates are less complex (no need to update impacted points of interval)|-read and write models are different<br>-a hole must be represented by a datapoint with a ```None``` value|

An interval can be defined by two points (lower and upper bound) with an associated value, but indexing both points is difficult in nosql databases (it requires a global secondary index) and in a TSDB (time-series database).
Another approach is to define an intermediate model: a data-series in which each datapoint holds a single point and a single value. Such datapoints fit well in a TSDB and are algorithm friendly.
It also becomes easy to avoid unwanted states: an interval defined by two points allows the last point to come before the first one, which is a bug in the domain. Defining a point and a non-negative offset also works, but requires more code.
With a database, reading an interval requires two reads to compute it, but the extra read can easily be hidden inside an Iterator.
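
The extra read mentioned above can be sketched as an iterator adapter that peeks at the next datapoint to find the upper bound of each interval. This is an illustrative sketch under assumptions: `DataPoint`, `Interval` and `intervals` are hypothetical names, not the library's API, and a hole is modelled as a datapoint whose value is `None`.

```rust
/// A datapoint: `value` holds from `point` until the next datapoint.
/// (Hypothetical type for this sketch, not the crate's API.)
#[derive(Debug, Clone, PartialEq)]
struct DataPoint<V> {
    point: u32,
    value: V,
}

/// A half-open interval `[start, end)`; `end` is `None` when unbounded.
#[derive(Debug, Clone, PartialEq)]
struct Interval<V> {
    start: u32,
    end: Option<u32>,
    value: V,
}

/// Adapts datapoints (sorted by `point`) into half-open intervals.
/// The second read needed for each upper bound is hidden behind `peek`.
fn intervals<V, I>(points: I) -> impl Iterator<Item = Interval<V>>
where
    I: Iterator<Item = DataPoint<Option<V>>>,
{
    let mut points = points.peekable();
    std::iter::from_fn(move || {
        while let Some(current) = points.next() {
            // Upper bound is the next datapoint's position, if there is one.
            let end = points.peek().map(|next| next.point);
            // A hole (`None`) produces no interval; keep scanning.
            if let Some(value) = current.value {
                return Some(Interval { start: current.point, end, value });
            }
        }
        None
    })
}
```

Applied to the data-series of the diagram (100 at 1, 120 at 3, a hole at 5), the adapter produces the two half-open intervals `[1, 3)` and `[3, 5)`. The caller only sees intervals, while the storage keeps one point per datapoint.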
## implementation
rust and java implementations are provided in their respective directories.