https://github.com/paradedb/decimal-bytes

Arbitrary precision decimals with lexicographically sortable byte encoding
encoding numeric postgresql rust search
Last synced: 5 months ago
Host: GitHub
URL: https://github.com/paradedb/decimal-bytes
Owner: paradedb
License: mit
Created: 2026-01-23T20:59:56.000Z (6 months ago)
Default Branch: main
Last Pushed: 2026-02-03T03:39:46.000Z (5 months ago)
Last Synced: 2026-02-03T10:47:08.780Z (5 months ago)
Topics: encoding, numeric, postgresql, rust, search
Language: Rust
Homepage: https://crates.io/crates/decimal-bytes
Size: 168 KB
Stars: 7
Watchers: 0
Forks: 0
Open Issues: 0
Metadata Files:
- Readme: README.md
- License: LICENSE
README

# decimal-bytes

[![Crates.io](https://img.shields.io/crates/v/decimal-bytes.svg)](https://crates.io/crates/decimal-bytes)

[![codecov](https://codecov.io/gh/paradedb/decimal-bytes/graph/badge.svg)](https://codecov.io/gh/paradedb/decimal-bytes)

[![CI](https://github.com/paradedb/decimal-bytes/actions/workflows/ci.yml/badge.svg)](https://github.com/paradedb/decimal-bytes/actions/workflows/ci.yml)

[![Documentation](https://docs.rs/decimal-bytes/badge.svg)](https://docs.rs/decimal-bytes)

[![License: MIT](https://img.shields.io/badge/License-MIT-blue.svg)](https://opensource.org/licenses/MIT)

Arbitrary precision decimals with lexicographically sortable byte encoding.

## Overview

This crate provides three decimal types optimized for database storage:

- **`Decimal`**: Variable-length arbitrary precision (up to 131,072 digits)

- **`Decimal64`**: Fixed 8-byte representation with embedded scale (precision ≤ 16 digits)

- **`Decimal64NoScale`**: Fixed 8-byte representation with external scale (precision ≤ 18 digits)

All types support PostgreSQL special values (NaN, ±Infinity) with correct sort ordering.

**Why not use `rust_decimal` or `bigdecimal`?** Those libraries are excellent for arithmetic, but their byte representations are not lexicographically sortable. You cannot compare their serialized bytes to determine numerical order - you must deserialize first. `decimal-bytes` solves this by providing a byte encoding where `bytes(a) < bytes(b)` if and only if `a < b` numerically.
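
To make the sortability property concrete, here is a minimal sketch for plain `i64` values. It uses the classic sign-bit flip with big-endian output, which is an illustrative stand-in and not necessarily the encoding this crate uses; the helper name `sortable_bytes` is hypothetical.

```rust
/// Encode an i64 so that byte-wise comparison matches numeric comparison.
/// Illustrative only: a sign-bit flip, not the crate's actual format.
fn sortable_bytes(v: i64) -> [u8; 8] {
    // Flipping the sign bit maps i64::MIN..=i64::MAX onto 0..=u64::MAX in order;
    // big-endian puts the most significant byte first for lexicographic comparison.
    ((v as u64) ^ (1u64 << 63)).to_be_bytes()
}

fn main() {
    let (a, b) = (-5i64, 3i64);

    // Plain two's-complement big-endian bytes sort negatives after positives.
    assert!(a.to_be_bytes() > b.to_be_bytes());

    // The sign-flipped encoding restores numeric order.
    assert!(sortable_bytes(a) < sortable_bytes(b));
    println!("{:02x?} < {:02x?}", sortable_bytes(a), sortable_bytes(b));
}
```

A decimal encoding has to extend the same idea to sign, exponent, and mantissa, which is the part a general-purpose arithmetic library has no reason to provide.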

## When to Use Which

| Type | Precision | Scale | Storage | Best For |
|------|-----------|-------|---------|----------|
| `Decimal64NoScale` | ≤ **18** digits | External | 8 bytes | **Columnar storage, aggregates** |
| `Decimal64` | ≤ 16 digits | Embedded | 8 bytes | Self-contained values |
| `Decimal` | Arbitrary (up to 131,072 digits) | Arbitrary (up to 16,383 digits) | Variable | Scientific, very large numbers |

## Features

- **Three storage options**: Fixed 8-byte (`Decimal64`, `Decimal64NoScale`) or variable-length (`Decimal`)

- **Columnar-friendly**: `Decimal64NoScale` enables correct aggregates with external scale

- **Lexicographic ordering**: Byte comparison matches numerical comparison

- **PostgreSQL NUMERIC compatibility**: Full support for precision, scale (including negative), and special values

- **Special values**: Infinity, -Infinity, and NaN with correct PostgreSQL sort order

## Decimal64 Usage

For most financial and business applications where precision ≤ 16 digits:

```rust
use decimal_bytes::Decimal64;

// Create with scale
let price = Decimal64::new("99.99", 2).unwrap();
assert_eq!(price.to_string(), "99.99");
assert_eq!(price.scale(), 2);

// Parse with automatic scale detection
let d: Decimal64 = "123.456".parse().unwrap();
assert_eq!(d.scale(), 3);

// Access raw components
let value = price.value();  // 9999 (scaled integer)
let scale = price.scale();  // 2

// Special values (PostgreSQL compatible)
let inf = Decimal64::infinity();
let neg_inf = Decimal64::neg_infinity();
let nan = Decimal64::nan();

// Correct sort order: -Infinity < numbers < +Infinity < NaN
assert!(neg_inf < price);
assert!(price < inf);
assert!(inf < nan);

// NaN equals NaN (PostgreSQL semantics)
assert_eq!(nan, Decimal64::nan());
```

### Decimal64 with Precision and Scale (PostgreSQL NUMERIC)

`Decimal64` fully supports PostgreSQL's `NUMERIC(precision, scale)` semantics:

```rust
use decimal_bytes::Decimal64;

// NUMERIC(5, 2) - up to 5 digits total, 2 after decimal
let d = Decimal64::with_precision_scale("123.456", Some(5), Some(2)).unwrap();
assert_eq!(d.to_string(), "123.46"); // Rounded to 2 decimal places

// Precision overflow - the leftmost digits are dropped
// (PostgreSQL itself raises "numeric field overflow" for this input)
let d = Decimal64::with_precision_scale("12345.67", Some(5), Some(2)).unwrap();
assert_eq!(d.to_string(), "345.67"); // Keeps rightmost 5 digits

// NUMERIC(2, -3) - negative scale rounds to powers of 10
let d = Decimal64::with_precision_scale("12345", Some(2), Some(-3)).unwrap();
assert_eq!(d.to_string(), "12000"); // Rounded to nearest 1000
```

### Decimal64 Storage Layout

```text
64-bit packed representation:
┌──────────────────┬─────────────────────────────────────────────────────┐
│ Scale (8 bits)   │ Value (56 bits, signed)                             │
│ Byte 0           │ Bytes 1-7                                           │
└──────────────────┴─────────────────────────────────────────────────────┘
```

- **Scale byte**: 0-18 for normal values, 253/254/255 for -Infinity/+Infinity/NaN

- **Value**: 56-bit signed integer (-2^55 to 2^55-1, ~16 significant digits)
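
The layout above can be sketched with plain bit operations. This is a minimal illustration that assumes the scale sits in the most significant byte and the value is stored as 56-bit two's complement; the helpers `pack` and `unpack` are hypothetical, and the crate's real encoding may treat the sign differently to keep bytes sortable.

```rust
const VALUE_BITS: u32 = 56;
const VALUE_MASK: u64 = (1u64 << VALUE_BITS) - 1;

/// Pack a scale byte and a 56-bit signed value into one u64 (hypothetical helper).
fn pack(scale: u8, value: i64) -> u64 {
    ((scale as u64) << VALUE_BITS) | ((value as u64) & VALUE_MASK)
}

/// Split a packed word back into (scale, value), sign-extending the low 56 bits.
fn unpack(packed: u64) -> (u8, i64) {
    let scale = (packed >> VALUE_BITS) as u8;
    // Move the value to the top of the word, then arithmetic-shift it back down.
    let value = ((packed << 8) as i64) >> 8;
    (scale, value)
}

fn main() {
    let packed = pack(2, 9999); // 99.99 at scale 2
    assert_eq!(unpack(packed), (2, 9999));
    assert_eq!(unpack(pack(2, -9999)), (2, -9999));
    println!("{packed:#018x}");
}
```

The sign extension in `unpack` is the masking cost that a scale-free representation avoids.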

### Decimal64 Benefits

- **Fixed 8 bytes**: Predictable storage, no heap allocation, cache-friendly

- **PostgreSQL compatible**: Full NUMERIC(p,s) semantics including NaN, ±Infinity

- **Fast operations**: Single i64 comparison and serialization

## Decimal64NoScale Usage (Recommended for Columnar Storage)

`Decimal64NoScale` stores the raw scaled value without embedding the scale, enabling:

- **18 digits of precision** (vs 16 for Decimal64)

- **Correct aggregates** (SUM, MIN, MAX work directly on raw i64 values)

- **Columnar storage compatibility** (scale stored once in schema metadata)

```rust
use decimal_bytes::Decimal64NoScale;

// Scale is provided externally (e.g., from schema metadata)
let scale = 2;
let a = Decimal64NoScale::new("100.50", scale).unwrap();
let b = Decimal64NoScale::new("200.25", scale).unwrap();

// Raw values can be summed directly!
let sum = a.value() + b.value();  // 30075
assert_eq!(sum, 30075);

// Interpret result with scale
let result = Decimal64NoScale::from_raw(sum);
assert_eq!(result.to_string_with_scale(scale), "300.75");

// 18 digits supported (more than Decimal64's 16)
let big = Decimal64NoScale::new("123456789012345678", 0).unwrap();
assert_eq!(big.value(), 123456789012345678);
```

### Why Decimal64NoScale for Aggregates?

`Decimal64` embeds scale in the i64, which **corrupts aggregate results**:

```text
Decimal64:        packed = (scale << 56) | mantissa
                  SUM(a, b) = adds scale bits → WRONG!

Decimal64NoScale: stored = value * 10^scale
                  SUM(a, b) = (a + b) * 10^scale → divide by 10^scale → CORRECT!
```
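
The corruption is easy to reproduce with plain integers. The sketch below assumes the packed layout from the diagram above (scale in the top byte); `pack` is a hypothetical helper written for this illustration, not a function of the crate.

```rust
/// Hypothetical packed layout from the diagram above: scale in the top byte.
fn pack(scale: u8, value: i64) -> i64 {
    (((scale as u64) << 56) | ((value as u64) & ((1u64 << 56) - 1))) as i64
}

fn main() {
    // 100.50 and 200.25 at scale 2, as scaled integers.
    let (a, b) = (10050i64, 20025i64);

    // Raw scaled integers: the sum is itself a valid scaled integer (300.75).
    let raw_sum = a + b;
    assert_eq!(raw_sum, 30075);

    // Packed words: the scale bytes add too, so the top byte becomes 4, not 2.
    let packed_sum = pack(2, a) + pack(2, b);
    assert_eq!(((packed_sum as u64) >> 56) as u8, 4);
    assert_ne!(packed_sum, pack(2, raw_sum));
}
```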

### Decimal64NoScale Storage Layout

```text
64-bit representation:
┌─────────────────────────────────────────────────────────────────┐
│ Value (64 bits, signed) - represents value * 10^scale           │
└─────────────────────────────────────────────────────────────────┘
```

- **Value**: Full 64-bit signed integer (±9.99×10^17, ~18 significant digits)

- **Scale**: Stored externally (e.g., in database schema)

- **Special values**: `i64::MIN` (NaN), `i64::MIN+1` (-Infinity), `i64::MAX` (+Infinity)
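
Because the scale lives outside the value, rendering needs both pieces. As a rough sketch of what a method like `to_string_with_scale` has to do, the hypothetical helper below formats a raw `i64` with a caller-supplied scale. It assumes a non-negative scale of at most 18 and ignores the special-value sentinels.

```rust
/// Render a raw scaled integer with an externally supplied scale.
/// Hypothetical helper: scale must be 0..=18, special values are not handled.
fn format_with_scale(raw: i64, scale: u32) -> String {
    if scale == 0 {
        return raw.to_string();
    }
    let sign = if raw < 0 { "-" } else { "" };
    // unsigned_abs avoids overflow on i64::MIN.
    let digits = raw.unsigned_abs();
    let pow = 10u64.pow(scale);
    let (int_part, frac_part) = (digits / pow, digits % pow);
    // Zero-pad the fraction so 5 at scale 2 prints as "05".
    format!("{sign}{int_part}.{frac_part:0width$}", width = scale as usize)
}

fn main() {
    assert_eq!(format_with_scale(30075, 2), "300.75");
    assert_eq!(format_with_scale(-5, 2), "-0.05");
    println!("{}", format_with_scale(10050, 2));
}
```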

## Decimal Usage (Arbitrary Precision)

```rust
use std::str::FromStr; // for Decimal::from_str (assuming the standard FromStr trait)

use decimal_bytes::Decimal;

// Create decimals from strings
let a = Decimal::from_str("123.456").unwrap();
let b = Decimal::from_str("123.457").unwrap();

// Byte comparison matches numerical comparison
assert!(a.as_bytes() < b.as_bytes());
assert!(a < b);

// With precision and scale constraints (SQL NUMERIC semantics)
let d = Decimal::with_precision_scale("123.456", Some(10), Some(2)).unwrap();
assert_eq!(d.to_string(), "123.46"); // Rounded to 2 decimal places

// Negative scale (rounds to left of decimal point)
let d = Decimal::with_precision_scale("12345", Some(10), Some(-3)).unwrap();
assert_eq!(d.to_string(), "12000"); // Rounded to nearest 1000

// Efficient byte access (primary representation)
let bytes: &[u8] = d.as_bytes();

// Reconstruct from bytes
let restored = Decimal::from_bytes(bytes).unwrap();
assert_eq!(d, restored);
```

## Special Values

PostgreSQL-compatible special values with correct sort ordering:

```rust
use std::str::FromStr; // for Decimal::from_str (assuming the standard FromStr trait)

use decimal_bytes::Decimal;

// Create special values
let pos_inf = Decimal::infinity();
let neg_inf = Decimal::neg_infinity();
let nan = Decimal::nan();

// Or parse from strings (case-insensitive)
let inf = Decimal::from_str("Infinity").unwrap();
let inf = Decimal::from_str("inf").unwrap();
let nan = Decimal::from_str("NaN").unwrap();

// Check for special values
assert!(pos_inf.is_infinity());
assert!(pos_inf.is_pos_infinity());
assert!(neg_inf.is_neg_infinity());
assert!(nan.is_nan());
assert!(!pos_inf.is_finite());

// Sort order: -Infinity < negatives < zero < positives < Infinity < NaN
assert!(neg_inf < Decimal::from_str("-1000000").unwrap());
assert!(Decimal::from_str("1000000").unwrap() < pos_inf);
assert!(pos_inf < nan);
```

### PostgreSQL vs IEEE 754 Semantics

This library follows **PostgreSQL semantics** for special values, which differ from IEEE 754 floating-point:

| Behavior | PostgreSQL / decimal-bytes | IEEE 754 float |
|----------|---------------------------|----------------|
| `NaN == NaN` | `true` | `false` |
| `NaN` ordering | Greatest value (> Infinity) | Unordered |
| `Infinity == Infinity` | `true` | `true` |

```rust
use decimal_bytes::Decimal;

let nan1 = Decimal::nan();
let nan2 = Decimal::nan();
let inf = Decimal::infinity();

// NaN equals itself (PostgreSQL behavior, unlike IEEE 754)
assert_eq!(nan1, nan2);

// NaN is greater than everything, including Infinity
assert!(nan1 > inf);
```

This makes `Decimal` suitable for use in indexes, sorting, and deduplication where consistent ordering and equality semantics are required.
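
The difference from IEEE 754 can be seen without the crate at all. The toy `Num` enum below is a hypothetical stand-in, not the crate's type: it gets a total order from derived `Ord`, which compares variants in declaration order, and so reproduces the PostgreSQL-style ordering next to the standard `f64` behavior.

```rust
/// Toy value type whose variant order gives -Infinity < finite < +Infinity < NaN.
/// Derived Ord compares variants in declaration order, then the payload.
#[derive(Debug, Clone, Copy, PartialEq, Eq, PartialOrd, Ord)]
enum Num {
    NegInfinity,
    Finite(i64),
    Infinity,
    NaN,
}

fn main() {
    // IEEE 754: NaN is not equal to itself and has no ordering.
    let nan = f64::NAN;
    assert!(nan != nan);
    assert!(nan.partial_cmp(&f64::INFINITY).is_none());

    // PostgreSQL-style total order: NaN equals NaN and sorts last.
    assert_eq!(Num::NaN, Num::NaN);
    let mut v = vec![Num::NaN, Num::Finite(7), Num::Infinity, Num::NegInfinity, Num::Finite(-3)];
    v.sort();
    assert_eq!(
        v,
        vec![Num::NegInfinity, Num::Finite(-3), Num::Finite(7), Num::Infinity, Num::NaN]
    );
}
```

Sorting and deduplicating need `Ord` and `Eq`, which `f64` cannot implement for exactly this reason.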

## PostgreSQL Compatibility

This crate implements the PostgreSQL NUMERIC specification:

| Feature | Support |
|---------|---------|
| Max digits before decimal | 131,072 |
| Max digits after decimal | 16,383 |
| Precision constraint | ✓ |
| Scale constraint (positive) | ✓ |
| Scale constraint (negative) | ✓ |
| Infinity | ✓ |
| -Infinity | ✓ |
| NaN | ✓ |
| Rounding (ties away from zero) | ✓ |
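
As an illustration of the rounding rule in the last row, the sketch below rounds a scaled integer while dropping low-order digits, with ties going away from zero. The helper `round_half_away` is hypothetical and does no overflow checking; it is meant to show the rule, not the crate's implementation.

```rust
/// Drop `drop` low-order decimal digits from a scaled integer,
/// rounding ties away from zero (hypothetical helper, no overflow handling).
fn round_half_away(value: i64, drop: u32) -> i64 {
    let pow = 10i64.pow(drop);
    let (q, r) = (value / pow, value % pow);
    // Rust's / and % truncate toward zero, so r carries the sign of value.
    if r.abs() * 2 >= pow { q + value.signum() } else { q }
}

fn main() {
    // 123.456 (scaled 123456 at scale 3) to scale 2 gives 12346, i.e. 123.46.
    assert_eq!(round_half_away(123456, 1), 12346);
    // 12345 at scale -3 gives 12 thousands, i.e. 12000.
    assert_eq!(round_half_away(12345, 3), 12);
    // Ties move away from zero in both directions.
    assert_eq!(round_half_away(125, 1), 13);
    assert_eq!(round_half_away(-125, 1), -13);
}
```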

## Storage Efficiency

The encoding matches PostgreSQL's storage efficiency (2 bytes per 4 decimal digits):

- 1 byte for sign

- 2 bytes for exponent  

- ~N/2 bytes for N-digit mantissa (BCD encoding: 2 digits per byte)

- Special values: 3 bytes each

Example: A 9-digit number like `123456789` requires only ~8 bytes total.
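
The arithmetic behind that figure can be written out as a small estimate. The function below is a hypothetical helper that simply applies the breakdown listed above (1 sign byte, 2 exponent bytes, two BCD digits per byte); actual encoded sizes may differ slightly.

```rust
/// Estimated encoded size for a finite value with `digits` significant digits:
/// 1 sign byte + 2 exponent bytes + BCD mantissa at two digits per byte.
fn estimated_size(digits: usize) -> usize {
    1 + 2 + (digits + 1) / 2 // mantissa bytes rounded up
}

fn main() {
    // 123456789 has 9 digits: 1 + 2 + 5 = 8 bytes.
    assert_eq!(estimated_size(9), 8);
    println!("16 digits -> ~{} bytes", estimated_size(16));
}
```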

## Sort Order

The lexicographic byte order matches the PostgreSQL NUMERIC sort order:

```text
-Infinity < negative numbers < zero < positive numbers < +Infinity < NaN
```

This enables efficient range queries in sorted key-value stores without decoding.
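
A range scan over encoded keys might look like the sketch below, which uses `std::collections::BTreeMap` as a stand-in for a sorted key-value store. The `key` function is an illustrative order-preserving encoding for `i64` (sign-bit flip, big-endian), not the crate's format, and `range_labels` is a hypothetical helper.

```rust
use std::collections::BTreeMap;

/// Order-preserving key for an i64 (sign-bit flip, big-endian). Stand-in encoding.
fn key(v: i64) -> [u8; 8] {
    ((v as u64) ^ (1u64 << 63)).to_be_bytes()
}

/// Return the stored labels whose keys fall in [lo, hi], comparing bytes only.
/// Requires lo <= hi.
fn range_labels(
    store: &BTreeMap<[u8; 8], &'static str>,
    lo: i64,
    hi: i64,
) -> Vec<&'static str> {
    store.range(key(lo)..=key(hi)).map(|(_, v)| *v).collect()
}

fn main() {
    let mut store = BTreeMap::new();
    for (v, label) in [(-250, "a"), (-1, "b"), (0, "c"), (99, "d"), (10_000, "e")] {
        store.insert(key(v), label);
    }
    // The store never decodes a key; byte order alone selects the range.
    assert_eq!(range_labels(&store, -1, 99), vec!["b", "c", "d"]);
}
```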

## Performance

### Type Comparison Summary

| Type | Max Precision | Parse | Aggregates | Best For |
|------|---------------|-------|------------|----------|
| `Decimal64NoScale` | **18 digits** | ~85 µs/1000 | **✓ Correct, 17 Gelem/s** | Columnar storage |
| `Decimal64` | 16 digits | ~136 µs/1000 | ✗ Wrong (scale corrupts) | Self-contained values |
| `Decimal` | Up to 131,072 digits | ~134 µs/1000 | N/A | Arbitrary precision |

### Memory Usage

| Type | Stack | Heap | Total |
|------|-------|------|-------|
| Decimal64NoScale | 8 bytes | 0 | **8 bytes** |
| Decimal64 | 8 bytes | 0 | **8 bytes** |
| Decimal | 24 bytes | ~9 bytes | ~33 bytes |

### Decimal64NoScale Operations (Recommended for Columnar)

| Operation | Time | Notes |
|-----------|------|-------|
| Parse (`new`) | 60-85 ns | Scales with digit count |
| `to_string_with_scale()` | 18-25 ns | Scales with digit count |
| `from_raw()` | **<1 ns** | Trivial (just wrap i64) |
| Equality (`==`) | **<1 ns** | Direct i64 comparison |
| SUM 1000 values | **~59 ns** | 17 Gelem/s - just sum raw i64s |
| MIN/MAX 1000 values | **~230 ns** | 4.3 Gelem/s - direct comparison |
| `to_be_bytes()` | <1 ns | Trivial conversion |
| `from_be_bytes()` | <1 ns | Trivial conversion |

### Decimal64 Operations

| Operation | Time | Notes |
|-----------|------|-------|
| Parse (`new`) | 64-71 ns | Scales with digit count |
| `to_string()` | 19-88 ns | Scales with digit count |
| Equality (`==`) | 0.5 ns | Single i64 comparison |
| Comparison (same scale) | 1.6 ns | Direct value comparison |
| Comparison (diff scale) | 2 ns | Requires normalization |
| `to_be_bytes()` | 0.9 ns | Trivial conversion |
| `from_be_bytes()` | 0.8 ns | Trivial conversion |
| `is_nan()` / `is_infinity()` | 0.3 ns | Fast special value checks |

### Decimal Operations (Arbitrary Precision)

| Operation | Time | Notes |
|-----------|------|-------|
| Byte comparison | ~4 ns | The key use case - compare without decoding |
| `from_str` (parse) | 84-312 ns | Scales with digit count |
| `to_string` | 61-89 ns | Scales with digit count |
| `from_bytes` | 58-261 ns | With validation |
| `from_bytes_unchecked` | ~15 ns | Skip validation if bytes are trusted |
| `is_nan()` / `is_infinity()` | ~1.3 ns | Fast special value checks |

### Aggregate Performance (Key Differentiator)

For columnar storage where aggregates are important:

| Operation | Decimal64NoScale | Decimal64 | Speedup |
|-----------|------------------|-----------|---------|
| SUM 1000 values | **59 ns** (17 Gelem/s) | 275 ns (3.6 Gelem/s) | **4.7x** |
| MIN/MAX 1000 values | **230 ns** (4.3 Gelem/s) | 1001 ns (1 Gelem/s) | **4.3x** |
| Create 1000 values | **85 µs** | 136 µs | **1.6x** |
| Results correct? | **✓ Yes** | **✗ No** | - |

**Why is Decimal64NoScale faster?**

- `Decimal64NoScale.value()` returns raw i64 directly

- `Decimal64.value()` must unpack/mask the 56-bit value from the packed format

Run `cargo bench` locally to reproduce benchmarks on your hardware.

## Arithmetic Operations

This library focuses on storage and comparison, not arithmetic. Existing Rust decimal libraries (`rust_decimal`, `bigdecimal`) provide arithmetic but their byte representations are **not lexicographically sortable** - you cannot compare their serialized bytes to determine numerical order. That's the gap `decimal-bytes` fills: efficient storage with byte-level ordering for databases and search engines.

For calculations, use an established decimal library and convert:

### With `rust_decimal` (recommended for most use cases)

```toml
[dependencies]
decimal-bytes = { version = "0.1", features = ["rust_decimal"] }
```

```rust
use rust_decimal::Decimal as RustDecimal;
use decimal_bytes::Decimal;

// Convert from rust_decimal for storage
let rd = RustDecimal::new(12345, 2); // 123.45
let stored: Decimal = rd.try_into().unwrap();

// Do arithmetic with rust_decimal
let a: RustDecimal = (&stored).try_into().unwrap();
let b = RustDecimal::new(1000, 2); // 10.00
let sum = a + b; // 133.45

// Convert back for storage
let result: Decimal = sum.try_into().unwrap();
```

### With `bigdecimal` (for arbitrary precision arithmetic)

```toml
[dependencies]
decimal-bytes = { version = "0.1", features = ["bigdecimal"] }
```

```rust
use bigdecimal::BigDecimal;
use decimal_bytes::Decimal;
use std::str::FromStr;

// Convert between types
let bd = BigDecimal::from_str("123.456789012345678901234567890").unwrap();
let stored: Decimal = bd.try_into().unwrap();
let restored: BigDecimal = (&stored).try_into().unwrap();
```

## License

MIT License - see [LICENSE](LICENSE) for details.