https://github.com/av/openpuffer

Stateless S3-backed vector and FTS search (turbopuffer-compatible API)
Host: GitHub
URL: https://github.com/av/openpuffer
Owner: av
Created: 2026-06-02T22:04:09.000Z (about 1 month ago)
Default Branch: main
Last Pushed: 2026-06-10T21:19:12.000Z (24 days ago)
Last Synced: 2026-06-10T23:11:35.998Z (24 days ago)
Language: Rust
Size: 1.35 MB
Stars: 0
Watchers: 0
Forks: 0
Open Issues: 0
Metadata Files:
- Readme: README.md
- Agents: AGENTS.md

# openpuffer

[![CI](https://github.com/av/openpuffer/actions/workflows/ci.yml/badge.svg)](https://github.com/av/openpuffer/actions/workflows/ci.yml)

[![version](https://img.shields.io/badge/version-0.2.0-blue)](CHANGELOG.md)

Stateless vector and full-text search server backed by **S3-compatible object storage**. HTTP API is compatible with [turbopuffer](https://turbopuffer.com/docs) core write/query paths; the **on-disk architecture** follows [turbopuffer’s WAL + index model](https://turbopuffer.com/docs/architecture), not a per-document JSON store.

## How it compares to turbopuffer

| Area | turbopuffer | openpuffer (v1) |
|------|-------------|-----------------|
| **Durable layout** | WAL + index segments on object storage | Same: `meta.json`, `wal/{seq}.bin`, `index/*` under `openpuffer/{ns}/` |
| **Write ACK** | After durable WAL commit | Group-commit buffer → one WAL PUT + `meta.json` CAS per batch |
| **Indexing** | Async SPFresh-style ANN + FTS | Async background indexer: BM25 FTS, k-means centroids/clusters, attribute filter index |
| **Query** | Indexed candidates + unindexed WAL tail | `strong` (default); `eventual` skips WAL tail + catch-up on pinned views (sub-10ms warm path) |
| **Cache** | NVMe + in-process warm | Optional `--cache-dir` disk mirror + `POST …/warm` view pin |
| **Scale / polish** | Production multi-tenant | Single binary, MinIO integration tests; simplified ANN (two-level k-means, no SPFresh hierarchy) |
| **API surface** | Full product API | Core write/query/metadata/export/warm; no billing portal, CMEK, or all v2 edge cases |

**Honest gaps:** no managed cloud, no cross-region replication, ANN is a simplified two-level k-means probe (not production SPFresh), throughput is ~1 WAL commit/s/namespace by default, and filter/FTS merges are simpler than turbopuffer at scale.

**Full comparison (implemented vs missing, when to use which):** [docs/COMPARISON.md](docs/COMPARISON.md).

Design detail: [docs/ARCHITECTURE.md](docs/ARCHITECTURE.md).

## Architecture (high level)

```
                    ┌─────────────┐
  POST /v2/...      │ write       │  group-commit buffer (time / batch size)
  (upsert/delete)   │ buffer      │
                    └──────┬──────┘
                           │ durable ACK
                           ▼
              S3: openpuffer/{ns}/
              ├── meta.json          ← index_cursor, wal_commit_seq, schema
              ├── wal/
              │   ├── 00000001.bin   ← [0x01][bincode WalEntry][crc32 LE]
              │   └── snapshot.bin   ← compaction snapshot (optional)
              └── index/             ← async indexer
                  ├── fts-*.bin
                  ├── {field}/centroids-l0.bin + centroids-l1-*.bin + clusters-*.bin
                  └── filter-*.bin

  POST /v2/.../query
       │
       ├─ load meta + index segments (disk cache if warm)
       ├─ ANN (L0/L1 probe) / BM25 / hybrid candidate generation
       ├─ apply filters (intersect before score)
       └─ score unindexed WAL tail (strong) → top_k
```
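The L0/L1 probe step in the diagram can be pictured with a small model. The following is a minimal Python sketch, not the server's Rust implementation: the squared-L2 distance, the in-memory centroid lists, and the names `nearest` and `plan_probe` are assumptions for illustration. Only the default probe widths (4 coarse, 2 fine) come from the configuration table.

```python
def nearest(query, centroids, n):
    """Return indices of the n centroids closest to query (squared L2, assumed metric)."""
    def dist(c):
        return sum((q - x) ** 2 for q, x in zip(query, c))
    order = sorted(range(len(centroids)), key=lambda i: dist(centroids[i]))
    return order[:n]


def plan_probe(query, l0, l1, coarse_probe=4, fine_probe=2):
    """Pick cluster ids to fetch for one query vector.

    l0: list of coarse centroids.
    l1: dict mapping a coarse index to its list of fine centroids.
    Returns (coarse_idx, fine_idx) pairs; each pair stands for one cluster
    segment that would then be loaded and scored.
    """
    plan = []
    for c in nearest(query, l0, coarse_probe):
        for f in nearest(query, l1[c], fine_probe):
            plan.append((c, f))
    return plan
```

With these widths a query touches at most `coarse_probe * fine_probe` clusters, which is why the probe counts bound both recall and the number of cold S3 fetches.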

WAL replay verifies CRC on v1 segments; corrupt segments use [`fail` or `skip`](#wal-corrupt-policy) (default `fail`). Legacy segments without the `0x01` prefix remain readable.

**Consistency:** writes are visible after `wal_commit_seq` advances; queries under `consistency: "strong"` also scan WAL entries with `seq > index_cursor` until the indexer catches up.
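The difference between the two consistency modes can be modelled in a few lines. This Python sketch is illustrative only: the tuple layout of WAL entries and the name `visible_docs` are hypothetical, and the real server scores candidates rather than materializing a full view.

```python
def visible_docs(indexed, wal, index_cursor, consistency="strong"):
    """Return the documents a query would see.

    indexed: dict id -> doc, reflecting WAL entries with seq <= index_cursor.
    wal: list of (seq, op, doc_id, doc) tuples sorted by seq (hypothetical shape);
         op is "upsert" or "delete".
    """
    view = dict(indexed)
    if consistency == "strong":
        # Strong reads also apply the unindexed WAL tail (seq > index_cursor).
        for seq, op, doc_id, doc in wal:
            if seq <= index_cursor:
                continue
            if op == "upsert":
                view[doc_id] = doc
            else:
                view.pop(doc_id, None)
    # Eventual reads return the indexed view as-is and may miss recent writes.
    return view
```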

## Features

- WAL-backed writes: ACK only after the durable WAL commit; queries are strongly consistent by default

- Background indexer (FTS BM25, vector ANN clusters, attribute filters)

- Vector ANN, BM25 FTS, hybrid `rank_by` (`Sum` / `Product`)

- Query filters (`Eq`, `In`, `And`, …), `delete_by_filter`, `patch_by_filter`, `patch_rows`

- Namespace export at `wal_commit_seq`, warm-cache endpoint

- Single static binary — no sidecar databases

## Quickstart (local dev with Docker)

Requires [Docker](https://docs.docker.com/get-docker/) for MinIO.

```bash

./scripts/dev-up.sh      # MinIO on :9000, bucket openpuffer-dev

./scripts/dev-serve.sh   # build + serve on :8080

```

**SPFresh v3 index + cold S3 path** (probed cluster fetch, no disk cache):

```bash

export OPENPUFFER_ANN_VERSION=3          # v3 index layout at build (default 2)

export OPENPUFFER_CACHE_DIR=""           # cold query: S3-only index load (same as --cache-dir "")

export OPENPUFFER_COLD_S3_CONCURRENCY=32 # parallel GETs per cold sub-batch (default 32)

# optional: OPENPUFFER_ANN_RERANK=1      # exact re-rank over probed clusters (higher recall, larger candidate pool)

./scripts/dev-serve.sh

```

After upserts and `index_cursor == wal_commit_seq`, vector queries report `performance.storage_roundtrips`, `cold_s3_keys_fetched`, and `ann_probed_clusters`. See [docs/BENCHMARKS.md](docs/BENCHMARKS.md) for probe/cold tuning.
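A small helper can check the `index_cursor == wal_commit_seq` condition before reading the cold-path counters. This is a sketch that assumes the namespace metadata response exposes both cursors as top-level integer fields; the function name `index_lag` is hypothetical.

```python
import json


def index_lag(metadata_json: str) -> int:
    """Return how many WAL commits the indexer is behind (0 means caught up).

    Assumes the metadata JSON has integer `wal_commit_seq` and `index_cursor`
    fields at the top level; the actual response shape may differ.
    """
    meta = json.loads(metadata_json)
    return meta["wal_commit_seq"] - meta["index_cursor"]
```

A caller would poll the metadata endpoint until the lag reaches 0, then issue the vector query whose `performance` fields it wants to inspect.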

**Benchmarks / performance (MinIO vs turbopuffer scaling):** [benchmarks/OP_VS_TPUF.md](benchmarks/OP_VS_TPUF.md) (one-page verdict); full report — [docs/reports/BENCHMARK_VS_TURBOPUFFER_SCALING_2026-06-04.md](docs/reports/BENCHMARK_VS_TURBOPUFFER_SCALING_2026-06-04.md); reproduce with `make bench-compare-tpuf` (offline gate: `./scripts/verify-op-scaling-comparison.sh`).

### Large-dataset comparison harness

Apples-to-apples **openpuffer vs turbopuffer** on a shared synthetic workload (L1 default: **100k × 128-dim**). Workloads, result JSON, and operator scripts: [benchmarks/README.md](benchmarks/README.md). Program plan: [docs/PLAN_LARGE_DATASET_BENCHMARK.md](docs/PLAN_LARGE_DATASET_BENCHMARK.md).

**Operator handoff (offline harness complete, live G3–G5 pending):** [docs/reports/LARGE_DATASET_HARNESS_HANDOFF.md](docs/reports/LARGE_DATASET_HARNESS_HANDOFF.md).

**Milestone tag** `large-dataset-harness-v1` (`c7f66a3`, annotated 2026-06-04) — offline harness only; measured `large-aws-l1.json` / `tpuf-l1.json` still require EC2 + AWS S3 + `TURBOPUFFER_API_KEY`:

```bash

git fetch --tags

git checkout large-dataset-harness-v1   # or: git show large-dataset-harness-v1

```

**Makefile targets** (same as `./scripts/verify-large-benchmark-program.sh` and operator preflights):

| Target | Purpose |
|--------|---------|
| `make bench-verify` | Offline harness gate: pytest, schemas, L1–L3 dry-runs, `@spec` facts (CI dispatch) |
| `make bench-dry-run` | Harness dry-run only: no pytest/cargo/facts, no cloud spend |
| `make bench-g2-minio` | Optional G2 MinIO correctness gates (Docker; slow) |
| `make bench-preflight` | G3+G4+overlap preflights (offline default) |

```bash

make bench-verify                                    # before any cloud spend

make bench-verify VERIFY_FLAGS="--with-g2"           # + MinIO G2 (Docker parity with CI)

make bench-dry-run                                   # L1–L3 script dry-runs only

make bench-g2-minio                                  # G2 only (faster iteration)

make bench-preflight                                 # offline cost/deps/overlap checks

make bench-preflight PREFLIGHT_FLAGS="--live --tier l1"   # EC2 live preflight

```

EC2 live runs, artifact `git add -f` policy, and post-live `@spec` activation: [benchmarks/README.md](benchmarks/README.md), [benchmarks/OPERATOR_RUNBOOK_QUICK.md](benchmarks/OPERATOR_RUNBOOK_QUICK.md).

**Recall API** (ANN vs exhaustive on indexed namespace):

```bash
curl -s -X POST "http://127.0.0.1:8080/v1/namespaces/my-ns/recall" \
  -H 'Content-Type: application/json' \
  -d '{"num": 5, "top_k": 10, "vector_field": "embedding"}'
# → {"avg_recall":0.9,"avg_ann_count":...,"avg_exhaustive_count":...}
```
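For intuition about `avg_recall`, recall@k is commonly computed as the overlap between the ANN result ids and the exhaustive result ids. The Python sketch below uses that common definition; how the server samples queries and averages them is not specified here, so treat the function names and the averaging as assumptions.

```python
def recall_at_k(ann_ids, exact_ids, k):
    """Fraction of the exhaustive top-k that the ANN top-k also returned."""
    exact = set(exact_ids[:k])
    if not exact:
        return 1.0  # nothing to find, so nothing was missed
    return len(set(ann_ids[:k]) & exact) / len(exact)


def avg_recall(pairs, k):
    """Mean recall@k over (ann_ids, exact_ids) pairs, one pair per sampled query."""
    return sum(recall_at_k(a, e, k) for a, e in pairs) / len(pairs)
```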

Smoke test:

```bash

curl -s http://127.0.0.1:8080/health

curl -s http://127.0.0.1:8080/v1/ready

curl -s "http://127.0.0.1:8080/health?deep=1"

```

MinIO console: http://127.0.0.1:9001 (`minioadmin` / `minioadmin`). Stop storage with `docker compose down` from the repo root.

## Build

```bash

cargo build --release

```

## Run

Create your bucket first (or use `./scripts/dev-up.sh`), then:

```bash
openpuffer serve \
  --listen 0.0.0.0:8080 \
  --s3-endpoint http://127.0.0.1:9000 \
  --s3-bucket openpuffer-dev \
  --s3-access-key minioadmin \
  --s3-secret-key minioadmin
```

### Configuration

| Flag / env | Purpose |
|------------|---------|
| `--s3-endpoint`, `OPENPUFFER_S3_ENDPOINT` | S3 API URL |
| `--s3-bucket`, `OPENPUFFER_S3_BUCKET` | Bucket name |
| `--s3-region`, `OPENPUFFER_S3_REGION` | Region (default `us-east-1`) |
| `--s3-access-key` / `--s3-secret-key` | Credentials |
| `--cache-dir`, `OPENPUFFER_CACHE_DIR` | Index segment disk cache (default `/tmp/openpuffer-cache`; `""` = memory-only / **cold S3 path**) |
| `OPENPUFFER_COLD_MAX_KEYS_PER_ROUND` | Max S3 keys per cold round sub-batch (default 128) |
| `OPENPUFFER_COLD_S3_CONCURRENCY` | In-flight parallel `GetObject` per cold sub-batch (default 32) |
| `OPENPUFFER_WRITE_MAX_DELAY_MS` | Group-commit delay (default 1000) |
| `OPENPUFFER_WRITE_MAX_BATCH_OPS` | Max ops per WAL batch (default 512) |
| `OPENPUFFER_MAX_PINNED_NAMESPACES` | In-process warm view LRU (default 32) |
| `--wal-corrupt-policy`, `OPENPUFFER_WAL_CORRUPT_POLICY` | WAL replay on CRC mismatch: `fail` (default) or `skip`; see [WAL corrupt policy](#wal-corrupt-policy) |
| `--ann-version`, `OPENPUFFER_ANN_VERSION` | Index format: `2` (default) or `3` (SPFresh routing + L2 splits) |
| `--ann-coarse-probe` / `--ann-fine-probe`, `OPENPUFFER_ANN_COARSE_PROBE` / `OPENPUFFER_ANN_FINE_PROBE` | ANN L0/L1 probe counts at index build (defaults 4 / 2) |
| `--ann-rerank`, `OPENPUFFER_ANN_RERANK` | Exact re-rank over probed ANN pool (`1`/`true`; default off) |
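The two write settings interact as a group-commit trigger: a batch is flushed when either the delay or the op limit is reached. The Python sketch below models only that decision, under the assumption that the delay is measured from the first buffered op; the class and method names are hypothetical and the real buffer lives in the Rust server.

```python
class GroupCommitBuffer:
    """Toy model of a time / batch-size group-commit trigger."""

    def __init__(self, max_delay_ms=1000, max_batch_ops=512):
        self.max_delay_ms = max_delay_ms
        self.max_batch_ops = max_batch_ops
        self.ops = []
        self.first_op_ms = None  # arrival time of the oldest buffered op

    def add(self, op, now_ms):
        if not self.ops:
            self.first_op_ms = now_ms
        self.ops.append(op)

    def should_flush(self, now_ms):
        if not self.ops:
            return False
        return (len(self.ops) >= self.max_batch_ops
                or now_ms - self.first_op_ms >= self.max_delay_ms)

    def flush(self):
        """Return the batch that would become one WAL PUT, and reset the buffer."""
        batch, self.ops, self.first_op_ms = self.ops, [], None
        return batch
```

With the default 1000 ms delay, a lightly loaded namespace commits roughly once per second, which matches the throughput figure quoted in the comparison section.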

### Prometheus metrics

Build with `--features metrics` (`cargo build --release --features metrics`). The server exposes **`GET /metrics`** (Prometheus text).

Cold/ANN counters (increment on probed cold loads and vector ANN queries):

| Metric | Meaning |
|--------|---------|
| `openpuffer_cold_s3_keys_fetched` | S3 object keys fetched on cold batch plans (each parallel sub-batch key counts once) |
| `openpuffer_ann_probed_clusters` | Cluster segments selected by ANN probe planning per vector query |
| `openpuffer_ann_probe_clamp_total` | Times on-disk probe widths were clamped to `OPENPUFFER_ANN_MAX_PROBE_CLUSTERS` at query time |

Also exported: `openpuffer_wal_commits_total`, `openpuffer_index_lag_segments`, `openpuffer_s3_get_total`, `openpuffer_query_duration_seconds`, `openpuffer_cold_query_duration_seconds`.

Per-query JSON (`POST …/query`) reports the same cold signals as `performance.cold_s3_keys_fetched` and `performance.ann_probed_clusters` without the metrics feature.
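To read these counters from a scrape without a Prometheus server, a few lines of parsing are enough. This Python sketch handles only the simple `name value` and `name{labels} value` line forms of the text exposition format and assumes samples carry no trailing timestamp; `parse_metrics` is a hypothetical helper, not part of openpuffer.

```python
def parse_metrics(text):
    """Sum sample values per metric name from Prometheus text output.

    Comment lines (# HELP / # TYPE) are skipped. Samples that differ only
    by labels are added together under the bare metric name.
    """
    values = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        # The value is the last space-separated token (no timestamps assumed).
        name_and_labels, _, value = line.rpartition(" ")
        name = name_and_labels.split("{", 1)[0]
        values[name] = values.get(name, 0.0) + float(value)
    return values
```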

### Debug endpoints (integration builds)

Built with `--features integration` (default for `./scripts/run-integration-s3.sh`). Not for production exposure.

| Method | Path | Body | Purpose |
|--------|------|------|---------|
| GET | `/v1/debug/cache-stats` | none | `s3_get_count` from segment cache |
| POST | `/v1/debug/cache-stats/reset` | none | Reset `s3_get_count` |
| POST | `/v1/debug/namespaces/{name}/cold-plan` | Same JSON as query (`rank_by`, optional `consistency`) | Preview [`plan_cold_query`](src/s3_batch.rs): per-round key counts, `storage_roundtrips` estimate, per-field probe plan; **does not run the query** (fetches `meta.json` and L0 only when vector probes are present) |

Example (cold cache, indexed namespace):

```bash
curl -sS -X POST "http://127.0.0.1:8080/v1/debug/namespaces/my-ns/cold-plan" \
  -H 'Content-Type: application/json' \
  -d '{"rank_by":["vector","ANN","embedding",[1,0,0]],"consistency":"eventual"}' | jq .
```

### WAL corrupt policy

v1 WAL segments on S3 use `[0x01][bincode WalEntry][crc32 LE]`. On replay, openpuffer verifies the CRC over the payload. If a segment is truncated, tampered, or has a bad checksum:

| Policy | Flag / env | Behavior |
|--------|------------|----------|
| **`fail`** (default) | `--wal-corrupt-policy fail` or `OPENPUFFER_WAL_CORRUPT_POLICY=fail` | Namespace load aborts; queries return **500** with a turbopuffer-style `{"error":"…","status":"error"}` mentioning corrupt WAL |
| **`skip`** | `--wal-corrupt-policy skip` or `OPENPUFFER_WAL_CORRUPT_POLICY=skip` | Log the corrupt segment and **continue** replay; earlier segments stay applied; the corrupt segment’s writes are invisible |

Set the policy at **process start** (`openpuffer serve`). Legacy WAL blobs without the `0x01` version byte still replay (no CRC on those segments).

**Example (recovery after partial upload):**

```bash
openpuffer serve \
  --wal-corrupt-policy skip \
  --s3-endpoint http://127.0.0.1:9000 \
  --s3-bucket openpuffer-dev \
  ...
```

Integration coverage: `corrupt_wal_segment_on_minio_fail_and_skip_policies` in `tests/integration_s3.rs` (flips one CRC byte on S3, asserts fail → 500 and skip → doc from seq 1 only).
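The framing and the two policies can be sketched outside the server. The Python below treats the bincode payload as opaque bytes and assumes the checksum is the common IEEE CRC-32 (as in `zlib.crc32`) computed over the payload only; the README does not name the CRC variant, and the function names are hypothetical. Legacy segments without the version byte are not modelled.

```python
import struct
import zlib

VERSION = 0x01


class CorruptSegment(Exception):
    pass


def encode_segment(payload: bytes) -> bytes:
    """Frame a payload as [0x01][payload][crc32 little-endian]."""
    crc = zlib.crc32(payload) & 0xFFFFFFFF
    return bytes([VERSION]) + payload + struct.pack("<I", crc)


def decode_segment(blob: bytes) -> bytes:
    """Return the payload, or raise CorruptSegment on truncation or CRC mismatch."""
    if len(blob) < 5 or blob[0] != VERSION:
        raise CorruptSegment("truncated or unversioned segment")
    payload = blob[1:-4]
    (crc,) = struct.unpack("<I", blob[-4:])
    if zlib.crc32(payload) & 0xFFFFFFFF != crc:
        raise CorruptSegment("crc mismatch")
    return payload


def replay(segments, policy="fail"):
    """Replay (seq, blob) pairs in order; return the (seq, payload) pairs applied."""
    applied = []
    for seq, blob in segments:
        try:
            applied.append((seq, decode_segment(blob)))
        except CorruptSegment:
            if policy == "fail":
                raise  # namespace load aborts
            # policy == "skip": drop this segment, keep replaying later ones
    return applied
```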

### Operations guide

1. **Cold start** — point at bucket; first write creates `meta.json` + `wal/00000001.bin`.

2. **Indexing lag** — check `GET /v1/namespaces/{name}`: `index_cursor` should reach `wal_commit_seq`. Queries still return recent writes via WAL tail under strong consistency.

3. **Warm a hot namespace** — `POST /v1/namespaces/{name}/warm` prefetches index objects and pins an in-memory view (fewer S3 round-trips on the same process).

4. **Export** — `GET /v1/namespaces/{name}/export?limit=10000&last_id=…` (or POST with JSON body) for a consistent snapshot at `wal_commit_seq`.

5. **Multi-instance** — any number of stateless `serve` processes can share one bucket; per-namespace writes serialize via S3 CAS + in-process commit lock.

6. **Restart** — no local durable state required; replay WAL from S3 on first query.
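The multi-instance point relies on compare-and-swap on `meta.json`. The Python sketch below shows the general CAS retry loop against an in-memory stand-in for conditional object writes; the `InMemoryStore` class, its integer versions, and `advance_commit_seq` are hypothetical, and the real server's S3 conditional-write mechanics and WAL PUT ordering are not modelled.

```python
class InMemoryStore:
    """Stand-in for an object store with conditional (versioned) writes."""

    def __init__(self):
        self._objects = {}  # key -> (version, value)

    def get(self, key):
        """Return (version, value); version 0 means the key does not exist."""
        return self._objects.get(key, (0, None))

    def put_if_version(self, key, expected_version, value):
        """Write only if the stored version still equals expected_version."""
        current, _ = self._objects.get(key, (0, None))
        if current != expected_version:
            return False
        self._objects[key] = (current + 1, value)
        return True


def advance_commit_seq(store, ns, max_retries=5):
    """Bump wal_commit_seq in meta.json with a CAS retry loop; return the new seq."""
    key = f"openpuffer/{ns}/meta.json"
    for _ in range(max_retries):
        version, meta = store.get(key)
        current = meta["wal_commit_seq"] if meta else 0
        if store.put_if_version(key, version, {"wal_commit_seq": current + 1}):
            return current + 1
        # Another writer won the race; re-read and try again.
    raise RuntimeError("too many CAS conflicts")
```

A writer holding a stale version simply fails its conditional write and retries, so two processes can never both claim the same commit sequence number.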

## API

| Method | Path | Purpose |
|--------|------|---------|
| GET | `/health` | Liveness (always `ok` unless `?deep=1`) |
| GET | `/health?deep=1` | S3 probe (`HeadBucket` + `openpuffer/` read); `degraded` if down |
| GET | `/v1/ready` | Traffic readiness: S3 configured and reachable (`503` if not) |
| GET | `/metrics` | Prometheus scrape (`--features metrics`) |
| GET | `/v1/namespaces` | List namespaces + metadata |
| GET | `/v1/namespaces/{name}` | `approx_row_count`, `index_status`, `unindexed_bytes`, cursors |
| GET/POST | `/v1/namespaces/{name}/export` | Paginated export (`last_id`, `limit`, `format=ndjson`) |
| POST | `/v1/namespaces/{name}/warm` | Prefetch index + pin view |
| POST | `/v1/namespaces/{name}/recall` | ANN vs exhaustive recall@k (`num`, `top_k`, optional `filters`, `vector_field`); response `avg_recall`, `avg_ann_count`, `avg_exhaustive_count` |
| POST | `/v2/namespaces/{name}` | Write (upsert, patch, delete, `delete_by_filter`, `patch_by_filter`, `schema`) |
| POST | `/v2/namespaces/{name}/query` | Vector / FTS / hybrid / filtered search |
| DELETE | `/v2/namespaces/{name}` | Delete namespace prefix |

Query responses include `performance` (`candidates`, `candidates_ratio`, `exhaustive_search_count`, …) and optional headers `X-Openpuffer-Candidates`, `X-Openpuffer-Candidates-Fraction`.
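As a client-side illustration, a vector query body can be assembled as below. The `rank_by` shape and the `consistency` values mirror the cold-plan example in this README; treating `top_k` as a top-level query field is an assumption, and `build_ann_query` is a hypothetical helper.

```python
import json


def build_ann_query(field, vector, top_k=10, consistency="strong"):
    """Return a JSON body for POST /v2/namespaces/{name}/query (sketch)."""
    if consistency not in ("strong", "eventual"):
        raise ValueError("consistency must be 'strong' or 'eventual'")
    body = {
        "rank_by": ["vector", "ANN", field, list(vector)],
        "top_k": top_k,  # assumed field name
        "consistency": consistency,
    }
    return json.dumps(body)
```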

## Test

### Test matrix

| Suite | Command | Count | Docker | Notes |
|-------|---------|-------|--------|-------|
| **Unit** | `cargo test` | ~158 | No | Library + WAL/index logic |
| **Integration (MinIO)** | `cargo test -F integration` | **51** | Yes | `tests/integration_s3.rs`: testcontainers MinIO; S3 Head/List/Get + WAL decode |
| **Perf** | `cargo test -F perf` | 1 | Yes | 5k-doc ANN `candidates_ratio` regression |
| **External S3** | `cargo test -F integration --test integration_external_s3 -- --ignored` | 1 | Optional | Compose MinIO or `OPENPUFFER_TEST_S3_*` |
| **Large stress** | `cargo test --release -F large_stress --test stress_50k -- --ignored` | 3 | Yes | 50k warm + v3 cold probed mid-tier; not in default CI |

Typical dev run (unit + integration + perf):

```bash
cargo test                              # unit (~158), no Docker
cargo build --features integration      # build server binary for integration harness
cargo test -F integration               # 51 MinIO scenarios (~60–70s)
cargo test -F perf                      # ANN candidates_ratio on 5k docs
```

**S3 integration (requires Docker) — recommended:**

```bash

./scripts/run-integration-s3.sh

```

Builds with `--features integration` and runs all **51** `integration_s3` tests against **real MinIO** (testcontainers). Tests assert **Head/List/Get** on `meta.json`, `wal/`, and `index/` (decode WAL, segment growth, copy key parity) — not HTTP-only mocks.

### Optional 50k namespace stress (`large_stress`)

Not part of the default matrix — `#[ignore]` so `cargo test -F integration` stays fast. **Nightly CI:** [`.github/workflows/nightly-stress.yml`](.github/workflows/nightly-stress.yml) (03:00 UTC + manual `workflow_dispatch`).

```bash

cargo build --release --features large_stress

cargo test --release -F large_stress --test stress_50k -- --ignored --nocapture

```

Upserts **50k** docs in **5×10k** `upsert_columns` batches with **~1.1s** spacing (WAL rate limit), waits for `index_cursor == wal_commit_seq` (300s wall timeout). Tests:

- `fifty_thousand_docs_indexed_query` — v2 default, warm ANN `candidates_ratio < 0.2`

- `fifty_thousand_docs_v3_cold_probed_validation` — `--ann-version 3`, strong cold probed path: `storage_roundtrips ≤ 4`, `recall@10 ≥ 0.86`, `candidates_ratio < 0.2` (see [docs/BENCHMARKS.md](docs/BENCHMARKS.md) mid-tier)

- `v3_cold_probed_wiring_at_2k` — fast 2k wiring for the same cold metrics

**Use `--release`** — debug builds may not index 50k within 300s. On a typical dev machine (release): warm stress **~40–45s**; v3 cold gate **~1–2 min** including recall probes.

### Testing against real S3

Three ways to hit a **real** S3-compatible endpoint (MinIO or AWS), not mocks:

| Mode | Command | Backend |
|------|---------|---------|
| **Default** | `./scripts/run-integration-s3.sh` | Ephemeral MinIO (testcontainers) |
| **Compose MinIO** | `./scripts/run-integration-s3.sh external` | `docker-compose.test.yml` on `:9000` |
| **Your bucket** | Set `OPENPUFFER_TEST_S3_*` env vars | Any S3-compatible API |

**Compose external tests** (starts MinIO if `:9000` is not already healthy, creates `openpuffer-integration` bucket):

```bash

./scripts/run-integration-s3.sh external

```

**Manual env** (same variables the script sets; use for CI or a shared MinIO/AWS bucket):

```bash

export OPENPUFFER_TEST_S3_ENDPOINT=http://127.0.0.1:9000

export OPENPUFFER_TEST_S3_BUCKET=openpuffer-integration

export OPENPUFFER_TEST_S3_ACCESS_KEY=minioadmin

export OPENPUFFER_TEST_S3_SECRET_KEY=minioadmin

cargo test -F integration --test integration_external_s3 -- --ignored

```

**Serve against the same bucket** (after `external` or with your own endpoint):

```bash

export OPENPUFFER_S3_ENDPOINT=http://127.0.0.1:9000

export OPENPUFFER_S3_BUCKET=openpuffer-integration

export OPENPUFFER_S3_ACCESS_KEY=minioadmin

export OPENPUFFER_S3_SECRET_KEY=minioadmin

./scripts/dev-serve.sh

```

Stop the test compose stack: `docker compose -f docker-compose.test.yml down`.

### What integration tests assert on S3

- **Head/List/Get** on `meta.json`, `wal/{seq:08}.bin`, and `index/*` (not HTTP-only)

- **Decode** bincode `WalEntry` from `wal/*.bin` and compare doc ids to HTTP export

- **Index layout**: `fts-*.bin`, `filter-*.bin`, `centroids-l0.bin`, `centroids-l1-*.bin` (non-zero size)

- **Incremental growth**: FTS/filter segment sizes or `fts_segment_ids` / `filter_segment_ids` chains grow after a second WAL batch (`s3_fts_and_filter_segments_grow_on_minio`)

- **Copy parity**: `copy_from_namespace` duplicates every source key under the dest prefix (`s3_copy_from_namespace_duplicates_all_keys`)

- **No** legacy `docs/{id}.json` or `manifest.json`
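The key layout these assertions rely on can be written down as small helpers. This Python sketch follows the documented `openpuffer/{ns}/` prefix and the zero-padded `wal/{seq:08}.bin` naming; the helper names, and the assumption that legacy keys would sit under the same namespace prefix, are illustrative.

```python
def meta_key(ns):
    return f"openpuffer/{ns}/meta.json"


def wal_key(ns, seq):
    """WAL segment key with an 8-digit zero-padded sequence number."""
    return f"openpuffer/{ns}/wal/{seq:08d}.bin"


def find_legacy_keys(keys, ns):
    """Return keys that look like the old per-document layout (assumed location)."""
    prefix = f"openpuffer/{ns}/"
    legacy = []
    for key in keys:
        if not key.startswith(prefix):
            continue
        rest = key[len(prefix):]
        if rest == "manifest.json" or (rest.startswith("docs/") and rest.endswith(".json")):
            legacy.append(key)
    return legacy
```

A listing check in the spirit of the last bullet would assert that `find_legacy_keys` returns an empty list for every namespace prefix.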

## License

MIT OR Apache-2.0