An open API service indexing awesome lists of open source software.

https://github.com/nodedb-lab/nodedb

A local-first, real-time, edge-to-cloud hybrid database for multi-modal workloads
https://github.com/nodedb-lab/nodedb

columnar database distributed graph kv-store local nosql rag sql timeseries vector

Last synced: about 1 month ago
JSON representation

A local-first, real-time, edge-to-cloud hybrid database for multi-modal workloads

Awesome Lists containing this project

README

          



NodeDB

The distributed multi-model database for AI and agent workloads.


Eight database engines in a single Rust binary. One SQL dialect. Zero network hops between
vector search, graph traversal, document storage, columnar analytics, timeseries, key-value,
full-text search, and multi-dimensional arrays.


Quickstart
·
Docs
·
Benchmarks
·
NodeDB-Lite
·
CLI



Join the NodeDB Discord



CI status


Latest release


License


GitHub stars

NodeDB replaces the combination of PostgreSQL + pgvector + Redis + Neo4j + ClickHouse + Elasticsearch with a single process. Graph queries that feed vector search, full-text ranking, and columnar aggregation execute in one engine with shared storage, shared memory, and one planner.

## Why NodeDB

- **One binary, not a polyglot stack.** No inter-service networking, no schema drift between systems, no data synchronization pipelines. A graph query that feeds a vector search that filters by full-text relevance executes in one process.
- **PostgreSQL wire protocol.** Connect with `psql` or any PostgreSQL client library. Standard SQL with engine-specific extensions where SQL can't express the operation.
- **Edge to cloud.** The same engines run embedded on phones and browsers (NodeDB-Lite, WASM) with CRDT-based offline-first sync to the server.
- **Serious about performance.** Thread-per-Core data plane with io_uring, SIMD-accelerated distance functions, zero-copy MessagePack transport, per-column compression (ALP, FastLanes, FSST, Gorilla). See benchmarks below.

## Performance

**Timeseries ingest + query benchmark** — 10M rows, high-cardinality DNS telemetry (50K+ unique domain names). Single node, NVMe storage.

### Ingest

| Engine | Rate | Time | Memory | Disk |
| ----------- | ------------ | ------------ | ---------- | ------------ |
| **NodeDB** | **93,450/s** | 107s | **120 MB** | 2,217 MB |
| TimescaleDB | 56,615/s | 177s | 963 MB | 2,802 MB |
| ClickHouse | 53,905/s | 186s | 1,035 MB | **1,647 MB** |
| InfluxDB | 22,715/s | 88s (2M cap) | 1,656 MB | 982 MB |

### Queries (ms, best of 3)

| Query | NodeDB | ClickHouse | TimescaleDB | InfluxDB (2M) |
| -------------------------- | ------- | ---------- | ----------- | ------------- |
| `COUNT(*)` | **<1** | 1 | 423 | 13,110 |
| `WHERE qtype=A COUNT` | 47 | **6** | 347 | 5,297 |
| `WHERE rcode=SERVFAIL` | 41 | **6** | 334 | 1,048 |
| `GROUP BY qtype` | 56 | **15** | 597 | 12,426 |
| `GROUP BY rcode` | 52 | **16** | 604 | 13,183 |
| `GROUP BY cached+AVG` | 120 | **33** | 677 | 13,652 |
| `GROUP BY client_ip (10K)` | **141** | 157 | 660 | 14,301 |
| `GROUP BY qname (50K+)` | 2,665 | **288** | 3,644 | 16,720 |
| `time_bucket 1h` | 101 | **30** | 603 | -- |
| `time_bucket 5m+qtype` | 138 | **99** | 711 | -- |

NodeDB is not a specialized timeseries database, yet it ingests 1.65x faster than TimescaleDB and 1.73x faster than ClickHouse with 8x less memory. Query latency is competitive with ClickHouse on low-cardinality aggregations and within 3-5x on high-cardinality GROUP BY. This is the tradeoff of a general-purpose engine: you get one system instead of five, with performance that stays in the same ballpark as specialized tools.

## Engines

| Engine | What it replaces | Key capability |
| -------------------------------------------- | ---------------------------- | ----------------------------------------------------------------------------------------------------------------------- |
| [Vector](docs/vectors.md) | pgvector, Pinecone, Weaviate | HNSW with SQ8/PQ quantization, adaptive bitmap pre-filtering |
| [Graph](docs/graph.md) | Neo4j, Amazon Neptune | CSR adjacency, 13 algorithms, Cypher-subset MATCH, GraphRAG |
| [Document](docs/documents.md) | MongoDB, CouchDB | Schemaless (MessagePack + CRDT) or Strict (Binary Tuples, O(1) field access). Typeguards for gradual schema enforcement |
| [Columnar](docs/columnar.md) | ClickHouse, DuckDB | Per-column codecs (ALP, FastLanes, FSST), predicate pushdown, HTAP bridge |
| [Timeseries](docs/timeseries.md) | TimescaleDB, InfluxDB | ILP ingest, continuous aggregation, PromQL, approximate aggregation |
| [Spatial](docs/spatial.md) | PostGIS | R\*-tree, geohash, H3, OGC predicates, hybrid spatial-vector |
| [Key-Value](docs/kv.md) | Redis, DynamoDB | O(1) lookups, TTL, sorted indexes, rate limiting, SQL-queryable |
| [Full-Text Search](docs/full-text-search.md) | Elasticsearch | BMW BM25, 27-language support, CJK bigrams, fuzzy, hybrid vector fusion |
| [Array (NDArray)](docs/array.md) | TileDB, Zarr | Multi-dimensional tiles, Z-order indexing, bitemporal support, tile-level retention |

## Install

```bash
# Docker
docker run -d \
-p 6432:6432 -p 6433:6433 -p 6480:6480 \
-v nodedb-data:/var/lib/nodedb \
farhansyah/nodedb:latest

# Cargo
cargo install nodedb
```

Requires Linux kernel >= 5.1 (io_uring). Connect:

```bash
ndb # native CLI (connects to localhost:6433)
psql -h localhost -p 6432 # or any PostgreSQL client
```

```sql
CREATE COLLECTION users;

-- Standard SQL
INSERT INTO users (name, email, age) VALUES ('Alice', 'alice@example.com', 30);

-- Object literal syntax (same result)
INSERT INTO users { name: 'Bob', email: 'bob@example.com', age: 25 };

-- Batch insert
INSERT INTO users [
{ name: 'Charlie', email: 'charlie@example.com', age: 35 },
{ name: 'Dana', email: 'dana@example.com', age: 28 }
];

SELECT * FROM users WHERE age > 25;
```

New here? Start with the [Quickstart](https://nodedb.dev/docs/introduction/quickstart/) on the official documentation site: **[nodedb.dev/docs](https://nodedb.dev/docs)**.

## Deployment Modes

| Mode | Use case |
| ------------------- | ---------------------------------------------------------------------------- |
| **Origin (server)** | Full distributed database. Multi-Raft, io_uring, pgwire. Horizontal scaling. |
| **Origin (local)** | Same binary, single-node. No cluster overhead. |
| **NodeDB-Lite** | Embedded library for phones, browsers, desktops. CRDT sync to Origin. |

## NodeDB-Lite

All eight engines as an embedded library. Linux, macOS, Windows, Android, iOS, and browser (WASM, experimental).

- **Lite only** -- local-first apps that don't need a server. Vector search, graph, FTS, documents, arrays, all in-process with sub-ms reads.
- **Lite + Origin** -- offline-first with CRDT sync. Writes happen locally, deltas merge to Origin when online. Multiple devices converge regardless of order.
- **Same API** -- the `NodeDb` trait is identical across Lite and Origin. Switch between embedded and server without changing application code.

See [NodeDB-Lite](https://github.com/NodeDB-Lab/nodedb-lite) for platform details and sync configuration.

## Key Features

**Write-time validation** -- Typeguards enforce types, required fields, CHECK constraints, and DEFAULT/VALUE expressions on schemaless collections. Graduate to strict schema with `CONVERT COLLECTION x TO document_strict`.

**Bitemporal queries** -- System time (audit trail) and valid time (temporal semantics) across all engines. Query data as it existed in the past, or as it was valid on a past date. GDPR-compliant tile purge on array engine.

**Multi-dimensional arrays** -- Scientific computing with Z-order indexed tiles, tile-level compression, and bitemporal support. Combine with vector/graph/text in fused queries via cross-engine identity.

**Real-time** -- CDC change streams with consumer groups (~1-5ms latency). Streaming materialized views. Durable topics. Cron scheduler. LISTEN/NOTIFY. All powered by the Event Plane.

**Programmability** -- Stored procedures with `IF/FOR/WHILE/LOOP`. User-defined functions. Triggers (async, sync, deferred). `SECURITY DEFINER`.

**Security** -- RBAC with GRANT/REVOKE. Row-level security with `$auth.*` context across all engines. Hash-chained audit log. Multi-tenancy with per-tenant encryption. JWKS, mTLS, API keys.

**Six wire protocols** -- pgwire (PostgreSQL), HTTP/REST, WebSocket, RESP (Redis), ILP (InfluxDB line protocol), native MessagePack.

## Tools

- **[`ndb`](https://github.com/NodeDB-Lab/nodedb-cli)** -- Native CLI with TUI, syntax highlighting, and tab completion. Alternative to `psql`.
- **[NodeDB Studio](https://github.com/NodeDB-Lab/nodedb-studio)** -- GUI client for managing collections, browsing data, and monitoring. _(coming soon)_
- **[nodedb-bench](https://github.com/NodeDB-Lab/nodedb-bench)** -- Performance benchmarks against competing databases.

## Documentation

The official documentation site is **[nodedb.dev/docs](https://nodedb.dev/docs)** — start with the [Quickstart](https://nodedb.dev/docs/introduction/quickstart/).

In-repo references:

- [Getting Started](docs/getting-started.md) -- Build, run, connect
- [Architecture](docs/architecture.md) -- Three-plane execution model
- [Engine Guides](docs/README.md) -- Deep dives into each engine
- [Security](docs/security/README.md) -- Auth, RBAC, RLS, audit, multi-tenancy
- [Real-Time](docs/real-time.md) -- CDC, pub/sub, LIVE SELECT
- [NodeDB-Lite](https://github.com/NodeDB-Lab/nodedb-lite) -- Embedded edge database
- [AI Patterns](docs/ai/README.md) -- RAG, GraphRAG, agent memory, feature store

## Contributing

We welcome bug fixes, engine improvements, new codecs and analyzers, test coverage, and documentation. Read [CONTRIBUTING.md](CONTRIBUTING.md) before opening a PR — NodeDB's Three-Plane execution model has hard rules that reviewers enforce.

## Building from Source

For development or contributing:

```bash
git clone https://github.com/NodeDB-Lab/nodedb.git
cd nodedb
cargo build --release
cargo install cargo-nextest --locked # one-time
cargo nextest run --all-features
```

## Release Status

NodeDB Origin is in **public beta** as of **v0.1.0 (2026-05-07)**. All eight engines are feature-complete and covered by tests. The wire protocols (pgwire, HTTP, native MessagePack, RESP, ILP, WebSocket) are stable — clients written against 0.1.0 will keep working through 1.0.

**v0.1.0 — Beta (today).** Build new products on it. The public surface (SQL dialect, wire protocols, configuration) is stable; expect internal changes (storage layout, on-disk format, replication internals) between minor releases. Patch and minor bumps will land as needed — if something requires a 0.2, we ship a 0.2. The two months between beta and 1.0 are deliberately for real workloads to surface edge cases we can't manufacture in-house.

**v1.0.0 — Production-ready (target: 2026-07-07).** What 1.0 guarantees:

- **API & SQL stability** — semver from 1.0 onward. No breaking SQL or client-API changes within a major.
- **Wire protocol stability** — pgwire, HTTP, native MessagePack, RESP, ILP, WebSocket frozen.
- **On-disk format stability** — no breaking migrations within 1.x. Forward-compatible upgrades only.
- **Cluster & Raft stability** — rolling upgrades supported within 1.x; no quorum-breaking changes.
- **Performance SLAs** — published p50/p99 targets per engine, regression-gated in CI.
- **Security audit** — third-party audit completed and findings remediated before 1.0 ships.
- **Storage, backup, and recovery** — fully exercised under fault injection, sustained load, and crash-restart cycles.

Pre-1.0 versions may change internals between releases — those changes are critical-path work (storage, backup, security, recovery) that has to be hardened in real production conditions before we put a stability stamp on it. The wire protocol and SQL surface won't break; everything underneath is fair game until 1.0.

> **Note:** This release track applies to **NodeDB Origin** (the server) only. [NodeDB-Lite](https://github.com/NodeDB-Lab/nodedb-lite), [`ndb` CLI](https://github.com/NodeDB-Lab/nodedb-cli), and [NodeDB Studio](https://github.com/NodeDB-Lab/nodedb-studio) are versioned independently on their own tracks.

**Want to test or experiment with NodeDB?** Join our [Discord](https://discord.gg/s54gDMVc7B) — we provide full support for early adopters during the beta.

## Contributors


NodeDB Contributors

## License

NodeDB uses a dual-license model:

- **Shared engine crates** (`nodedb-types`, `nodedb-vector`, `nodedb-graph`, `nodedb-fts`, `nodedb-spatial`, `nodedb-codec`, `nodedb-columnar`, `nodedb-array`, `nodedb-sql`, `nodedb-client`, `nodedb-query`, `nodedb-strict`) — [Apache 2.0](LICENSE-APACHE). Use them freely in your own projects, SDKs, and tools.
- **Server crates** (`nodedb`, `nodedb-wal`, `nodedb-raft`, `nodedb-cluster`, `nodedb-bridge`, `nodedb-mem`, `nodedb-crdt`) — [Business Source License 1.1](LICENSE). Free for any use except offering NodeDB as a hosted database service (DBaaS). Converts to Apache 2.0 on 2030-05-01.