https://github.com/NoKV-Lab/NoKV
AI native distributed file system
https://github.com/NoKV-Lab/NoKV
distributed-systems filesystem nokv object-storage raft rust
Last synced: about 10 hours ago
JSON representation
AI native distributed file system
- Host: GitHub
- URL: https://github.com/NoKV-Lab/NoKV
- Owner: NoKV-Lab
- License: apache-2.0
- Created: 2024-08-11T14:19:45.000Z (almost 2 years ago)
- Default Branch: main
- Last Pushed: 2026-06-30T06:06:22.000Z (6 days ago)
- Last Synced: 2026-06-30T08:08:55.522Z (5 days ago)
- Topics: distributed-systems, filesystem, nokv, object-storage, raft, rust
- Language: Rust
- Homepage: https://nokv.io/
- Size: 360 MB
- Stars: 439
- Watchers: 8
- Forks: 50
- Open Issues: 4
-
Metadata Files:
- Readme: README.md
- Contributing: CONTRIBUTING.md
- License: LICENSE
- Code of conduct: CODE_OF_CONDUCT.md
- Security: SECURITY.md
- Agents: AGENTS.md
- Dco: DCO
Awesome Lists containing this project
- awesome-rust - NoKV-Lab/NoKV - AI-native distributed filesystem. [](https://github.com/NoKV-Lab/NoKV/actions/workflows/rust.yml) (Applications / Database)
- fucking-awesome-rust - NoKV-Lab/NoKV - AI-native distributed filesystem. [](https://github.com/NoKV-Lab/NoKV/actions/workflows/rust.yml) (Applications / Database)
README
Metadata control plane for object-backed agent artifacts.
Docs ·
Why Filesystems ·
Quick Start ·
Benchmarks ·
Discussions ·
Contributing ·
DeepWiki
## Latest update
> **NoKV × Lingtai** is now a design partner collaboration.
>
> [中文公告 →](https://github.com/orgs/NoKV-Lab/discussions/380)
>
> [Read the announcement(English) →](https://github.com/orgs/NoKV-Lab/discussions/378)
## Recognition
CNCF Landscape
Listed in AI Native Infra / Storage and Cloud Native Storage.
DBDB.io
Profiled on DBDB.io. The current NoKV entry refers to the Rust filesystem product line.
## What Is NoKV?
NoKV is a metadata control plane for object-backed agent artifacts: run
outputs, log files, checkpoints, reports, and citable evidence in one
filesystem-shaped namespace. For the longer interface argument, see
[Agents Want Filesystems](https://nokv.io/blog/agents-want-filesystems).
It is not a trace database. Keep JSONL, SQLite, or Postgres as the source of
truth for runtime events; use NoKV as the agent-facing namespace over the
artifacts and evidence those systems produce.
NoKV keeps namespace metadata in its own path-native engine
([Holt](https://crates.io/crates/holt)) and stores file bodies as immutable
blocks in S3-compatible object storage such as RustFS, MinIO, Ceph RGW, or AWS
S3.
```text
FUSE / SDK / CLI
-> NoKV metadata service (self-contained; no separate metadata DB to run)
-> Holt inode/dentry metadata
-> S3-compatible object store for file bodies
```
NoKV owns namespace truth, metadata transactions, snapshots, watches, and
object-reference GC. The object store owns byte durability and replication. The
metadata engine is built in, so local deployments operate a filesystem rather
than a filesystem plus a separate Redis, MySQL, or TiKV cluster.
## Why NoKV
Agent workflows are artifact-heavy; their workspaces aren't. Every run leaves
behind configs, metrics, logs, checkpoints — and that state scatters across
folders, JSON files, object-store keys, and database rows. Agents pay a
navigation tax in tokens every time they go looking. NoKV gives that state one
address, with the metadata guarantees the workload actually needs:
- **Checkpoints publish atomically.** Readers see the complete new checkpoint
or the previous one — never a half-written file, even across a crash.
- **Snapshots are time travel.** Pin a frozen view of any subtree and keep
reading it while jobs write; GC never deletes what a snapshot still needs.
- **Changes are events, not polls.** Every create, rename, and publish lands as
a typed, replayable event with a cursor.
- **Artifacts carry body references and digests**, with cleanup of failed
staged uploads.
- **Bodies are immutable, versioned blocks.** Replacement publishes a new
generation, so node-local caches never invalidate object bytes after publish.
The primary write model is write-once publish, matching how datasets,
checkpoints, and artifacts are commonly written.
## 🤝 Contributing
Contributions are welcome, from first-timers to seasoned Rustaceans. Read [CONTRIBUTING.md](CONTRIBUTING.md) to get started: it covers setup, conventions, and the review gate. Pick up a [good first issue](https://github.com/NoKV-Lab/NoKV/issues?q=is%3Aissue%20state%3Aopen%20label%3A%22good%20first%20issue%22), or follow the recommended newcomer track in [#354](https://github.com/NoKV-Lab/NoKV/issues/354) (MCP server for the agent namespace surface).
To understand the project first, read [Why filesystems](https://nokv.io/blog/agents-want-filesystems), [the metadata engine](https://nokv.io/blog/holt-in-nokv), and [the benchmark](https://nokv.io/benchmark).
## 📚 Documentation
- [Architecture](docs/architecture.md)
- [Product Design](docs/product-design.md)
- [AI-Native Metadata HA And Fast Path](docs/metadata-ha-fast-path.md)
- [Metadata Schema](docs/metadata-schema.md)
- [Object Layout](docs/object-layout.md)
- [Checkpointing](docs/checkpointing.md)
- [RustFS Backend](docs/rustfs.md)
- [Benchmarks](docs/benchmarks.md)
## 🤖 The Agent Interface
`ls` · `stat` · `catalog` · `find` · `aggregate` · `read` · `grep`
Seven verbs, one progressive-disclosure surface: an agent discovers what
exists, learns what is queryable, and pays to read only what it needs.
Predicates, sort, and projection are pushed into the engine, so a "top-5 runs
by val_loss" report costs two calls — one `catalog`, one `find`. `grep` sweeps
a subtree and returns line-numbered matches with citable evidence URIs
(`nokv-native://path@generation:N#L3`).
The verbs live in [`nokv-agent`](crates/nokv-agent/src/lib.rs): the tool
definitions are LLM-ready JSON schemas, and `execute_agent_tool` routes calls
over the same `AgentNamespace` trait whether the namespace is embedded
(in-process) or remote (metadata RPC via `nokv-client`). The crate is
transport-free — it depends only on `nokv-meta`, `nokv-object`, and
`nokv-types`. See the [contributor handbook](docs/development/nokv-agent.md).
**Today** the agent verbs ship in the Rust SDK, the `nokv` CLI and FUSE
mount, and a native MCP server over stdio transport. Run `nokv mcp` to
serve the seven read-only agent tools to any MCP-capable client.
```bash
cargo run --release -p nokv --bin nokv -- mcp
```
To configure it in an MCP client (e.g. Cursor, VS Code, Claude Desktop):
```json
{
"nokv-mcp": {
"command": "/path/to/NoKV/target/release/nokv",
"args": ["mcp"]
}
}
```
The same `--server-bind`, `--object-backend`, `--mount`, and control-plane
flags from the rest of the CLI apply, e.g.:
```bash
nokv --server-bind 127.0.0.1:7777 --object-backend rustfs mcp
```
**v1 constraints:** stdio transport only (no HTTP/SSE); read-only — the
seven existing agent tools only, no write or publish tools; no network
registration path for `register_namespace_index` (it remains
embedded-only — see the [contributor handbook](docs/development/nokv-agent.md)
section 6 for why).
## NoKV vs JuiceFS
NoKV follows the same high-level separation used by systems like JuiceFS and
3FS: metadata is separate from file body storage. The difference is that NoKV
ships its metadata engine as part of the filesystem and optimizes for
agent-workspace and artifact publish/read patterns first.
| | JuiceFS | NoKV |
| --- | --- | --- |
| Metadata engine | External DB such as Redis, MySQL, or TiKV | Built-in, path-native Holt engine |
| Atomic checkpoint publish | POSIX rename/write semantics over the metadata engine | First-class publish-by-generation primitive |
| Block model | Slice/block model supporting broad POSIX behavior | Immutable object blocks plus new-generation manifests |
| Workspace-native primitives | Layered on top of the filesystem | Snapshots, typed watch, body descriptors, and GC floors are core metadata concepts |
| Agent query surface | None | `ls`/`stat`/`catalog`/`find`/`aggregate`/`read`/`grep` with push-down and line-numbered evidence |
| POSIX completeness | Mature production filesystem | P0 subset implemented; still hardening compatibility gates |
| Maturity | Production, large deployments | Young Rust implementation; single-node local mode is usable, replication is roadmap |
NoKV is an object-backed filesystem with a sharded Holt metadata plane: multiple
metadata shards (one Holt engine each) behind long-running metadata servers,
routed through an etcd control plane, with cross-shard grafts presenting a single
FUSE namespace across shards. Metadata HA today is single-writer-per-shard with
checkpoint-image + shared-log, epoch-fenced failover — not yet consensus
replication — so it is not yet a JuiceFS/3FS class production-HA distributed
filesystem.
## 🚦 Quick Start
Build and test:
```bash
cargo test --workspace
cargo build --release -p nokv --bin nokv
```
Start a local RustFS-compatible S3 endpoint and create the default bucket:
```bash
mkdir -p /tmp/rustfs-data
RUSTFS_ACCESS_KEY=rustfsadmin \
RUSTFS_SECRET_KEY=rustfsadmin \
rustfs server --address 127.0.0.1:9000 /tmp/rustfs-data &
AWS_ACCESS_KEY_ID=rustfsadmin \
AWS_SECRET_ACCESS_KEY=rustfsadmin \
aws --endpoint-url http://127.0.0.1:9000 \
s3api create-bucket --bucket nokv
```
By default NoKV expects bucket `nokv` at `http://127.0.0.1:9000` with
development credentials `rustfsadmin` / `rustfsadmin`. See
[docs/rustfs.md](docs/rustfs.md) for other deployment modes.
Start the metadata server, then initialize the namespace. Every other command
talks to the server on `127.0.0.1:7777`, so keep it running:
```bash
cargo run --release -p nokv --bin nokv -- serve &
cargo run --release -p nokv --bin nokv -- init
```
Publish and read an artifact:
```bash
cargo run --release -p nokv --bin nokv -- \
put-artifact /runs/1/checkpoint.bin ./checkpoint.bin
cargo run --release -p nokv --bin nokv -- \
cat /runs/1/checkpoint.bin > restored.bin
```
Mount with FUSE:
```bash
mkdir -p /tmp/nokv-mount
cargo run --release -p nokv --bin nokv -- \
mount /tmp/nokv-mount
```
On macOS this requires macFUSE. NoKV passes the `noappledouble` mount option to
avoid Finder/resource-fork AppleDouble sidecars; user xattr roundtrip is
covered by the FUSE smoke test.
## 🧩 Crates
| Crate | Role |
| --- | --- |
| [`nokv-types`](https://crates.io/crates/nokv-types) | Storage-neutral namespace model |
| [`nokv-protocol`](https://crates.io/crates/nokv-protocol) | Framed metadata RPC DTOs and binary codec |
| [`nokv-object`](https://crates.io/crates/nokv-object) | S3-compatible object body storage |
| [`nokv-meta`](https://crates.io/crates/nokv-meta) | Schema, `MetadataCommand`, Holt store, service core |
| [`nokv-control`](https://crates.io/crates/nokv-control) | Shard ownership, epochs, and failover coordination |
| [`nokv-agent`](https://crates.io/crates/nokv-agent) | Transport-free agent tool surface (the seven verbs) |
| [`nokv-client`](https://crates.io/crates/nokv-client) | Rust SDK over the metadata service |
| [`nokv-fuse`](https://crates.io/crates/nokv-fuse) | Low-level FUSE frontend |
| [`nokv-server`](https://crates.io/crates/nokv-server) | Long-running metad process and framed RPC |
| [`nokv`](https://crates.io/crates/nokv) | `nokv` CLI binary |
## 📄 License
Apache-2.0. See [LICENSE](LICENSE).