{"id":35209622,"url":"https://github.com/cyberlife-coder/velesdb","last_synced_at":"2026-04-30T20:01:11.576Z","repository":{"id":329166065,"uuid":"1118411810","full_name":"cyberlife-coder/VelesDB","owner":"cyberlife-coder","description":"VelesDB is a local‑first AI data engine written in Rust that unifies vectors, full‑text and graph in a single file with a familiar SQL‑like language.  Instead of sending every RAG or semantic search query to a remote cluster, VelesDB runs directly on your server, laptop, browser, mobile or edge device — no cloud dependency, no external services, ..","archived":false,"fork":false,"pushed_at":"2026-04-25T06:12:19.000Z","size":32216,"stargazers_count":49,"open_issues_count":37,"forks_count":6,"subscribers_count":1,"default_branch":"develop","last_synced_at":"2026-04-25T07:33:22.859Z","etag":null,"topics":["ai","ai-memory","all-in-one-databse","columnstore-database","embeddings","graph-database","hnsw","local-first","machine-learning","rag","rust","search-engine","vector-database"],"latest_commit_sha":null,"homepage":"https://velesdb.com","language":"Rust","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"other","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/cyberlife-coder.png","metadata":{"files":{"readme":"README.md","changelog":"CHANGELOG.md","contributing":"CONTRIBUTING.md","funding":null,"license":"LICENSE","code_of_conduct":"CODE_OF_CONDUCT.md","threat_model":null,"audit":null,"citation":null,"codeowners":".github/CODEOWNERS","security":"SECURITY.md","support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null,"zenodo":null,"notice":null,"maintainers":null,"copyright":null,"agents":null,"dco":null,"cla":null}},"created_at":"2025-12-17T18:01:02.000Z","updated_at":"2026-04-24T17:35:16.000Z","dependencies_parsed_at":null,"dependency_job_id":null,"html_url":"https://github.com/cyberlife-coder/VelesDB"
,"commit_stats":null,"previous_names":["cyberlife-coder/velesdb"],"tags_count":70,"template":false,"template_full_name":null,"purl":"pkg:github/cyberlife-coder/VelesDB","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/cyberlife-coder%2FVelesDB","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/cyberlife-coder%2FVelesDB/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/cyberlife-coder%2FVelesDB/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/cyberlife-coder%2FVelesDB/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/cyberlife-coder","download_url":"https://codeload.github.com/cyberlife-coder/VelesDB/tar.gz/refs/heads/develop","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/cyberlife-coder%2FVelesDB/sbom","scorecard":null,"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":286080680,"owners_count":32469346,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2026-04-30T13:12:12.517Z","status":"ssl_error","status_checked_at":"2026-04-30T13:12:06.837Z","response_time":57,"last_error":"SSL_read: unexpected eof while 
reading","robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":false,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["ai","ai-memory","all-in-one-databse","columnstore-database","embeddings","graph-database","hnsw","local-first","machine-learning","rag","rust","search-engine","vector-database"],"created_at":"2025-12-29T17:06:47.755Z","updated_at":"2026-04-30T20:01:11.568Z","avatar_url":"https://github.com/cyberlife-coder.png","language":"Rust","funding_links":[],"categories":[],"sub_categories":[],"readme":"\u003cp align=\"center\"\u003e\n  \u003cimg src=\"velesdb_icon_pack/favicon/android-chrome-512x512.png\" alt=\"VelesDB Logo\" width=\"200\"/\u003e\n\u003c/p\u003e\n\u003ch1 align=\"center\"\u003e\n  \u003cimg src=\"velesdb_icon_pack/favicon/favicon-32x32.png\" alt=\"VelesDB\" width=\"32\" height=\"32\" style=\"vertical-align: middle;\"/\u003e\n\u003c/h1\u003e\n\u003ch3 align=\"center\"\u003e\n  Your AI agents forget everything. VelesDB fixes that.\n\u003c/h3\u003e\n\u003cp align=\"center\"\u003e\n  \u003cstrong\u003eOne 6 MB binary. Three engines. One query language. 
Zero cloud dependency.\u003c/strong\u003e\u003cbr/\u003e\n  \u003cem\u003eVector + Graph + ColumnStore — unified under \u003ca href=\"docs/VELESQL_SPEC.md\"\u003eVelesQL\u003c/a\u003e\u003c/em\u003e\n\u003c/p\u003e\n\u003cp align=\"center\"\u003e\n  \u003ca href=\"https://github.com/cyberlife-coder/VelesDB/actions/workflows/ci.yml\"\u003e\u003cimg src=\"https://github.com/cyberlife-coder/VelesDB/actions/workflows/ci.yml/badge.svg\" alt=\"CI\"\u003e\u003c/a\u003e\n  \u003ca href=\"https://app.codacy.com/gh/cyberlife-coder/VelesDB/dashboard?utm_source=gh\u0026utm_medium=referral\u0026utm_content=\u0026utm_campaign=Badge_grade\"\u003e\u003cimg src=\"https://app.codacy.com/project/badge/Grade/58c73832dd294ba38144856ae69e9cf2\" alt=\"Codacy Badge\"\u003e\u003c/a\u003e\n  \u003ca href=\"https://crates.io/crates/velesdb-core\"\u003e\u003cimg src=\"https://img.shields.io/crates/v/velesdb-core.svg\" alt=\"Crates.io\"\u003e\u003c/a\u003e\n  \u003ca href=\"https://crates.io/crates/velesdb-core\"\u003e\u003cimg src=\"https://img.shields.io/crates/d/velesdb-core.svg\" alt=\"Crates.io Downloads\"\u003e\u003c/a\u003e\n  \u003ca href=\"https://pypi.org/project/velesdb/\"\u003e\u003cimg src=\"https://img.shields.io/pypi/v/velesdb.svg\" alt=\"PyPI\"\u003e\u003c/a\u003e\n  \u003ca href=\"https://www.npmjs.com/package/@wiscale/velesdb-sdk\"\u003e\u003cimg src=\"https://img.shields.io/npm/v/@wiscale/velesdb-sdk.svg\" alt=\"npm\"\u003e\u003c/a\u003e\n  \u003ca href=\"https://app.codacy.com/gh/cyberlife-coder/VelesDB/dashboard\"\u003e\u003cimg src=\"https://app.codacy.com/project/badge/Coverage/58c73832dd294ba38144856ae69e9cf2\" alt=\"Coverage\"\u003e\u003c/a\u003e\n  \u003cimg src=\"https://img.shields.io/badge/tests-7634_(Rust%2BTS%2BPy)-brightgreen\" alt=\"Tests\"\u003e\n  \u003ca href=\"https://github.com/cyberlife-coder/VelesDB/blob/main/LICENSE\"\u003e\u003cimg src=\"https://img.shields.io/badge/license-VelesDB_Core_1.0-blue\" alt=\"License\"\u003e\u003c/a\u003e\n  \u003ca 
href=\"https://github.com/cyberlife-coder/VelesDB\"\u003e\u003cimg src=\"https://img.shields.io/github/stars/cyberlife-coder/VelesDB?style=flat-square\" alt=\"Stars\"\u003e\u003c/a\u003e\n  \u003ca href=\"https://img.shields.io/badge/contributors-welcome-brightgreen\"\u003e\u003cimg src=\"https://img.shields.io/badge/contributors-welcome-brightgreen\" alt=\"Contributors Welcome\"\u003e\u003c/a\u003e\n\u003c/p\u003e\n\u003cp align=\"center\"\u003e\n  \u003ca href=\"https://github.com/cyberlife-coder/VelesDB/releases/latest\"\u003eDownload latest release\u003c/a\u003e \u0026bull;\n  \u003ca href=\"#getting-started-in-60-seconds\"\u003eQuick Start\u003c/a\u003e \u0026bull;\n  \u003ca href=\"ARCHITECTURE.md\"\u003eArchitecture\u003c/a\u003e \u0026bull;\n  \u003ca href=\"ROADMAP.md\"\u003eRoadmap\u003c/a\u003e \u0026bull;\n  \u003ca href=\"QUALITY_BAR.md\"\u003eQuality Bar\u003c/a\u003e \u0026bull;\n  \u003ca href=\"https://velesdb.com/en/\"\u003eDocumentation\u003c/a\u003e \u0026bull;\n  \u003ca href=\"https://deepwiki.com/cyberlife-coder/VelesDB\"\u003eDeepWiki\u003c/a\u003e\n\u003c/p\u003e\n\n---\n\n\u003e **Every AI agent today stitches together 3 databases for memory — vectors for \"what feels similar\", a graph for \"what is connected\", and SQL for \"what I know for sure\". That's 3 deployments, 3 configs, 3 query languages, and a pile of glue code.**\n\u003e\n\u003e **VelesDB replaces all of that with a single Rust binary that fits on a floppy disk.**\n\n---\n\n## The Story Behind VelesDB\n\nVelesDB was born in France out of a simple observation: **EU data sovereignty is an architectural problem, not a legal one.**\n\nThe US Cloud Act, FISA 702, and PATRIOT Act give US authorities multiple legal paths to reach data held by any US company — regardless of where the servers are. Hosting on AWS `eu-west-1` is a latency decision, not a sovereignty decision. 
Two successive EU-US data transfer frameworks have already been invalidated (Safe Harbor in Schrems I, Privacy Shield in Schrems II), and a challenge to the current Data Privacy Framework is pending.\n\nFor European developers building AI agents that handle health data, legal documents, or financial records, the typical 2026 stack sends embeddings to Pinecone (US), graphs to Neo4j Aura (US), and metadata to PostgreSQL on AWS (US provider). Every one of these is reachable by a FISA warrant.\n\nVelesDB removes the US provider from the chain entirely. One Rust binary, local-first by design. No API key, no cloud account, no data processor. Your data stays in a directory you control — on your laptop, your server, your jurisdiction.\n\n\u003e [Read the full story: \"I built a database in France because the Cloud Act makes EU data sovereignty impossible\"](https://dev.to/wiscale-fr/i-built-a-database-in-france-because-the-cloud-act-makes-eu-data-sovereignty-impossible-5325)\n\n---\n\n## Why VelesDB?\n\n| Today (3 systems to maintain) | With VelesDB (1 binary) |\n|-------------------------------|------------------------|\n| pgvector for embeddings | **Vector Engine** — 450us p50 end-to-end (10K/384D, WAL ON, recall\u003e=96%) |\n| Neo4j for knowledge graphs | **Graph Engine** — MATCH clause, BFS/DFS |\n| PostgreSQL/DuckDB for metadata | **ColumnStore** — 130x faster than JSON at 100K rows |\n| Custom glue code + 3 query languages | **VelesQL** — one language for everything |\n| 3 deployments, 3 configs, 3 backups | **6 MB binary** — works offline, air-gapped |\n\n---\n## What is VelesDB?\n\nVelesDB is a **local-first database for AI agents** that fuses three engines into a single 6 MB binary:\n\n| Engine | What it does | Performance |\n|--------|-------------|-------------|\n| **Vector** | Semantic similarity search (HNSW + AVX2/NEON SIMD) | **450us** p50 end-to-end (384D, WAL ON, recall\u003e=96%) [1] |\n| **Graph** | Knowledge relationships (BFS/DFS, edge properties) | Native **MATCH** clause |\n| **ColumnStore** | Structured metadata filtering (typed 
columns) | **130x** faster than JSON scanning [2] |\n\n\u003e [1] Reproduce: `python benchmarks/velesdb_benchmark.py --recall` (Python SDK path, 10K/384D, WAL fsync on, i9-14900KF reference machine). See [docs/BENCHMARKS.md](docs/BENCHMARKS.md) and [CHANGELOG v1.13.0](CHANGELOG.md).\n\u003e [2] Reproduce: `cargo bench -p velesdb-core --bench filter_benchmark`. See [docs/BENCHMARKS.md § 6](docs/BENCHMARKS.md) — at 100K rows: ColumnStore 29.5 us vs JSON scan 3.84 ms (integer equality filter).\n\nAll three are queried through **VelesQL** — a single SQL-like language with vector, graph, and columnar extensions:\n\n```sql\nMATCH (doc:Document)-[:AUTHORED_BY]-\u003e(author:Person)\nWHERE similarity(doc.embedding, $question) \u003e 0.8\n  AND author.department = 'Engineering'\nRETURN author.name, doc.title\nORDER BY similarity() DESC LIMIT 5\n```\n\n**Built-in Agent Memory SDK** provides semantic, episodic, and procedural memory for AI agents — no external services needed.\n\n\u003e **One binary. No cloud. No glue code. Runs on server, browser, mobile, and desktop.**\n\n---\n\n## Agent Memory SDK\n\nBuilt-in memory for AI agents — semantic, episodic, and procedural. 
No external services needed.\n\n```python\nfrom velesdb import Database, AgentMemory\n\ndb = Database(\"./agent_data\")\nmemory = AgentMemory(db, dimension=384)\n\nmemory.semantic.store(1, \"Paris is the capital of France\", embedding)\nmemory.episodic.record(1, \"User asked about geography\", timestamp, embedding)\nmemory.procedural.learn(1, \"answer_geography\", steps, embedding, confidence=0.8)\n```\n\n| Feature | API |\n|---------|-----|\n| TTL / Auto-expiration | `store_with_ttl()`, `auto_expire()` |\n| Snapshots / Rollback | `snapshot()`, `load_latest_snapshot()` |\n| Reinforcement | `reinforce(success=True)` — 4 strategies |\n\n\u003e **Full guide:** [docs/guides/AGENT_MEMORY.md](docs/guides/AGENT_MEMORY.md) | [Source code](crates/velesdb-core/src/agent/)\n\n---\n\n## Quick Comparison\n\n| | **VelesDB** | Chroma | Qdrant | pgvector |\n|---|---|---|---|---|\n| **Architecture** | Unified vector + graph + columnar | Vector only | Vector + payload | Vector extension for PostgreSQL |\n| **Metadata filtering** | **ColumnStore (130x vs JSON)** | JSON scan | JSON payload | SQL (PostgreSQL) |\n| **Deployment** | Embedded / Server / WASM / Mobile | Server (Python) | Server (Rust) | Requires PostgreSQL |\n| **Binary size** | 6 MB | ~500 MB (with deps) | ~50 MB | N/A (PG extension) |\n| **Search latency** | **450us** p50 (10K/384D, WAL ON, recall\u003e=96%) | ~1-5ms | ~1-5ms (in-memory) | ~5-20ms |\n| **Graph support** | Native (MATCH clause) | No | No | No |\n| **Query language** | VelesQL (SQL + NEAR + MATCH) | Python API | JSON API / gRPC | SQL + operators |\n| **Browser (WASM)** | Yes | No | No | No |\n| **Mobile (iOS/Android)** | Yes | No | No | No |\n| **Offline / Local-first** | Yes | Partial | No | No |\n\n\u003e *Competitor latencies are typical ranges from public benchmarks and vendor documentation. Direct comparison is approximate — architectures differ (embedded vs client-server, durable vs in-memory, recall levels). 
Run your own benchmarks for accurate comparison.*\n\n\u003e **VelesDB's sweet spot:** When you need vector + graph + structured filtering in a single engine, local-first deployment, or a lightweight binary that runs anywhere.\n\u003e\n\u003e **Not the best fit (yet):** If you need a managed cloud service with a multi-node distributed cluster.\n\n---\n\n## Known Limitations\n\nVelesDB is honest about its boundaries. The following are current scope limits of the open-source Community Edition — each is either a deliberate design trade-off or a feature tracked for a separate Enterprise edition. We list them here so you can make an informed technical choice.\n\n| # | Limitation | Scope | Tracked |\n|---|------------|-------|---------|\n| 1 | **Single writer per collection** — WAL is serialized; concurrent writers contend on the same fsync lock. | Design trade-off (local-first, crash-safe by default). Read throughput is unaffected. | Concurrent WAL writer is planned for the Enterprise edition (separate product, not yet public). See [docs/CONCURRENCY_MODEL.md](docs/CONCURRENCY_MODEL.md). |\n| 2 | **No distributed replication** — VelesDB is single-node. No Raft, no sharding, no automatic failover in Core. | Deliberate: the sweet spot is local-first / embedded. | Raft-based replication is tracked internally for the Enterprise edition. Contact us for timeline. |\n| 3 | **No advanced RBAC / multi-tenant isolation** — The `DatabaseObserver` hook is shipped (Core) and can be wired to a homegrown RBAC layer, but a production-grade RBAC/audit implementation is not in Core. | Core ships the hook, not the policy engine. | Enterprise feature. |\n| 4 | **WASM MATCH limited to 2 hops** — The browser build of `velesdb-wasm` supports 1- and 2-hop graph `MATCH` patterns today. 3+ hop `MATCH` works fully in native builds (server / Python / mobile / CLI) via `velesdb-core`. | Scope of Sprint 4 item S4-13. | Tracked, not a correctness issue — native path already supports full traversal. 
|\n| 5 | **SIFT1M benchmark fingerprints — pinning workflow ships, sidecar not yet committed** — The loader reads its pinned SHA-256 hashes from `benches/datasets/sift1m_fingerprints.json` when present (strict mode, mismatch fails the bench). Until a maintainer runs `cargo bench -p velesdb-core --features bench-sift1m --bench capture_sift1m_fingerprints` on the reference machine and commits the generated sidecar, the loader falls back to TOFU mode (prints the observed SHA-256 and proceeds). | Not a correctness issue — `check_shape` still validates row count and dimension. The one-command bootstrap closes the integrity gap in a single run. | One-command bootstrap shipped; sidecar commit pending first reference-machine run. |\n| 6 | **No head-to-head Docker Compose benchmark vs Qdrant / Chroma / FAISS yet** — The SIFT1M benchmark (new in v1.13.0) is the standardized cross-implementation comparable number and matches the dataset used by every major ANN paper. A one-shot Docker Compose harness that runs all four systems on the same machine is deferred until the benchmark infrastructure stabilizes. | Transparency: side-by-side numbers require infrastructure we have not frozen yet. | Tracked; SIFT1M already gives comparable recall@10 numbers against the literature. |\n\nNone of the above is a correctness gap — the Community Edition is production-ready for single-node, local-first deployments. 
The items above are feature-scope boundaries, not bugs.\n\nFor **internal technical limitations** (query-planner approximations, plan cache semantics around `ANALYZE`, CBO integration status), see [`docs/reference/KNOWN_LIMITATIONS.md`](docs/reference/KNOWN_LIMITATIONS.md) — each entry is tracked by a GitHub issue or documented as an explicit approximation with regression tests.\n\n---\n\n## Getting Started in 60 Seconds\n\n### Install\n\n**Cargo (Rust):**\n```bash\ncargo install velesdb-server velesdb-cli\n```\n\n**Python:**\n```bash\npip install velesdb\n```\n\n**Docker:**\n```bash\n# Build the image locally\ngit clone https://github.com/cyberlife-coder/VelesDB.git \u0026\u0026 cd VelesDB\ndocker build -t velesdb .\n\n# Run with persistent data (named volume)\ndocker run -d -p 8080:8080 -v velesdb_data:/data --name velesdb velesdb\n\n# Verify it's running\ncurl http://localhost:8080/health\n```\nData is stored in the `/data` directory inside the container. The named volume `velesdb_data` persists data across container restarts. 
The built-in health check polls `GET /health` every 30 seconds.\n\n\u003cdetails\u003e\n\u003csummary\u003eMore install options (Docker Compose, WASM, install scripts)\u003c/summary\u003e\n\n**Docker Compose:**\n```bash\ngit clone https://github.com/cyberlife-coder/VelesDB.git \u0026\u0026 cd VelesDB\ndocker-compose up -d\n```\n\n| Environment variable | Default | Description |\n|---|---|---|\n| `VELESDB_DATA_DIR` | `/data` | Data storage directory |\n| `VELESDB_HOST` | `0.0.0.0` | Bind address |\n| `VELESDB_PORT` | `8080` | HTTP port |\n| `RUST_LOG` | `info` | Log level (`debug`, `info`, `warn`, `error`) |\n\n**WASM (Browser):**\n```bash\nnpm install @wiscale/velesdb-wasm\n```\n\n**Install script (Linux/macOS):**\n```bash\ncurl -fsSL https://raw.githubusercontent.com/cyberlife-coder/VelesDB/main/scripts/install.sh | bash\n```\n\n**Install script (Windows PowerShell):**\n```powershell\nirm https://raw.githubusercontent.com/cyberlife-coder/VelesDB/main/scripts/install.ps1 | iex\n```\n\n\u003c/details\u003e\n\n### First search in 30 seconds\n\n```bash\nvelesdb-server --data-dir ./my_data \u0026\n\n# Create collection + insert + search\ncurl -X POST http://localhost:8080/collections \\\n  -d '{\"name\": \"docs\", \"dimension\": 4, \"metric\": \"cosine\"}' -H \"Content-Type: application/json\"\n\ncurl -X POST http://localhost:8080/collections/docs/points \\\n  -d '{\"points\": [\n    {\"id\": 1, \"vector\": [1.0, 0.0, 0.0, 0.0], \"payload\": {\"title\": \"AI Intro\", \"category\": \"tech\"}},\n    {\"id\": 2, \"vector\": [0.0, 1.0, 0.0, 0.0], \"payload\": {\"title\": \"ML Basics\", \"category\": \"tech\"}},\n    {\"id\": 3, \"vector\": [0.0, 0.0, 1.0, 0.0], \"payload\": {\"title\": \"History of Computing\", \"category\": \"history\"}}\n  ]}' -H \"Content-Type: application/json\"\n\ncurl -X POST http://localhost:8080/collections/docs/search \\\n  -d '{\"vector\": [0.9, 0.1, 0.0, 0.0], \"top_k\": 2}' -H \"Content-Type: application/json\"\n# 
[{\"id\":1,\"score\":0.995,\"payload\":{\"title\":\"AI Intro\",\"category\":\"tech\"}}, ...]\n```\n\n\u003e Full installation guide: [docs/guides/INSTALLATION.md](docs/guides/INSTALLATION.md)\n\n---\n\n## Vector Engine\n\nNative HNSW index with SIMD-accelerated distance kernels. Sub-millisecond search on modern x86_64 hardware.\n\n| Metric | Value |\n|--------|-------|\n| Search p50 (10K, 384D, WAL ON) | **450 us** |\n| SIMD Dot Product (768D, AVX2) | **21.7 ns** |\n| Recall@10 (Balanced) | **98.8%** |\n| Quantization | SQ8 (4x), PQ (32x), Binary (32x), RaBitQ (32x) |\n\n5 search quality modes (Fast → Perfect), adaptive two-phase ef, AutoTune.\n\n\u003cdetails\u003e\n\u003csummary\u003eDetailed benchmarks and search modes\u003c/summary\u003e\n\n\u003e **Headline number** (canonical, full path): **450 us p50** end-to-end search (10K/384D, WAL ON, recall\u003e=96%) — see Performance summary above.\n\u003e\n\u003e The table below reports **index-only micro-benchmarks** (no WAL, no metadata fetch, hot cache) for components that can be measured in isolation. 
They are not directly comparable to end-to-end latency.\n\n| Component micro-benchmark | Result | How to reproduce |\n|-----------|--------|------------------|\n| HNSW Search index-only (5K/768D, k=10) | **55 us** | `cargo bench -p velesdb-core --bench hnsw_benchmark -- hnsw_search_latency` |\n| SIMD Dot Product kernel (768D, AVX2) | **21.7 ns** | `cargo bench -p velesdb-core --bench simd_benchmark` |\n| Recall@10 (Accurate mode) | **100%** | `cargo bench -p velesdb-core --bench recall_benchmark` |\n| BM25 Sparse Search index-only (10K docs, top-10) | **57.6 us** (16x from 956 us in v1.12) | `cargo bench -p velesdb-core --bench sparse_benchmark -- top10_10k_corpus` |\n\n| Mode | ef_search | Recall@10 | Use case |\n|------|-----------|-----------|----------|\n| Fast | 64 | 92.2% | Real-time suggestions, typeahead |\n| Balanced (default) | 128 | 98.8% | Production search, RAG pipelines |\n| Accurate | 512 | 100% | Evaluation, ground truth comparison |\n\n*Measurements sourced from `benchmarks/results/pr363_365_comparison.md` (i9-14900KF, 64 GB DDR5, Windows 11, `--release`, `target-cpu=native`). 
Windows micro-benchmarks carry 5-10% noise — expect a range, not a single point.*\n\n\u003c/details\u003e\n\n### Distance Metrics\n\n5 metrics with SIMD acceleration (AVX-512, AVX2, NEON, WASM SIMD128):\n\n| Metric | What it measures | Use case | SIMD perf (768D) |\n|--------|-----------------|----------|------------------|\n| **Cosine** | Angle between vectors (direction similarity) | Text embeddings (BERT, OpenAI, Cohere), normalized vectors | 33 ns |\n| **Euclidean** | Straight-line distance (L2 norm) | Image features, spatial data, when magnitude matters | 20 ns |\n| **Dot Product** | Inner product (projection) | Pre-normalized vectors, Maximum Inner Product Search (MIPS) | 22 ns |\n| **Hamming** | Bit differences in binary vectors | Binary embeddings, locality-sensitive hashing (LSH), fingerprints | 36 ns |\n| **Jaccard** | Set overlap (intersection / union) | Sparse vectors, tag similarity, set membership | 35 ns |\n\n```sql\n-- Choose metric at collection creation\nCREATE COLLECTION docs (dimension = 768, metric = 'cosine');\nCREATE COLLECTION images (dimension = 512, metric = 'euclidean');\nCREATE COLLECTION fingerprints (dimension = 256, metric = 'hamming');\n```\n\n```sql\nSELECT * FROM docs WHERE vector NEAR $v AND category = 'tech' LIMIT 5\n```\n\n- **SIFT1M standardized ANN benchmark** — measured on the de-facto-standard INRIA TEXMEX dataset (1M × 128D vectors, L2 metric). 
See [docs/BENCHMARKS.md § 11](docs/BENCHMARKS.md#11-sift1m--standard-ann-benchmark) for methodology, dataset provenance, and how to reproduce.\n\n\u003e **Full benchmarks and methodology:** [docs/BENCHMARKS.md](docs/BENCHMARKS.md) | [velesdb-benchmarks repo](https://github.com/cyberlife-coder/velesdb-benchmarks) | **Quantization guide:** [docs/guides/QUANTIZATION.md](docs/guides/QUANTIZATION.md)\n\n---\n\n## Graph Engine\n\nProperty graph with BFS/DFS traversal, edge labels, and Cypher-inspired MATCH queries — integrated with vector search.\n\n```sql\n-- Vector + Graph fusion in ONE statement\nMATCH (doc:Document)-[:AUTHORED_BY]-\u003e(author:Person)\nWHERE similarity(doc.embedding, $question) \u003e 0.8\nRETURN author.name, doc.title\nORDER BY similarity() DESC LIMIT 5\n```\n\nCross-collection MATCH with `@collection` annotation:\n\n```sql\nMATCH (p:Product@products)-[:STORED_IN]-\u003e(inv:Inventory@inventory)\nRETURN p.name, inv.price, inv.stock\nLIMIT 20\n```\n\n\u003e **Graph patterns guide:** [docs/guides/GRAPH_PATTERNS.md](docs/guides/GRAPH_PATTERNS.md)\n\n---\n\n## ColumnStore Engine\n\nTyped columnar storage — the same approach DuckDB and ClickHouse use. **130x faster** than JSON scanning at 100K rows.\n\n```\nJSON scan: 3.84 ms @ 100K    →    ColumnStore: 29.5 us @ 100K (130x faster)\n```\n\n```sql\nSELECT * FROM products\nWHERE vector NEAR $query AND in_stock = true AND price \u003c 50.0\nLIMIT 10\n```\n\nThe query planner automatically chooses between pre-filtering and post-filtering for each query.\n\n---\n\n## Use Cases\n\n### AI Agent Memory\n\nYour agent needs to remember conversations, learn from mistakes, and recall relevant knowledge. 
VelesDB provides all three memory types in a single embedded database — no Redis, no Pinecone, no Neo4j.\n\n```python\nmemory = AgentMemory(db, dimension=384)\nmemory.semantic.store(1, \"User prefers dark mode\", embedding)\nmemory.episodic.record(2, \"User asked about billing\", timestamp, embedding)\nmemory.procedural.learn(3, \"handle_refund\", steps, embedding, confidence=0.9)\n```\n\n### RAG with Metadata Filtering\n\nVector search alone returns noise. VelesDB's ColumnStore filters eliminate irrelevant results 130x faster than JSON scanning.\n\n```sql\nSELECT * FROM docs\nWHERE vector NEAR $query AND department = 'engineering' AND updated_at \u003e NOW() - INTERVAL '30 days'\nLIMIT 10\n```\n\n### E-commerce: Vector + Graph + Filters in One Query\n\nFind products similar to a query, filter by price/stock, and traverse co-purchase relationships — all in a single VelesQL statement.\n\n```sql\nMATCH (product)-[:BOUGHT_TOGETHER]-\u003e(related)\nWHERE similarity(product.embedding, $query) \u003e 0.7\n  AND related.price \u003c 200 AND related.in_stock = true\nRETURN related.name, related.price\nORDER BY similarity() DESC LIMIT 20\n```\n\n### Desktop \u0026 Mobile AI\n\nShip AI features without a server. 
VelesDB embeds directly into Tauri, iOS, and Android apps.\n\n| Platform | Integration | Binary size |\n|----------|-------------|-------------|\n| Desktop (Tauri) | `tauri-plugin-velesdb` | 6 MB |\n| iOS (Swift) | UniFFI bindings | ~4 MB |\n| Android (Kotlin) | UniFFI bindings | ~4 MB |\n| Browser | WASM module | ~50 KB gzipped |\n\n---\n\n## Roadmap\n\n| Milestone | Status |\n|-----------|--------|\n| v1.0 — Core engine (vector + graph + VelesQL) | ✅ Shipped |\n| v1.5 — Python SDK, WASM, Mobile bindings | ✅ Shipped |\n| v1.10 — Agent Memory SDK, hybrid search, quantization | ✅ Shipped |\n| v1.11 — Cross-collection MATCH, bitmap pre-filter, CSR graph | ✅ Shipped |\n| v1.12 — Cross-collection MATCH (graph/BM25/HNSW hybrids), Sprint 4 Phase B (TS SDK stability) | ✅ Shipped |\n| v1.13 — Pre-seed remediation: BM25 O(1) cold-start, sparse search 16× speedup, HNSW prefetch, EXPLAIN/CBO routing, VelesQL window functions, SIFT1M standardized harness | ✅ Shipped |\n\n\u003e VelesDB Core is open-source. Enterprise features (distributed replication, managed cloud, RBAC) are available separately via [VelesDB Premium](https://velesdb.com).\n\n\u003e We ship weekly. 
[Full changelog](CHANGELOG.md) | [Contributing guide](CONTRIBUTING.md)\n\n---\n\n## Full Ecosystem\n\n| Domain | Component | Install |\n|--------|-----------|---------|\n| **Core** | [velesdb-core](crates/velesdb-core) — Vector + Graph + ColumnStore + VelesQL | `cargo add velesdb-core` |\n| **Server** | [velesdb-server](crates/velesdb-server) — REST API (46 endpoints, OpenAPI) | `cargo install velesdb-server` |\n| **CLI** | [velesdb-cli](crates/velesdb-cli) — Interactive VelesQL REPL | `cargo install velesdb-cli` |\n| **Python** | [velesdb-python](crates/velesdb-python) — PyO3 bindings + NumPy | `pip install velesdb` |\n| **TypeScript** | [typescript-sdk](sdks/typescript) — Node.js \u0026 Browser SDK | `npm install @wiscale/velesdb-sdk` |\n| **WASM** | [velesdb-wasm](crates/velesdb-wasm) — Browser-side vector search | `npm install @wiscale/velesdb-wasm` |\n| **Mobile** | [velesdb-mobile](crates/velesdb-mobile) — iOS (Swift) \u0026 Android (Kotlin) | [Build instructions](docs/guides/INSTALLATION.md#-mobile-iosandroid) |\n| **Desktop** | [tauri-plugin](crates/tauri-plugin-velesdb) — Tauri v2 AI-powered apps | `cargo add tauri-plugin-velesdb` |\n| **LangChain** | [langchain-velesdb](integrations/langchain) — Official VectorStore | [From source](integrations/langchain/README.md) |\n| **LlamaIndex** | [llamaindex-velesdb](integrations/llamaindex) — Document indexing | [From source](integrations/llamaindex/README.md) |\n| **Migration** | [velesdb-migrate](crates/velesdb-migrate) — From Qdrant, Pinecone, Supabase | `cargo install velesdb-migrate` |\n\n---\n\n## How VelesDB Works\n\n```\nINSERT                      INDEX                       SEARCH\n┌──────────┐  upsert   ┌──────────────┐  build   ┌──────────────┐\n│ Your App │──────────\u003e │ WAL (append) │────────\u003e │  HNSW Graph  │\n│          │           │ + mmap store │         │  (in-memory) │\n└──────────┘           └──────┬───────┘         └──────┬───────┘\n                              │                     
   │\n                       ┌──────▼───────┐                │ search\n                       │  ColumnStore  │  filter   ┌────▼─────────┐\n                       │ (typed cols)  │────────\u003e │ SIMD Distance│\n                       └──────────────┘          │(AVX-512/NEON)│\n                        RESULT                    └──────┬───────┘\n┌──────────┐  top-k    ┌──────────────┐  rank           │\n│ Your App │\u003c──────────│   Payload    │\u003c────────────────┘\n│          │           │  Hydration   │\n└──────────┘           └──────────────┘\n```\n\n**Key design choices:**\n- **Local-first**: In-process or single binary — no network hops, no cloud dependency\n- **Memory-mapped storage**: OS manages paging between RAM and disk\n- **WAL durability**: Every write is journaled. Crash-safe by default (`fsync` mode). Deferred sync during bulk insert for throughput\n- **ColumnStore**: Typed columns with string interning, RoaringBitmap tombstones, PostgreSQL-inspired auto-vacuum\n\n\u003cdetails\u003e\n\u003csummary\u003eDocker deployment\u003c/summary\u003e\n\n```bash\n# Build and run locally\ndocker build -t velesdb .\ndocker run -d -p 8080:8080 -v velesdb_data:/data --name velesdb velesdb\ncurl http://localhost:8080/health\n\n# Or with docker-compose (builds + auto-restart)\ndocker-compose up -d\n```\n\n| Variable | Default | Description |\n|---|---|---|\n| `VELESDB_DATA_DIR` | `/data` | Data storage directory |\n| `VELESDB_HOST` | `0.0.0.0` | Bind address |\n| `VELESDB_PORT` | `8080` | HTTP port |\n| `RUST_LOG` | `info` | Log level |\n\nThe container runs as a non-root `velesdb` user. Data persists via the named volume `velesdb_data`. 
A built-in health check (`GET /health`) is configured with a 30-second interval.\n\n\u003c/details\u003e\n\n\u003cdetails\u003e\n\u003csummary\u003eAPI Reference (46 REST endpoints)\u003c/summary\u003e\n\n| Category | Key Endpoints |\n|----------|--------------|\n| **Collections** | `POST /collections`, `GET /collections`, `GET/DELETE /collections/{name}` |\n| **Points** | `/collections/{name}/points`, `/collections/{name}/points/scroll`, `/collections/{name}/stream/insert` |\n| **Search** | `/collections/{name}/search`, `/collections/{name}/search/batch`, `/collections/{name}/search/hybrid`, `/collections/{name}/search/text`, `/collections/{name}/search/multi`, `/collections/{name}/search/ids`, `/collections/{name}/match` |\n| **Graph** | `/collections/{name}/graph/edges`, `/collections/{name}/graph/edges/{id}`, `/collections/{name}/graph/edges/count`, `/collections/{name}/graph/traverse`, `/collections/{name}/graph/traverse/stream`, `/collections/{name}/graph/traverse/parallel`, `/collections/{name}/graph/nodes`, `/collections/{name}/graph/nodes/{id}/degree`, `/collections/{name}/graph/nodes/{id}/edges`, `/collections/{name}/graph/nodes/{id}/payload`, `/collections/{name}/graph/search` |\n| **Indexes** | `GET/POST /collections/{name}/indexes`, `DELETE /collections/{name}/indexes/{label}/{property}`, `/collections/{name}/index/rebuild` |\n| **VelesQL** | `/query`, `/aggregate`, `/query/explain` |\n| **Admin** | `/health`, `/ready`, `/metrics`, `/guardrails`, `/collections/{name}/stats`, `/collections/{name}/config`, `/collections/{name}/flush`, `/collections/{name}/analyze`, `/collections/{name}/empty`, `/collections/{name}/sanity` |\n\n\u003e **Full API reference:** [docs/reference/api-reference.md](docs/reference/api-reference.md) | **OpenAPI spec:** [docs/openapi.yaml](docs/openapi.yaml)\n\n\u003c/details\u003e\n\n\u003cdetails\u003e\n\u003csummary\u003eSecurity\u003c/summary\u003e\n\n- **API Key Authentication** — Bearer token auth via `VELESDB_API_KEYS` env 
var\n- **TLS (HTTPS)** — Built-in via rustls (`VELESDB_TLS_CERT` / `VELESDB_TLS_KEY`)\n- **Graceful Shutdown** — SIGTERM triggers connection drain + WAL flush. Zero data loss\n- **Health Endpoints** — `GET /health` and `GET /ready` always public\n\n\u003e [docs/guides/SERVER_SECURITY.md](docs/guides/SERVER_SECURITY.md)\n\n\u003c/details\u003e\n\n---\n\n## Demos \u0026 Examples\n\n```bash\ncd examples/ecommerce_recommendation \u0026\u0026 cargo run --release\n```\n\n| Demo | Description | Tech |\n|------|-------------|------|\n| [ecommerce_recommendation](examples/ecommerce_recommendation/) | Vector + Graph + ColumnStore (5K products) | Rust |\n| [rag-pdf-demo](demos/rag-pdf-demo/) | PDF document Q\u0026A with RAG | Python, FastAPI |\n| [tauri-rag-app](demos/tauri-rag-app/) | Desktop RAG application | Tauri v2, React |\n| [wasm-browser-demo](examples/wasm-browser-demo/) | In-browser vector search | WASM, vanilla JS |\n| [mini_recommender](examples/mini_recommender/) | Product recommendations | Rust |\n\n---\n\n\u003cdetails\u003e\n\u003csummary\u003eResearch Foundations\u003c/summary\u003e\n\nVelesDB's performance is built on peer-reviewed research — every technique is implemented and production-active.\n\n| Technique | Paper |\n|-----------|-------|\n| HNSW | [Malkov \u0026 Yashunin, 2016](https://arxiv.org/abs/1603.09320) |\n| VAMANA / DiskANN | [Subramanya et al., 2019](https://arxiv.org/abs/1907.05024) |\n| RaBitQ | [Gao \u0026 Long, 2024](https://arxiv.org/abs/2405.12497) |\n| Dual-Precision (VSAG) | [Xu et al., 2025](https://arxiv.org/abs/2503.17911) |\n| Software Pipelining | [Jiang et al., 2025](https://arxiv.org/abs/2505.07621) |\n| PDX Layout | [Kuffo et al., 2025](https://arxiv.org/abs/2503.04422) |\n\n\u003c/details\u003e\n\n## Contributing\n\n```bash\ngit clone https://github.com/cyberlife-coder/VelesDB.git \u0026\u0026 cd VelesDB\ncargo test --workspace --features persistence,gpu,update-check --exclude velesdb-python -- --test-threads=1\n```\n\nLooking 
for a place to start? Check out issues labeled [`good first issue`](https://github.com/cyberlife-coder/VelesDB/labels/good%20first%20issue).\n\n---\n\n## Powered by VelesDB\n\n| Project | Use case |\n|---------|----------|\n| [WPLink](https://wplink.ai) | AI-powered semantic analysis to find and apply internal linking opportunities for WordPress sites |\n| *Your project here* | [Get listed →](mailto:contact@wiscale.fr?subject=VelesDB%20Showcase) |\n\n[![Built with VelesDB](https://img.shields.io/badge/Built_with-VelesDB-blue?style=flat-square)](https://github.com/cyberlife-coder/VelesDB)\n\nUsing VelesDB in production? Open a [GitHub Discussion](https://github.com/cyberlife-coder/VelesDB/discussions) or email [contact@wiscale.fr](mailto:contact@wiscale.fr) to get featured. Your feedback shapes the roadmap.\n\n---\n\n## License\n\nVelesDB Core License 1.0 (based on ELv2). Free for production use, including commercial applications. Two restrictions: no offering VelesDB as a hosted/managed database service, and no building a competing database product. [Read the full license](LICENSE).\n\n---\n\n\u003cp align=\"center\"\u003e\n  \u003cstrong\u003eVelesDB\u003c/strong\u003e \u0026mdash; The Local Knowledge Engine for AI Agents\u003cbr/\u003e\n  \u003ca href=\"https://velesdb.com\"\u003evelesdb.com\u003c/a\u003e \u0026bull; \u003ca href=\"https://github.com/cyberlife-coder/VelesDB\"\u003eGitHub\u003c/a\u003e\n\u003c/p\u003e\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fcyberlife-coder%2Fvelesdb","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fcyberlife-coder%2Fvelesdb","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fcyberlife-coder%2Fvelesdb/lists"}