{"id":48781592,"url":"https://github.com/zd87pl/loci-db","last_synced_at":"2026-04-13T14:35:29.978Z","repository":{"id":344179597,"uuid":"1180802495","full_name":"zd87pl/loci-db","owner":"zd87pl","description":"4D embeddings DB for world models","archived":false,"fork":false,"pushed_at":"2026-04-05T03:28:08.000Z","size":276,"stargazers_count":0,"open_issues_count":1,"forks_count":0,"subscribers_count":0,"default_branch":"main","last_synced_at":"2026-04-05T05:13:33.399Z","etag":null,"topics":[],"latest_commit_sha":null,"homepage":null,"language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"apache-2.0","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/zd87pl.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":"ROADMAP.md","authors":null,"dei":null,"publiccode":null,"codemeta":null,"zenodo":null,"notice":null,"maintainers":null,"copyright":null,"agents":null,"dco":null,"cla":null}},"created_at":"2026-03-13T12:32:05.000Z","updated_at":"2026-04-05T03:28:05.000Z","dependencies_parsed_at":null,"dependency_job_id":null,"html_url":"https://github.com/zd87pl/loci-db","commit_stats":null,"previous_names":["zd87pl/engram-db","zd87pl/loci-db"],"tags_count":0,"template":false,"template_full_name":null,"purl":"pkg:github/zd87pl/loci-db","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/zd87pl%2Floci-db","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/zd87pl%2Floci-db/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/zd87pl%2Floci-db/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/zd87pl%2Floci-db/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/zd87pl","download_url":"https://codeload.github.com/zd87pl/loci-db/tar.gz/refs/heads/main","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/zd87pl%2Floci-db/sbom","scorecard":null,"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":286080680,"owners_count":31757482,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2026-04-13T13:27:56.013Z","status":"ssl_error","status_checked_at":"2026-04-13T13:21:23.512Z","response_time":93,"last_error":"SSL_connect returned=1 errno=0 peeraddr=140.82.121.5:443 state=error: unexpected eof while reading","robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":false,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":[],"created_at":"2026-04-13T14:35:29.316Z","updated_at":"2026-04-13T14:35:29.948Z","avatar_url":"https://github.com/zd87pl.png","language":"Python","readme":"# LOCI\n\n**A 4D spatiotemporal vector database for AI world models.**\n\n[![CI](https://github.com/zd87pl/loci-db/actions/workflows/ci.yml/badge.svg)](https://github.com/zd87pl/loci-db/actions)\n[![PyPI version](https://img.shields.io/pypi/v/loci-stdb.svg)](https://pypi.org/project/loci-stdb/)\n[![Python 3.11+](https://img.shields.io/badge/python-3.11%2B-blue.svg)](https://www.python.org/downloads/)\n[![License: Apache 2.0](https://img.shields.io/badge/license-Apache%202.0-green.svg)](LICENSE)\n\n---\n\n## The Problem\n\nModern world models — V-JEPA 2, DreamerV3, GAIA-1, UniSim — produce embeddings\nwhere every vector has an implicit **4D spatiotemporal address** `(x, y, z, t)`.\nExisting vector databases (Qdrant, Milvus, Weaviate) treat all embedding dimensions\nequally: a spatial query requires 3+ float-range payload filters evaluated\nindependently, time-based retrieval has no native sharding, and there is no\nconcept of \"predict the future then find what's nearby.\"\n\n## The Solution\n\nLOCI is a middleware layer on top of [Qdrant](https://qdrant.tech) that makes\nspatiotemporal structure **first-class** through three novel primitives:\n\n### 1. Multi-Resolution Hilbert Bucketing\n\nEncode `(x, y, z, t)` at multiple Hilbert resolutions (p=4, 8, 12).\nSpatial bounding-box queries use a Hilbert integer pre-filter with overlap, then\napply an exact payload post-filter as the authoritative geometric check. By\ndefault queries start at the coarsest indexed resolution; with `adaptive=True`,\ndense regions can be promoted to finer Hilbert resolutions at query time.\n\n```\n         Naive Qdrant               LOCI\n    ┌──────────────────┐     ┌──────────────────┐\n    │ x_min ≤ x ≤ x_max│     │                  │\n    │ y_min ≤ y ≤ y_max│ →   │ hilbert_r4 ∈ {…} │\n    │ z_min ≤ z ≤ z_max│     │  (single filter)  │\n    └──────────────────┘     └──────────────────┘\n```\n\n### 2. Temporal Sharding\n\nAutomatic routing of vectors to **time-partitioned Qdrant collections**\n(`loci_{epoch_id}`). Configurable epoch size. Queries fan out only to\nepochs that overlap the requested time window — with the async client,\nall shards are searched **concurrently** via `asyncio.gather`.\n\n### 3. Predict-then-Retrieve with Novelty Detection\n\nAn **atomic API call** that composes a user-supplied world model with\nvector search, returning both results and a **novelty score**:\n\n```python\nresult = client.predict_and_retrieve(\n    context_vector=current_embedding,\n    predictor_fn=my_world_model,\n    future_horizon_ms=2000,\n    current_position=(0.5, 0.3, 0.8),\n)\nprint(f\"Novelty: {result.prediction_novelty:.2f}\")\n# 0.0 = \"I've seen this before\"\n# 1.0 = \"This is new territory\"\n```\n\n## Quick Start\n\n### Quick Start with Docker\n\nThe fastest way to run LOCI with a persistent Qdrant backend:\n\n```bash\ndocker compose up\n```\n\nThis starts two services:\n- **loci** — the LOCI REST API on `http://localhost:8000`\n- **qdrant** — the Qdrant vector store on `http://localhost:6333`\n\nQdrant data is persisted in a named volume so it survives restarts.\n\nOnce running, insert and query world states via the HTTP API:\n\n```bash\n# Health check\ncurl http://localhost:8000/health\n\n# Insert a world state (512-dim vector)\ncurl -X POST http://localhost:8000/insert \\\n  -H 'Content-Type: application/json' \\\n  -d '{\"x\":0.5,\"y\":0.3,\"z\":0.8,\"timestamp_ms\":1700000000000,\"vector\":[0.1],\"scene_id\":\"s1\"}'\n\n# Query (spatial + time window)\ncurl -X POST http://localhost:8000/query \\\n  -H 'Content-Type: application/json' \\\n  -d '{\"vector\":[0.1],\"x_min\":0.0,\"x_max\":1.0,\"limit\":10}'\n```\n\nInteractive API docs: `http://localhost:8000/docs`\n\n---\n\n### No Docker? No problem — in-memory mode\n\nTry LOCI instantly with zero infrastructure using `LocalLociClient`:\n\n```bash\npip install loci-stdb          # or: pip install -e \".[dev]\"\n```\n\n```python\nfrom loci import LocalLociClient, WorldState\n\nclient = LocalLociClient(vector_size=512)\n\n# Insert a world state\nstate = WorldState(\n    x=0.5, y=0.3, z=0.8,\n    timestamp_ms=1000,\n    vector=[0.1] * 512,\n    scene_id=\"my_scene\",\n)\nstate_id = client.insert(state)\n\n# Query by vector similarity + spatial bounds + time window\nresults = client.query(\n    vector=[0.1] * 512,\n    spatial_bounds={\"x_min\": 0.0, \"x_max\": 1.0,\n                    \"y_min\": 0.0, \"y_max\": 1.0,\n                    \"z_min\": 0.0, \"z_max\": 1.0},\n    time_window_ms=(0, 5000),\n    limit=10,\n)\n```\n\n### With Qdrant (production)\n\n```bash\npip install loci-stdb\ndocker run -p 6333:6333 qdrant/qdrant\n```\n\n```python\nfrom loci import LociClient, WorldState\n\nclient = LociClient(\n    \"http://localhost:6333\",\n    vector_size=512,\n    epoch_size_ms=5000,\n    distance=\"cosine\",\n)\n\n# Insert world states\nstate = WorldState(\n    x=0.5, y=0.3, z=0.8,\n    timestamp_ms=1700000000000,\n    vector=[0.1] * 512,\n    scene_id=\"warehouse_sim\",\n    scale_level=\"patch\",\n)\nstate_id = client.insert(state)\n\n# Batch insert (truly batched — one Qdrant call per epoch)\nids = client.insert_batch(states)\n\n# Spatiotemporal query with overlap factor\nresults = client.query(\n    vector=query_embedding,\n    spatial_bounds={\"x_min\": 0.2, \"x_max\": 0.8,\n                    \"y_min\": 0.0, \"y_max\": 1.0,\n                    \"z_min\": 0.0, \"z_max\": 1.0},\n    time_window_ms=(start_ms, end_ms),\n    limit=10,\n    overlap_factor=1.2,  # 20% expanded search for boundary recall\n)\n\n# Predict-then-retrieve with novelty scoring\nresult = client.predict_and_retrieve(\n    context_vector=current_embedding,\n    predictor_fn=my_world_model,\n    future_horizon_ms=2000,\n    current_position=(0.5, 0.3, 0.8),\n)\n\n# Trajectory reconstruction via scroll API\ntrajectory = client.get_trajectory(state_id, steps_back=20, steps_forward=20)\n\n# Episodic context window\ncontext = client.get_causal_context(state_id, window_ms=5000)\n```\n\n### Async API (parallel shard fan-out)\n\n```python\nfrom loci import AsyncLociClient\n\nasync with AsyncLociClient(\n    \"http://localhost:6333\",\n    vector_size=512,\n    distance=\"cosine\",\n) as client:\n    await client.insert(state)\n    results = await client.query(vector=query_embedding, limit=10)\n```\n\n### World Model Adapters\n\n```python\nfrom loci.adapters.vjepa2 import VJEPA2Adapter\nfrom loci.adapters.dreamer import DreamerV3Adapter\nfrom loci.adapters.generic import GenericAdapter\n\n# V-JEPA 2\nadapter = VJEPA2Adapter()\nstates = adapter.batch_clip_to_states(clip_output, ts, scene_id)\n\n# DreamerV3\nadapter = DreamerV3Adapter()\nws = adapter.rssm_to_world_state(h_t, z_t, position, ts, scene_id)\n\n# Generic numpy/torch\nadapter = GenericAdapter(expected_dim=512)\nws = adapter.from_numpy(embedding, position, ts, scene_id)\n```\n\n## Performance\n\n**Raw spatiotemporal query latency: ~75µs p50** (label-filtered, 100 objects, 128-dim, Apple Silicon).\n\n| N objects | Query type | P50 | P99 |\n|--:|:--|--:|--:|\n| 100 | Label-filtered (demo path) | 75µs | 124µs |\n| 100 | Vector-only ANN | 212µs | 217µs |\n| 100 | Temporal shard pruning | 156µs | 188µs |\n| 500 | Label-filtered (demo path) | 259µs | 281µs |\n| 1,000 | Label-filtered (demo path) | 469µs | 514µs |\n| 1,000 | Vector-only ANN | 1.86ms | 2.08ms |\n\nInsert throughput: **~59,000 states/s** (in-memory backend, 128-dim vectors).\n\nRun the retrieval benchmark on your hardware:\n\n```bash\npython benchmarks/benchmark_retrieval.py\n```\n\nFor a LOCI-vs-naive-Qdrant comparison benchmark:\n\n```bash\n# In-memory (no Qdrant server needed):\npython benchmarks/vs_naive_qdrant.py\n\n# Against a live Qdrant server:\nQDRANT_URL=http://localhost:6333 python benchmarks/vs_naive_qdrant.py\n```\n\nResults are written to `benchmarks/results/` and printed as markdown tables.\n\n## Why not SpatCode?\n\nSpatCode (WWW 2026, arXiv 2601.09530) encodes coordinates into the embedding\nspace for soft/fuzzy retrieval via RoPE-style positional encoding. LOCI uses\nHilbert bucketing for **exact geometric range queries** with deterministic behavior.\n\n**Use SpatCode** when semantic proximity matters (e.g., \"find images taken\nnear this location\").\n\n**Use LOCI** when physical boundaries matter (e.g., \"find all observations\nwithin this 3D bounding box in the last 5 seconds\").\n\n## Why not TANNS?\n\nTANNS (ICDE 2025) builds a single graph managing all timestamps internally\nwith a Timestamp Graph structure. LOCI uses collection-level sharding with\nstorage tiering.\n\n**Use TANNS** for single-session temporal ANN where all data fits in one graph.\n\n**Use LOCI** when you need cross-session persistence, multi-agent memory sharing,\nhot/warm/cold storage tiering, or predict-then-retrieve.\n\n## Architecture\n\n```\n┌───────────────────────────────────────────────┐\n│              Application Layer                │\n│  LociClient / AsyncLociClient / LocalLociClient│\n│  insert · query · predict_and_retrieve        │\n├───────────────────────────────────────────────┤\n│              Retrieval Layer                  │\n│  predict.py — predict-then-retrieve + novelty │\n│  funnel.py  — multi-scale coarse→fine search  │\n├───────────────────────────────────────────────┤\n│           Indexing \u0026 Routing Layer            │\n│  spatial/  — multi-res Hilbert + overlap      │\n│  temporal/ — epoch sharding + decay scoring   │\n├───────────────────────────────────────────────┤\n│              Adapters Layer                   │\n│  V-JEPA 2 · DreamerV3 · Generic numpy/torch  │\n├───────────────────────────────────────────────┤\n│              Storage Layer                    │\n│  Qdrant (one collection per temporal epoch)   │\n│  MemoryStore (in-process, no infra needed)    │\n└───────────────────────────────────────────────┘\n```\n\nSee [ARCHITECTURE.md](ARCHITECTURE.md) for the full design document.\n\n## Documentation\n\n- [ARCHITECTURE.md](ARCHITECTURE.md) — System design\n- [docs/NOVELTY.md](docs/NOVELTY.md) — Novelty claims vs prior art\n- [docs/BENCHMARK_METHODOLOGY.md](docs/BENCHMARK_METHODOLOGY.md) — Benchmark replication guide\n- [docs/WORLD_MODEL_INTEGRATION.md](docs/WORLD_MODEL_INTEGRATION.md) — Integration guides\n\n## Development\n\n```bash\ngit clone https://github.com/zd87pl/loci-db.git\ncd loci-db\npip install -e \".[dev]\"\npytest tests/ -v\n\n# Linting \u0026 formatting (must pass in CI)\nruff check loci/ tests/\nruff format --check loci/ tests/\nmypy loci/\n```\n\n## Roadmap\n\nSee [ROADMAP.md](ROADMAP.md) for the v0.1 → v1.0 plan.\n\n## Citation\n\n```bibtex\n@misc{loci2026,\n  title={LOCI: A 4D Spatiotemporal Vector Database for AI World Models},\n  author={Dyras, Zygmunt},\n  year={2026},\n  url={https://github.com/zd87pl/loci-db}\n}\n```\n\n## License\n\nApache 2.0\n","funding_links":[],"categories":[],"sub_categories":[],"project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fzd87pl%2Floci-db","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fzd87pl%2Floci-db","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fzd87pl%2Floci-db/lists"}