{"id":50831934,"url":"https://github.com/ccf/agentcairn","last_synced_at":"2026-06-14T00:01:49.845Z","repository":{"id":363964044,"uuid":"1263220935","full_name":"ccf/agentcairn","owner":"ccf","description":"Long-term, cross-project memory for AI coding agents. Stop redoing what you already solved. Your own Obsidian vault as the source of truth. Daemonless, non-lossy, no opaque databases.","archived":false,"fork":false,"pushed_at":"2026-06-11T04:33:33.000Z","size":946,"stargazers_count":0,"open_issues_count":2,"forks_count":0,"subscribers_count":0,"default_branch":"main","last_synced_at":"2026-06-11T06:11:41.003Z","etag":null,"topics":["agentmemory","ai","ai-agents","claude","claude-code","codex","cursor","harness","memory"],"latest_commit_sha":null,"homepage":"https://agentcairn.dev","language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"apache-2.0","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/ccf.png","metadata":{"files":{"readme":"README.md","changelog":"CHANGELOG.md","contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null,"zenodo":null,"notice":"NOTICE","maintainers":null,"copyright":null,"agents":null,"dco":null,"cla":null}},"created_at":"2026-06-08T18:36:04.000Z","updated_at":"2026-06-11T04:33:36.000Z","dependencies_parsed_at":null,"dependency_job_id":null,"html_url":"https://github.com/ccf/agentcairn","commit_stats":null,"previous_names":["ccf/agentcairn"],"tags_count":4,"template":false,"template_full_name":null,"purl":"pkg:github/ccf/agentcairn","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/ccf%2Fagentcairn","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/ccf%2Fagentcairn/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/ccf%2Fagentcairn/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/ccf%2Fagentcairn/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/ccf","download_url":"https://codeload.github.com/ccf/agentcairn/tar.gz/refs/heads/main","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/ccf%2Fagentcairn/sbom","scorecard":null,"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":286080680,"owners_count":34304629,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2026-05-26T15:22:16.424Z","status":"online","status_checked_at":"2026-06-13T02:00:06.617Z","response_time":62,"last_error":null,"robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":true,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["agentmemory","ai","ai-agents","claude","claude-code","codex","cursor","harness","memory"],"created_at":"2026-06-14T00:01:41.079Z","updated_at":"2026-06-14T00:01:49.826Z","avatar_url":"https://github.com/ccf.png","language":"Python","funding_links":[],"categories":[],"sub_categories":[],"readme":"# 🪨 agentcairn\n\n**Local-first memory for AI agents — that you can actually read, edit, and own.**\n\n\u003e **cairn** \u0026nbsp;/kɛən/\u0026nbsp; · *noun* — a stack of stones raised to mark a trail or a place worth remembering, left for whoever comes next.\n\nagentcairn gives your coding agent durable, high-quality memory — but instead of locking it in an opaque database or a cloud service, **your memories live as plain Markdown in an [Obsidian](https://obsidian.md) vault you own.** A fast, rebuildable [DuckDB](https://duckdb.org) index sits on top for retrieval. Open your vault, read what the agent remembered, fix a wrong fact by hand, or drop in your own notes — and the agent picks it all up.\n\n## Why agentcairn is different\n\nMost agent-memory systems make a database or cloud store the source of truth and treat files (if any) as a one-way export. agentcairn inverts that:\n\n- **📂 Your vault is the source of truth — not an export.** Memory is human-readable Markdown with frontmatter and `[[wikilinks]]`. Edit it in Obsidian; the index honors your edits.\n- **♻️ The index is disposable.** DuckDB is a rebuildable cache (`cairn reindex`). Your memory survives a model upgrade, a corrupted index, a schema change, or uninstalling the tool — **zero data loss**, because the truth is just files on disk.\n- **🧠 Non-lossy by construction.** The full note is always retained. Distillation only *adds* derived notes that link back to the source — it never silently drops facts it didn't think to extract at write time.\n- **🔒 Redaction before every write.** Secrets are scrubbed (regex + entropy + URL-credential detection) before anything — body, title, or tags — reaches the plaintext vault. We write files you can read, so we treat a leaked credential as the worst failure mode.\n- **🕸️ A free, deterministic knowledge graph.** Your `[[wikilinks]]` and frontmatter *are* the graph — no LLM extraction, no hallucinated entities.\n- **🪶 Daemonless, zero external DB.** One embedded DuckDB file does semantic vector search, BM25 full-text, and graph traversal. No always-on server, no Neo4j/Postgres/Qdrant, no required cloud key — just a `cairn` CLI and an on-demand MCP server.\n- **🔍 Honestly measured.** A reproducible LongMemEval-S + LoCoMo harness ships in [`benchmarks/`](benchmarks/) — with real numbers, ablations, and explicit caveats instead of one cherry-picked headline (see below).\n\n## Install\n\nThe easiest way to use agentcairn is the **[Claude Code](https://claude.com/claude-code) plugin** — one install wires up the MCP server, ambient memory (recall at session start, capture at session end), a memory skill, and slash commands:\n\n```bash\nclaude plugin marketplace add ccf/agentcairn\nclaude plugin install agentcairn@agentcairn\n```\n\nOn install you pick a vault path (default `~/agentcairn`); it's **auto-created** on the first session — no Obsidian setup required. From then on agentcairn surfaces relevant memory at the start of each session, distills each session into your vault, and gives you `/agentcairn:recall`, `/remember`, `/memory`, `/savings`, and `/ingest`. Nothing to pip-install — the plugin runs the published package via `uvx`.\n\n\u003e Not on Claude Code? agentcairn is also a standalone MCP server + CLI for any host — see [Using it directly](#using-it-directly).\n\n## How it works\n\n```mermaid\nflowchart LR\n    T[\"Session transcripts\u003cbr/\u003e(out-of-band)\"]\n    H[\"You · Obsidian\u003cbr/\u003e(hand edits)\"]\n    V[\"📂 Obsidian vault\u003cbr/\u003eMarkdown + frontmatter + wikilinks\u003cbr/\u003e\u003cb\u003esource of truth\u003c/b\u003e\"]\n    I[\"♻️ DuckDB index\u003cbr/\u003evector + BM25 + graph\u003cbr/\u003e\u003cb\u003erebuildable cache\u003c/b\u003e\"]\n    M[\"MCP tools\u003cbr/\u003eremember · recall · search · build_context · recent\"]\n\n    T -- \"redact → judge → distill → consolidate\" --\u003e V\n    H -- \"edit\" --\u003e V\n    V -- \"parse / reconcile-on-spawn\" --\u003e I\n    I -- \"READ_ONLY hybrid recall\" --\u003e M\n    M -. \"remember (redacted write)\" .-\u003e V\n\n    classDef truth fill:#eaf1ff,stroke:#317cff,color:#191919;\n    classDef cache fill:#f5f5f3,stroke:#999999,color:#191919;\n    class V truth\n    class I cache\n```\n\n- **Capture** reads your agent harness's session transcripts (append-only, already on disk) *out-of-band* — robust by design, with no fragile live hooks — then redacts → dedups → judges (semantic durability; optional LLM distillation via `CAIRN_JUDGE=anthropic`) → gates → distills into the vault, non-lossily. `cairn sweep` auto-detects every present harness (Claude Code and Codex are both supported, behind a `HarnessAdapter` seam) so you get unified memory across both without any extra configuration. On the LLM tier it also **consolidates**: a new memory that duplicates an existing one is skipped, and a newer version of an evolving fact marks the older note `superseded_by` (kept + demoted in recall, never deleted) — fail-safe, so a wrong call never drops a distinct memory (`CAIRN_CONSOLIDATE=0` to disable). Plus an agent-driven `remember` tool for curated, high-value memories.\n- **Retrieval** fuses BM25 + semantic vectors with Reciprocal Rank Fusion, applies an optional graph-boost, and **degrades gracefully** down to keyword-only when no embedding model is available — so recall is *never* silently dead. An optional cross-encoder reranker adds precision.\n- **Hybrid intelligence:** offline local embeddings (FastEmbed / `nomic-embed-text-v1.5` by default) out of the box — strong on its own *and* in the hybrid fusion (with `nomic`, vector-only edges out BM25 even on short turns; see the benchmark). Set `CAIRN_EMBED_MODEL` to pick another FastEmbed model, or run `CAIRN_EMBEDDER=ollama` / a cloud tier to go further.\n- **Temporal memory:** notes may carry `valid_from`/`valid_until`/`superseded_by` frontmatter. Recall is validity-aware — it soft-demotes superseded and expired facts (the *current* fact wins) without ever hiding them (non-lossy), and annotates each result's status (`current`/`superseded`/`expired`/`not_yet_valid`) plus an `as_of` anchor so the agent can reason over time. Inert for notes with no validity fields.\n\n## Using it directly\n\nThe plugin is the easiest path, but agentcairn is just a package — use it without Claude Code via the on-demand MCP server (for any MCP host) or the `cairn` CLI:\n\n```bash\nuvx agentcairn                                       # on-demand MCP server for any MCP host\ncairn ingest --vault ~/vault                         # distill recent agent sessions into the vault\ncairn sweep  --vault ~/vault                          # ingest + reindex in one pass (cron-friendly)\ncairn recall \"how did we fix the auth bug?\"          # hybrid recall from the CLI\ncairn savings                                        # how much context recall has saved you\ncairn reindex ~/vault                                # rebuild the index from Markdown (always safe)\ncairn doctor                                         # health-check the index\n```\n\n### Configuration\n\nAll settings live in one file — `~/.agentcairn/config.toml` — with env vars as overrides (precedence: CLI flag \u003e env var \u003e config file \u003e default):\n\n```bash\ncairn config --init   # scaffold a fully-commented template (chmod 600)\ncairn config          # show every setting's effective value and where it came from\n```\n\nFor example, enabling the LLM memory judge is two uncommented lines — no shell exports needed (the plugin's background sweep reads the file directly):\n\n```toml\njudge = \"anthropic\"\nanthropic_api_key = \"sk-ant-...\"\n```\n\n## Agents supported\n\nagentcairn works at two levels. **Claude Code** gets a first-class plugin — the full ambient loop (recall at session start, capture at session end), a memory skill, and slash commands. **Every other MCP host** gets the same recall/search/`remember` tools via the portable MCP server; `cairn install` wires it in non-destructively (your other servers are preserved, the original is backed up to `\u003cconfig\u003e.bak`). The vault stays a single global `~/agentcairn`, so memory is shared across every host.\n\n| Host | Support | Set up with | Ambient capture |\n|---|---|---|---|\n| **Claude Code** | 🟢 First-class plugin | `claude plugin install agentcairn@agentcairn` | ✅ recall-at-start + capture-at-end |\n| Cursor | 🔌 MCP server | `cairn install cursor` | — |\n| Claude Desktop | 🔌 MCP server | `cairn install claude-desktop` | — |\n| VS Code (Copilot) | 🔌 MCP server | `cairn install vscode` | — |\n| Gemini CLI | 🔌 MCP server | `cairn install gemini` | — |\n| Antigravity | 🔌 MCP server | `cairn install antigravity` | — |\n| Codex CLI | 🔌 MCP server | `cairn install codex` | — |\n| Any other MCP host | 🔌 MCP server | `uvx agentcairn` (paste the `cairn install … --print` snippet) | — |\n\n```bash\ncairn install                 # detect installed hosts + preview (writes nothing)\ncairn install cursor          # configure one host\ncairn install --all           # configure every detected host\ncairn install codex --print   # just print the snippet, change nothing\n```\n\nMost hosts take a JSON `mcpServers` entry (VS Code uses its `servers` key); Codex takes a TOML `[mcp_servers.agentcairn]` table (comments and other tables preserved). Ambient memory (auto recall-at-start, capture-at-end) is Claude-Code-only today — cross-host capture is tracked in [#36](https://github.com/ccf/agentcairn/issues/36).\n\n## Benchmarks measured\n\nWe benchmark agentcairn the way we'd want a memory system measured — **reproducibly, with ablations, and without a single cherry-picked headline number.** The harness ([`benchmarks/`](benchmarks/)) runs **LongMemEval-S** and **LoCoMo** through a version-pinned downloader (datasets are never vendored), scores retrieval deterministically (recall/nDCG@k, MRR — no API key needed, runs in CI on a synthetic fixture), and offers an opt-in LLM-judged QA layer.\n\n### Retrieval — LoCoMo\n\nFull LoCoMo set, turn-level, macro-avg, FastEmbed `nomic-embed-text-v1.5` (the default embedder):\n\n| arm | recall@5 | recall@10 | MRR |\n|---|---|---|---|\n| BM25 only | 0.527 | 0.604 | 0.459 |\n| vector only | 0.536 | 0.637 | 0.433 |\n| hybrid (RRF) | 0.562 | 0.648 | 0.477 |\n| hybrid + graph-boost | 0.562 | 0.648 | 0.477 |\n| **hybrid + reranker** | **0.662** | **0.735** | **0.608** |\n\nWhat we read from this — and say out loud:\n- **Hybrid beats either arm alone** — RRF fusion is worth it.\n- **The cross-encoder reranker is the biggest lever** (+0.10 recall@5 over hybrid); the \"ms-marco domain-shift might hurt\" worry didn't materialize on conversational data.\n- **The embedder default now pulls its weight** — with `nomic`, vector-only *edges out* BM25 (0.536 vs 0.527); switching from the old `bge-small` default (which trailed at 0.483) closed the gap. A 5-model FastEmbed sweep settled the pick — `nomic` (768-d) wins on quality-per-dim; bigger 1024-d models don't beat it. Full table: [`benchmarks/README.md`](benchmarks/README.md).\n- **graph-boost is inert on these corpora** — LoCoMo/LongMemEval have no native `[[wikilink]]` graph, so the boost has nothing to fire on. It's for *real interlinked vaults*, not chat logs, and we don't pretend otherwise.\n\n### Retrieval — LongMemEval-S\n\nFull 500-instance set — an easier task with well-separated evidence sessions. Session level is the granularity prior work reports; turn level is the finer, corpus-revealing slice:\n\n| arm | session r@5 | session MRR | turn r@5 | turn r@10 | turn MRR |\n|---|---|---|---|---|---|\n| BM25 only | 0.920 | 0.918 | 0.680 | 0.791 | 0.638 |\n| vector only | 0.936 | 0.916 | 0.507 | 0.692 | 0.454 |\n| hybrid (RRF) | 0.954 | 0.938 | 0.640 | 0.798 | 0.544 |\n| **hybrid + reranker** | **0.969** | **0.963** | **0.788** | **0.891** | **0.716** |\n\nRead honestly:\n- **Our 0.969 session recall@5 sits right alongside prior work's ≈0.95** over the same full 500-question set — and at full scale it *discriminates* (0.920 BM25 → 0.969 reranker) rather than saturating the way a small sample does.\n- **The reranker is again the biggest lever** — turn r@5 0.640 → 0.788, session r@5 0.954 → 0.969.\n- **Turn level is corpus-revealing:** here BM25-only (0.680) *beats* the RRF hybrid (0.640) because vector-only is weak on these single-turn evidence spans (0.507); the reranker is what pulls the default ahead. (Contrast LoCoMo, where vector-only edges out BM25.)\n\n### Context efficiency\n\nHow much smaller is the context agentcairn *recalls* than the full history you'd otherwise carry into the model? Default config (hybrid + reranker, k=10):\n\n| dataset | queries | mean haystack | mean recalled (k=10) | context reduction |\n|---|---|---|---|---|\n| LoCoMo (3 convos) | 497 | 25,646 tok | 529 tok | **51.1× mean / 50.3× median** |\n| LongMemEval-S (full 500) | 470 | 136,552 tok | 2,207 tok | **64.7× mean / 61.6× median** |\n\nEstimate (~4 chars/token), not a billed cost; \"haystack\" = the full indexed history, \"recalled\" = the top-k chunks returned. It measures context *size*, independent of retrieval quality.\n\n### QA accuracy\n\nQA-accuracy numbers (LLM-judged) are available too, but use an Anthropic judge rather than the papers' GPT-4o, so they are **not comparable to published leaderboards** — valid for relative ablation signal only. See [`benchmarks/README.md`](benchmarks/README.md) for how to run it and how to read the numbers.\n\n## Roadmap\n\n- **v1 — done.** The core loop: transcript ingestion → redaction → Markdown → rebuildable DuckDB index → hybrid recall; MCP server + CLI; secret redaction; local embeddings; reproducible benchmark harness.\n- **v1.1 — next, prioritized by the benchmark above:**\n  - ✅ **Reranker on by default** — the largest measured retrieval lever; `CAIRN_RERANK=0` to disable. *(shipped)*\n  - **Ollama embedding tier** — ✅ local models via `CAIRN_EMBEDDER=ollama` (`CAIRN_EMBED_MODEL`/`OLLAMA_HOST`); cloud (OpenAI/Voyage) still pending.\n  - ✅ **Bi-temporal validity** — frontmatter `valid_from`/`valid_until`/`superseded_by`; recall soft-demotes superseded/expired facts (non-lossy — never hidden) and annotates each result's currency + an `as_of` anchor, so the *current* fact wins and the agent can reason over time. *(shipped)*\n  - In-memory HNSW for large-vault retrieval latency.\n- **v2** — Obsidian plugin surface, MotherDuck cloud sync, optional LLM entity extraction.\n\n## Development\n\nagentcairn uses [uv](https://docs.astral.sh/uv/) exclusively for dependency management and tooling.\n\n**Do not use pip, poetry, or global virtual environments.**\n\n```bash\n# First-time setup\nuv sync                         # create .venv and install all deps (including dev)\nuv run pre-commit install       # install git hooks (ruff + pytest run on every commit)\n\n# Daily use\nuv run pytest                   # run the test suite\nuv run cairn --help             # run the CLI\nuvx agentcairn                  # run the installed tool ephemerally (as the MCP server does)\n\n# Formatting and linting\nuv run ruff format .            # format all Python files\nuv run ruff check --fix .       # lint with auto-fix\nuv run pre-commit run --all-files\n\n# Benchmarks (offline retrieval metrics need no API key)\nuv run pytest benchmarks/tests/                                      # offline synthetic-fixture suite\nPYTHONPATH=benchmarks uv run --group bench python -m cairn_bench.run --dataset locomo\n```\n\nThe MCP server is launched via `uvx agentcairn` — no global install required.\n\n## License\n\n[Apache License 2.0](LICENSE) — permissive, with an explicit patent grant. Copyright © 2026 Charles C. Figueiredo.\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fccf%2Fagentcairn","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fccf%2Fagentcairn","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fccf%2Fagentcairn/lists"}