{"id":50726977,"url":"https://github.com/pbmagnet4/nlm-memory","last_synced_at":"2026-06-10T05:00:54.619Z","repository":{"id":359146146,"uuid":"1243659568","full_name":"pbmagnet4/nlm-memory","owner":"pbmagnet4","description":"Local-first non-linear memory OS for AI operators. One index across Claude Code, Codex, Cursor, Windsurf, Hermes, OpenCode, Aider, pi, and more — with an editable timeline.","archived":false,"fork":false,"pushed_at":"2026-06-10T02:43:16.000Z","size":2583,"stargazers_count":4,"open_issues_count":0,"forks_count":0,"subscribers_count":0,"default_branch":"main","last_synced_at":"2026-06-10T03:13:27.630Z","etag":null,"topics":["ai","aider","claude-code","codex","cursor","deepseek","hermes","local-first","mcp","memory","ollama","opencode","recall","session-memory","sqlite","typescript","windsurf"],"latest_commit_sha":null,"homepage":"https://www.npmjs.com/package/nlm-memory","language":"TypeScript","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"apache-2.0","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/pbmagnet4.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":"SECURITY.md","support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null,"zenodo":null,"notice":null,"maintainers":null,"copyright":null,"agents":null,"dco":null,"cla":null}},"created_at":"2026-05-19T14:42:57.000Z","updated_at":"2026-06-10T02:40:34.000Z","dependencies_parsed_at":null,"dependency_job_id":null,"html_url":"https://github.com/pbmagnet4/nlm-memory","commit_stats":null,"previous_names":["pbmagnet4/nlm-memory-ts","pbmagnet4/nlm-memory"],"tags_count":30,"template":false,"template_full_name":null,"purl":"pkg:github/pbmagnet4/nlm-memory","repository_url":"https://repos.ecosyste.ms/a
pi/v1/hosts/GitHub/repositories/pbmagnet4%2Fnlm-memory","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/pbmagnet4%2Fnlm-memory/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/pbmagnet4%2Fnlm-memory/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/pbmagnet4%2Fnlm-memory/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/pbmagnet4","download_url":"https://codeload.github.com/pbmagnet4/nlm-memory/tar.gz/refs/heads/main","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/pbmagnet4%2Fnlm-memory/sbom","scorecard":null,"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":286080680,"owners_count":34137570,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2026-05-26T15:22:16.424Z","status":"online","status_checked_at":"2026-06-10T02:00:07.152Z","response_time":89,"last_error":null,"robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":true,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["ai","aider","claude-code","codex","cursor","deepseek","hermes","local-first","mcp","memory","ollama","opencode","recall","session-memory","sqlite","typescript","windsurf"],"created_at":"2026-06-10T05:00:34.204Z","updated_at":"2026-06-10T05:00:54.608Z","avatar_url":"https://github.com/pbmagnet4.png","language":"TypeScript","funding_links":[],"categories":[],"sub_categories":[],"readme":"\u003cp align=\"center\"\u003e\n  \u003cpicture\u003e\n    \u003csource media=\"(prefers-color-scheme: dark)\" 
srcset=\"assets/banner-dark.svg\" /\u003e\n    \u003csource media=\"(prefers-color-scheme: light)\" srcset=\"assets/banner-light.svg\" /\u003e\n    \u003cimg alt=\"nlm-memory — Local-first non-linear memory OS\" src=\"assets/banner-light.svg\" width=\"720\" /\u003e\n  \u003c/picture\u003e\n\u003c/p\u003e\n\n\u003cp align=\"center\"\u003e\n  \u003ca href=\"https://www.npmjs.com/package/nlm-memory\"\u003e\u003cimg src=\"https://img.shields.io/npm/v/nlm-memory?color=CB3837\u0026label=npm\u0026logo=npm\" alt=\"npm version\" /\u003e\u003c/a\u003e\n  \u003ca href=\"https://github.com/pbmagnet4/nlm-memory/actions/workflows/ci.yml\"\u003e\u003cimg src=\"https://github.com/pbmagnet4/nlm-memory/actions/workflows/ci.yml/badge.svg?branch=main\" alt=\"CI status\" /\u003e\u003c/a\u003e\n  \u003ca href=\"https://github.com/pbmagnet4/nlm-memory/blob/main/LICENSE\"\u003e\u003cimg src=\"https://img.shields.io/github/license/pbmagnet4/nlm-memory?color=blue\" alt=\"License: Apache 2.0\" /\u003e\u003c/a\u003e\n  \u003ca href=\"https://nodejs.org\"\u003e\u003cimg src=\"https://img.shields.io/node/v/nlm-memory?color=brightgreen\" alt=\"Node 20+\" /\u003e\u003c/a\u003e\n  \u003cimg src=\"https://img.shields.io/badge/tests-726%20passing-success\" alt=\"726 tests passing\" /\u003e\n  \u003cimg src=\"https://img.shields.io/badge/MCP-9%20runtimes-8A2BE2\" alt=\"MCP across 9 runtimes\" /\u003e\n  \u003cimg src=\"https://img.shields.io/badge/hooks-3%20runtimes-7B2CBF\" alt=\"Hooks on 3 runtimes\" /\u003e\n  \u003cimg src=\"https://img.shields.io/badge/telemetry-none-informational\" alt=\"Zero telemetry\" /\u003e\n\u003c/p\u003e\n\n\u003cp align=\"center\"\u003e\n  \u003ca href=\"#install\"\u003eInstall\u003c/a\u003e \u0026middot;\n  \u003ca href=\"#quick-start\"\u003eQuick Start\u003c/a\u003e \u0026middot;\n  \u003ca href=\"#runtimes\"\u003eRuntimes\u003c/a\u003e \u0026middot;\n  \u003ca href=\"#how-recall-works\"\u003eHow recall works\u003c/a\u003e \u0026middot;\n  \u003ca 
href=\"#agent-self-improvement-signals\"\u003eSignals\u003c/a\u003e \u0026middot;\n  \u003ca href=\"#mcp-tools\"\u003eMCP\u003c/a\u003e \u0026middot;\n  \u003ca href=\"#rest-api\"\u003eREST API\u003c/a\u003e \u0026middot;\n  \u003ca href=\"#daily-digest\"\u003eDigest\u003c/a\u003e \u0026middot;\n  \u003ca href=\"#configuration\"\u003eConfig\u003c/a\u003e \u0026middot;\n  \u003ca href=\"#security\"\u003eSecurity\u003c/a\u003e \u0026middot;\n  \u003ca href=\"#vs-alternatives\"\u003evs Alternatives\u003c/a\u003e\n\u003c/p\u003e\n\n---\n\n`nlm-memory` is a local-first memory layer for AI coding agents. It indexes every session from Claude Code, Codex, OpenCode, Cursor, Windsurf, Hermes, Aider, and pi into a single searchable store on your machine. Three properties no other memory layer ships together:\n\n1. **Cross-runtime reach.** One index, every adapter.\n2. **Editable timeline.** Sessions can be superseded by newer ones; entities can be retired. Patch history retroactively — no other tool lets you do this. See [docs/supersedence.md](docs/supersedence.md).\n3. **97.2% R@5 baseline.** On a 14-month corpus, keyword recall surfaces the right session in the top 5 on 97.2% of evaluator queries. No fine-tuning. The labels were generated by DeepSeek V4 — the retrieval algorithm is the same code path you'll run, but expect a lower number with a smaller local classifier. See [docs/methodology-recall-baseline.md](docs/methodology-recall-baseline.md).\n\nEverything stays on your machine by default. No telemetry, no account. The classifier defaults to local (Ollama); if you opt into a cloud classifier (DeepSeek, OpenAI, Anthropic, OpenRouter, or any OpenAI-compatible endpoint), session transcripts are sent to that provider — see [Security](#security) for the exact data-flow.\n\n---\n\n## Install\n\n```sh\nnpm install -g nlm-memory\nnlm setup\n```\n\n`nlm setup` is the interactive first-run wizard. 
It picks your classifier + model, wires the runtimes you actually use, generates an `NLM_MCP_TOKEN`, hardens permissions on `~/.nlm/`, and installs the daemon supervisor for your platform.\n\n| Platform | Daemon | Notes |\n|---|---|---|\n| **macOS** | LaunchAgent at `~/Library/LaunchAgents/com.github.pbmagnet4.nlm-memory.plist` | Auto-starts on login |\n| **Linux** | systemd user unit at `~/.config/systemd/user/nlm.service` | Headless servers: `loginctl enable-linger $USER` so the daemon survives logout |\n| **Windows** | Manual `nlm start` for now | Hook + MCP install paths are platform-aware; supervisor lands next release |\n\nStop or remove: `nlm uninstall`.\n\n---\n\n## Quick Start\n\nAfter `nlm setup` finishes, open **http://localhost:3940/ui** — the daemon is running. A 30-second sanity check:\n\n```sh\nnlm recall \"what was that pgvector decision\"   # one-shot search from the shell\nnlm digest                                      # yesterday's activity at a glance\nnlm --version\n```\n\n---\n\n## Runtimes\n\nOne corpus across every adapter. MCP works against all nine. **Automatic context injection via hooks** ships on four (Claude Code, Codex CLI, Hermes Agent, pi.dev). Cursor, Windsurf, and OpenCode receive an optional static rules-file nudge (`nlm connect \u003cruntime\u003e --with-rules`) that instructs the agent to call `recall_sessions` on history-flavored prompts — those runtimes don't expose a pre-prompt hook today, so a rules nudge is the closest equivalent. Aider is ingest-only. 
`nlm connect` wires whichever surface the runtime supports:\n\n| Runtime | Connect | Sessions read from | Hooks |\n|---|---|---|---|\n| **Claude Code** | `nlm connect claude-code` | `~/.claude/projects/**/*.jsonl` | 6 hooks: UserPromptSubmit, SessionStart, SessionEnd, Stop, PreCompact, SubagentStart |\n| **Codex CLI** | `nlm connect codex` | `~/.codex/sessions/` | Marketplace plugin (UserPromptSubmit + Stop) |\n| **Hermes** (WebUI) | `nlm connect hermes` | `~/.hermes/sessions/` | MCP only (writes the MCP server block to `~/.hermes/config.yaml`) |\n| **Hermes Agent** (NousResearch CLI) | `nlm connect hermes-agent` | `~/.hermes/state.db` | 6 hooks: pre_llm_call, post_llm_call, on_session_start/end/finalize/reset (Python plugin in `~/.hermes/plugins/nlm-memory/`) |\n| **Cursor** | `nlm connect cursor [--with-rules]` | Cursor IDE chat DB | MCP + optional rules nudge (workspace `.cursor/rules/nlm-recall.mdc`) |\n| **Windsurf** | `nlm connect windsurf [--with-rules]` | Windsurf user dir | MCP + optional rules nudge (`~/.codeium/windsurf/memories/global_rules.md`) |\n| **OpenCode** | `nlm connect opencode [--with-rules]` | `~/.local/share/opencode/` | MCP + optional rules nudge (`~/.config/opencode/AGENTS.md`) |\n| **Aider** | adapter active (ingest only) | `AIDER_CHAT_HISTORY_FILE` | No native MCP, no hooks — sessions are read into the corpus but Aider cannot call recall back |\n| **pi.dev** | `nlm setup` (auto) or `nlm connect pi` | `~/.pi/agent/sessions/**/*.jsonl` | 1 hook: input (prepend via transform) |\n\n`nlm disconnect \u003cruntime\u003e` reverses any of the above.\n\n---\n\n## How recall works\n\nTwo delivery paths. They share the same index.\n\n### 1. Hooks — automatic context injection\n\nHooks fire on user input and prepend a pointer block of likely-relevant prior sessions to the model's context. 
Four runtimes ship hooks today: Claude Code (six-hook lifecycle), Codex CLI (UserPromptSubmit + Stop via the marketplace plugin), Hermes Agent (six parallel hooks), and pi.dev (one `input` hook via [nlm/](nlm/README.md)). Cursor, Windsurf, and OpenCode don't expose a pre-prompt hook today, so the `--with-rules` install path drops a static rules snippet that asks the agent to call `recall_sessions` itself on history-flavored prompts (see [docs/hooks.md](docs/hooks.md) for the snippet). The full lifecycle, modes, logging surface, and daily liveness canary are documented in [docs/hooks.md](docs/hooks.md).\n\n**Claude Code** — six hooks written to `~/.claude/settings.json` via `nlm connect claude-code`:\n\n| Event | What NLM does | Mode |\n|---|---|---|\n| **UserPromptSubmit** | Score the prompt, silently prepend a pointer block listing the 0–3 most likely relevant prior sessions | live by default |\n| **SessionStart** | Cold-start agents (cron, background) hit this; same pointer-block delivery without a user prompt, plus a \"Known failure modes for this repo\" block when signals data exists (see Signals below) | live by default |\n| **SessionEnd** | Delete the per-conversation memo on session close so state files don't accumulate | always on |\n| **Stop** | Scan the model's response for citations of surfaced session IDs → updates `useful_hit_rate` and builds the reranker training substrate | always on |\n| **PreCompact** | Flush the per-conversation surfaced-IDs memo so post-compaction recalls aren't gated | always on |\n| **SubagentStart** | Record parent→subagent links so threads stay coherent across dispatches | always on |\n\n**Hermes Agent** — six hooks installed to `~/.hermes/plugins/nlm-memory/` via `nlm connect hermes-agent`. All calls are fire-and-forget except `pre_llm_call`, which returns a context string for injection:\n\n| Hook | What NLM does | Blocks turn? 
|\n|---|---|---|\n| **pre_llm_call** | POST prompt to daemon → inject pointer block of relevant prior sessions into context | yes (returns context string) |\n| **post_llm_call** | POST assistant response to daemon → citation scan, `useful_hit_rate` update | no |\n| **on_session_start** | Signal session open → daemon initialises per-session memo | no |\n| **on_session_end** | Signal session close → daemon flushes memo | no |\n| **on_session_finalize** | Signal transcript finalised → triggers async classifier run | no |\n| **on_session_reset** | Signal session reset → daemon clears per-session state | no |\n\n**pi.dev** — one `input` hook registered via `nlm connect pi`. Pi's extension API only exposes `input`, so transcript ingestion is handled separately by the passive adapter scanning `~/.pi/agent/sessions/`:\n\n| Hook | What NLM does | Blocks turn? |\n|---|---|---|\n| **input** | Score user message → if relevant sessions found, prepend pointer block to the prompt text via `{ action: \"transform\" }` | yes (returns transformed text) |\n\nAll of these hooks fail open: any daemon error yields a clean exit and never blocks the model. Switch Claude Code hooks to **shadow** mode (log-only, no injection) with `NLM_HOOK_MODE=shadow`.\n\n### 2. MCP — explicit tools any agent can call\n\nContainer-hosted agents (Hermes WebUI, Codex CLI, etc.) hit the Streamable-HTTP `POST /mcp` endpoint with `Authorization: Bearer ${NLM_MCP_TOKEN}`. Stdio MCP is also supported for Claude Code via `~/.mcp.json`.\n\n---\n\n## Agent self-improvement signals\n\nNLM can ingest structured feedback events (`nlm.signal`) from any tool in your agent stack -- quality gates, eval runners, code reviewers, test harnesses -- and surface the aggregated failure patterns back to the agent at session start. 
Over time the agent learns which steps tend to fail for a given repo and model, without any external service.\n\n### Payload contract\n\n```jsonc\n{\n  \"v\": 1,\n  \"kind\": \"gate\" | \"eval\" | \"review\" | \"test\",  // required\n  \"outcome\": \"pass\" | \"fail\" | \"fix\" | \"exhausted\",  // required\n  \"producer\": \"quality-gate\",   // defaults to \"unknown\"\n  \"model\": \"qwen3-coder\",       // defaults to \"unknown\"\n  \"repo\": \"/path/or/name\",      // defaults to \"unknown\"\n  \"detail\": { \"step\": \"types\", \"files\": [\"a.ts\"], \"attempt\": 2 },\n  \"session\": \"\u003csession-id-if-known\u003e\",\n  \"ts\": \"2026-06-09T18:00:00.000Z\"  // defaults to now\n}\n```\n\n`kind` and `outcome` are the only required fields; invalid values are rejected with `400`. `install_scope` is stamped server-side -- do not send it.\n\n### Transports\n\n**HTTP (any producer)**\n\n```js\nawait fetch(\"http://localhost:3940/api/signal\", {\n  method: \"POST\",\n  headers: { \"content-type\": \"application/json\" },\n  body: JSON.stringify({\n    kind: \"gate\", producer: \"my-tool\", outcome: \"fail\",\n    model, repo, detail: { step: \"types\" }, ts: new Date().toISOString(),\n  }),\n});\n```\n\nRides the standard `/api/*` loopback gate. When the daemon runs with `NLM_UI_AUTH=cookie`, send `Authorization: Bearer $NLM_MCP_TOKEN`.\n\n**Session-embedded (pi.dev)**\n\nPi producers call `pi.appendEntry(\"nlm.signal\", payload)` inside an extension. This writes:\n\n```json\n{ \"type\": \"custom\", \"customType\": \"nlm.signal\", \"data\": { ...payload } }\n```\n\nto the session `.jsonl`. NLM's pi adapter recognises the `nlm.signal` customType and the scheduler drains it during normal ingest -- no HTTP call required.\n\n### Failure-mode recall\n\nNLM aggregates signals into failure modes per `(repo, model)` pair. A mode surfaces when its fail-rate reaches 20% or higher over at least 10 events in a trailing 14-day window. 
At session start the Claude Code `SessionStart` hook injects a \"Known failure modes for this repo\" block into the agent prompt automatically (repo-scoped). Any harness can fetch the same data directly:\n\n```\nGET /api/signals/failure-modes?repo=\u003crepo\u003e\u0026model=\u003cmodel\u003e\n```\n\n### Inspection\n\n```sh\nnlm improve   # prints failure modes + recommendations for the current repo\n```\n\nThe UI **Recall** page also has a failure-modes panel. NLM surfaces findings and recommendations only -- it never acts on them.\n\n### Configuration and privacy\n\n| Var | Default | What |\n|---|---|---|\n| `NLM_SIGNALS_ENABLED` | `1` (on) | Set to `0` to disable signal ingest entirely |\n| `NLM_SIGNAL_RETENTION_DAYS` | `90` | Raw signals older than this are pruned |\n\nSignals are local-only. They are stamped with a per-install ID from `~/.nlm/install-id` and never leave the machine.\n\n### Reference producer\n\nThe pi `quality-gate` extension (in the `pi-sandbox` repo) is a ~10-line integration that emits `nlm.signal` per gate step and again on retry exhaustion. It is the canonical example of the session-embedded transport.\n\n---\n\n## MCP Tools\n\n| Tool | What it does |\n|---|---|\n| `recall_sessions` | Hybrid keyword+semantic search across the full session corpus. Returns label, started_at, snippet, match score. |\n| `get_session` | Full body of one session by ID. Includes enriched `supersedes` / `supersededBy` links (id + label + summary) so chasing corrected facts doesn't need a second round-trip. |\n| `recall_facts` | Search structured facts: decisions, open questions, project state. Filterable by entity and kind. |\n| `get_fact_history` | Full version history of one fact — how a decision evolved over time. |\n| `cite_session` | Mark a session as explicitly referenced. Drives the `useful_hit_rate` metric and the future learned reranker. 
|\n| `mark_superseded` | Retroactively retire a stale session and point it at the newer one that replaces it. The editable-timeline write path — see [docs/supersedence.md](docs/supersedence.md). |\n\n**When to call `cite_session`:** Call it when a surfaced session actually changes what you say — you referenced it explicitly, it corrected a decision, or you called `get_session` to read the full body. Do not call it for sessions you scanned and discarded. The Stop hook auto-detects citation when a session ID appears verbatim in your response; `cite_session` covers the deliberate case where the session influenced your reasoning without being quoted directly. Both paths feed the same signal loop.\n\n---\n\n## REST API\n\nDaemon binds `127.0.0.1:3940` (override with `NLM_PORT`). Selected endpoints:\n\n| Method | Path | Auth | Purpose |\n|---|---|---|---|\n| GET | `/api/health` | Host-only | Liveness probe; returns `{version, status, service}` |\n| GET | `/api/recall` | Bearer/Origin | Hybrid recall — `?q=`, `?mode=keyword\\|semantic\\|hybrid`, `?limit=` |\n| GET | `/api/recall/stats` | Bearer/Origin | 7-day stats: total, hit_rate, useful_hit_rate, top queries |\n| GET | `/api/recall/recent` | Bearer/Origin | Last N recall events for live tail/telemetry |\n| GET | `/api/recall/cite-stats` | Bearer/Origin | Citation rate over `?days=` |\n| GET | `/api/session/:id` | Bearer/Origin | Full session body + supersedence links |\n| GET | `/api/recall/facts` | Bearer/Origin | Structured fact search |\n| GET | `/api/facts/history` | Bearer/Origin | Version chain for one fact |\n| GET | `/api/dataset` | Bearer/Origin | Full session list for the UI dataset view |\n| GET | `/api/live/recent-writes` | Bearer/Origin | Live tail of ingested sessions |\n| GET | `/api/data/backup` | Bearer/Origin | Streaming SQLite snapshot download |\n| POST | `/api/data/restore` | Bearer/Origin | Stage a snapshot for apply-on-restart |\n| POST | `/api/hook/pre-compact` | Bearer/Origin | Hook endpoint; 
flushes the surfaced-IDs memo |\n| ALL | `/mcp` | Bearer required | Streamable-HTTP MCP transport for container agents |\n\n`/api/*` is gated by three layers: 127.0.0.1 Host check (defeats DNS rebinding), Origin check when the browser sends one (defeats cross-origin drive-by), Bearer fallback when Origin is absent (server-to-server clients).\n\n---\n\n## Daily digest\n\nOnce-a-day summary of yesterday's activity:\n\n```sh\nnlm digest                  # print to stdout\nnlm digest --telegram       # post to Telegram (TELEGRAM_BOT_TOKEN + TELEGRAM_CHAT_ID)\n```\n\nReports 24h real-traffic (probes filtered), 7d hit_rate + useful_hit_rate, top 5 queries, and a **`WARN hook silent`** alert when Claude Code ran yesterday but no live hook fires were logged. That alert is the canary for post-install drift — node upgrades, `settings.json` hand-edits, and `dist/` moves silently break the hook while Claude Code keeps working. Setup-time smoke tests can't catch this; only the daily correlation can.\n\nWire to cron for a morning push:\n\n```cron\n0 7 * * *  nlm digest --telegram \u003e\u003e ~/.nlm/logs/digest.log 2\u003e\u00261\n```\n\nWhen the daemon is unreachable, `--telegram` still fires — posts a \"daemon unreachable\" alert instead of failing silently.\n\n---\n\n## What's inside the UI\n\nOpen `http://localhost:3940/ui` after the daemon starts.\n\n| Page | What it shows |\n|---|---|\n| **Live** | Sessions being written in real time, recent reads, recent decisions |\n| **Pulse** | System health — coherence, runtimes, stale entities, recent sessions |\n| **River** | Full session timeline with density controls + superseded-lane visualization |\n| **Thread** | Per-entity conversation history with runtime filters and ←/→ navigation |\n| **Search** | Keyword, semantic, or hybrid recall with match snippets and field-origin tags |\n| **Recall** | Adoption telemetry — useful_hit_rate, source breakdown, query log |\n| **Settings** | Sources, providers, classifier, data 
backup/restore |\n\n---\n\n## Pipeline\n\nWhat happens when an AI runtime writes a session and you later recall it:\n\n```\ningest:  runtime transcript (jsonl/sqlite)\n   -\u003e adapter parses runtime-specific format\n   -\u003e classifier (Ollama local by default; DeepSeek / OpenAI / Anthropic / OpenRouter / OpenAI-compatible if you opt in) extracts label + entities + decisions + open questions\n   -\u003e embedder (nomic-embed-text via Ollama) computes 768-dim vector\n   -\u003e SQLite canonical store + FTS5 keyword index + sqlite-vec ANN index\n\nrecall: prompt / query\n   -\u003e tokenize + match scoring (label x3, entity-exact x4, decision x2, summary x1, phrase-bonus +5)\n   -\u003e hybrid: BM25-style keyword + vector cosine, fused by score\n   -\u003e select-top-N gate (per-fire cap 3, per-conversation cap 10)\n   -\u003e pointer block prepended to model context (hooks) or returned as tool result (MCP)\n```\n\n---\n\n## Configuration\n\n### Environment variables\n\n| Var | Default | What |\n|---|---|---|\n| `NLM_PORT` | `3940` | Daemon bind port (loopback only) |\n| `NLM_DB_PATH` | `~/.nlm/canonical.sqlite` | SQLite canonical store location |\n| `NLM_HOOK_MODE` | `live` | `live` injects pointer block; `shadow` logs without injecting |\n| `NLM_HOOK_LOG` | `~/.nlm/hook-log.jsonl` | Hook fire log; powers digest's liveness alert |\n| `NLM_USEFUL_HIT_LOG` | `~/.nlm/useful-hit-log.jsonl` | Citation/useful-hit ledger |\n| `NLM_QUERY_LOG` | `~/.nlm/query-log.jsonl` | Recall query telemetry |\n| `NLM_CITATION_LOG` | `~/.nlm/citation-log.jsonl` | Stop-hook citation events |\n| `NLM_MISS_LOG` | `~/.nlm/miss-log.jsonl` | Stop-hook miss events — sessions the agent explicitly fetched via `get_session`/`cite_session` that the hook never surfaced. Reviewed via `nlm misses`. |\n| `NLM_MISS_LOG_ENABLED` | `true` | Set to `0` to disable miss-log emission entirely. 
|\n| `NLM_MCP_TOKEN` | auto-generated | 256-bit bearer for `/api/*` (non-browser) and `/mcp` |\n| `NLM_MCP_CONFIG` | `~/.mcp.json` | Path the `connect`/`disconnect` commands modify |\n| `NLM_CLASSIFIER` | `ollama` | `ollama` (local, default), `deepseek`, `openai`, `anthropic`, `openrouter`, or `openai-compatible` |\n| `NLM_CLASSIFIER_MODEL` | `qwen3:4b-instruct-2507-q4_K_M` | Model id for the chosen provider. See [classifier bench](reports/classifier-comparison/2026-06-02-deepseek-v4-vs-qwen3.md) for why this is the recommended local default. |\n| `NLM_OLLAMA_URL` | `http://localhost:11434` | Override Ollama endpoint |\n| `NLM_ADAPTERS` | all | Comma-separated allowlist of adapters to enable |\n| `DEEPSEEK_API_KEY` | — | Required only when classifier=deepseek |\n| `NLM_DISABLE_UPDATE_CHECK` | — | Set to `1` to disable the daily npm-registry update check |\n| `NLM_RECALL_DECAY_HALF_LIFE_DAYS` | `180` | Half-life of the recency multiplier applied to recall scores. Older sessions score lower; defaults to 6 months. Set to `0` to disable recency weighting entirely. |\n| `NLM_RECALL_DECAY_FLOOR` | `0.25` | Lower bound on the recency multiplier — even ancient sessions retain at least 25% of their raw score so a perfect-match old session can still surface. |\n| `NLM_RECALL_REWRITE_DEFAULT` | `true` | Default value for the MCP `recall_sessions` `rewrite` parameter. When true, the service runs an LLM rewrite on vague natural-language queries before search. The HTTP hook caller bypasses rewrite regardless (hot-path protection). |\n| `NLM_RECALL_REWRITE_TIMEOUT_MS` | `5000` | Per-call timeout for the rewrite LLM. Separate from the classifier timeout. |\n| `NLM_FACT_CORROBORATION_BOOST_CAP` | `2.0` | Maximum multiplicative boost applied to fact recall scores based on how many sessions corroborate the same `(subject, predicate, value)`. Log-scale: 1 corroboration is 1.0×, 10 is 2.0× (capped). Set to `1.0` to disable the boost — the count is still returned on each hit. 
|\n| `NLM_HOOK_INJECT_FACTS` | `true` | Whether to attach high-confidence facts about top-hit entities to the pointer block injected by the hook. Set to `0` to disable globally. |\n| `NLM_HOOK_FACT_LIMIT` | `5` | Maximum number of facts in the \"Known facts\" section of the pointer block. |\n| `NLM_HOOK_FACT_MIN_CORROBORATION` | `2` | Minimum number of sessions that must have asserted a fact before it qualifies for hook injection. Set to `1` to include single-source facts. |\n| `NLM_HOOK_FACT_MIN_CONFIDENCE` | `0.7` | Minimum classifier confidence for a fact to qualify for hook injection. |\n| `TELEGRAM_BOT_TOKEN` / `TELEGRAM_CHAT_ID` | — | Required for `nlm digest --telegram` |\n\n### Changing the classifier from the UI\n\nThe CLI env-vars above are one path; the running UI is the other. **Settings → Providers** is a full CRUD list of LLM endpoints (Ollama, DeepSeek, OpenAI, Anthropic, OpenRouter, or any OpenAI-compatible endpoint such as LM Studio, llama.cpp, vLLM, text-generation-webui). Click **Add provider**, point it at your local server, hit **Save \u0026 test**. Then in **Settings → Classifier**, pick that provider and model. No env-var editing required, and the DeepSeek row can be disabled or deleted from the same screen.\n\nAdapter source paths can be overridden individually: `NLM_CLAUDE_PROJECTS_PATH`, `NLM_CODEX_CONFIG`, `NLM_CURSOR_DB_PATH`, `NLM_HERMES_SESSIONS_PATH`, `NLM_HERMES_AGENT_DB_PATH`, `NLM_WINDSURF_USER_DIR`, `OPENCODE_DB_PATH`, `PI_SESSIONS_PATH`, `AIDER_CHAT_HISTORY_FILE`.\n\n### Config file\n\n`~/.nlm/.env` — autoloaded by every CLI command. Mode `0600`, owned by you, never readable by other users. 
The setup wizard writes the initial keys; you can edit it directly.\n\n### Ports\n\n| Port | Process | Bind | Override |\n|---|---|---|---|\n| `3940` | Daemon HTTP API + MCP | `127.0.0.1` only | `NLM_PORT` |\n| `11434` | Ollama (embedding + local classifier) | localhost | `NLM_OLLAMA_URL` |\n\n---\n\n## Security\n\nNLM is local-first by design, but \"local-first\" is not \"local-only\" — read this section before picking a classifier.\n\n**Daemon hardening (always on):**\n\n- Binds to `127.0.0.1` only — never `0.0.0.0`\n- Enforces Host + Origin checks on `/api/*` to defeat DNS rebinding and cross-origin drive-by\n- Generates a 256-bit `NLM_MCP_TOKEN` on first run, persists to `~/.nlm/.env` (mode `0600`); non-browser clients authenticate with `Authorization: Bearer ${NLM_MCP_TOKEN}` compared with `timingSafeEqual`\n- Recursively enforces `0700` on `~/.nlm/` and `0600` on its contents on every start\n- Optional opt-in UI cookie auth (`NLM_UI_AUTH=cookie`) with HMAC-derived cookie value and nonce-based bootstrap (token never appears in a URL)\n\n**Outbound network traffic — exhaustive list:**\n\n| Destination | When | What data leaves |\n|---|---|---|\n| Configured classifier endpoint | Every new session is classified | Up to ~30K chars of the session transcript (prompts, responses, code snippets — whatever is in the transcript). If the classifier is **Ollama (default)** the destination is `localhost:11434` and nothing leaves the machine. If you opted into DeepSeek / OpenAI / Anthropic / OpenRouter / any OpenAI-compatible endpoint, the transcript is POSTed to that vendor. |\n| Ollama `localhost:11434` | Every new session | 768-dim embedding request (local) |\n| `registry.npmjs.org` | Once per 24h | Anonymous `GET /nlm-memory/latest` for update notifications. Cached at `~/.nlm/update-check.json`. Disable with `NLM_DISABLE_UPDATE_CHECK=1`. 
|\n| `api.telegram.org` | Only when `nlm digest --telegram` is invoked | Digest content |\n| AI runtime transcript files | Continuous | Read-only filesystem reads |\n\nNo analytics SDK. No crash reporter. No vendor ping beyond the destinations listed above.\n\n**Honest caveats — known limitations:**\n\n- **Cloud-classifier data egress.** A \"cloud\" classifier (DeepSeek, OpenAI, Anthropic, OpenRouter) by definition sees your session content. Anything pasted into a transcript — API keys, client names, internal URLs — is sent to that vendor under their data-use terms. The setup wizard warns you before you pick one. The default is Ollama for this reason.\n- **Provider API keys are stored in plaintext in SQLite today** (`providers.api_key`, in `~/.nlm/canonical.sqlite`). The file is `0600`, in a `0700` directory, owned by your user — so any process running as your user can read it. OS-keychain migration is on the roadmap; until then, treat the SQLite file like a `.env` file.\n- **The classifier is fed untrusted indexed content.** Sessions written by AI runtimes can contain prompt-injection attempts. The classifier output is structured (label, entities, decisions, open questions) and never executed, but if you wire NLM to an agent that *acts* on classifier output, model that as untrusted input.\n- **The hook fails open.** Any error in the recall hook yields a clean exit so it can't block your model. This means a silently-broken hook is possible — the daily digest's `WARN hook silent` canary is the detection path.\n\nReport vulnerabilities via [SECURITY.md](SECURITY.md).\n\n### Remote access\n\nThe daemon binds to `127.0.0.1`. If you want to reach the UI from another device — phone, second laptop — don't change the bind. 
Put a tunnel in front instead.\n\n**Tailscale (recommended for personal use).** Run once on the daemon host:\n\n```sh\ntailscale serve --bg http://localhost:3940\n```\n\nThen visit `https://\u003cmachine\u003e.\u003ctailnet\u003e.ts.net/ui/` from any tailnet device. Tailscale Serve rewrites the upstream `Host` header to `localhost:3940`, so the loopback check passes without any nlm-memory config. WireGuard + your tailnet ACLs are the auth layer — for a single-user tailnet this is strictly stronger than `NLM_UI_AUTH=cookie`, so leave that off.\n\n**If you do enable `NLM_UI_AUTH=cookie`** (defense in depth, or you've added untrusted devices to your tailnet), bootstrapping a cookie from a remote device needs one extra step. `nlm ui` only opens a browser on the daemon host; for the remote browser:\n\n```sh\nssh \u003cdaemon-host\u003e 'nlm ui --print'   # mints a nonce, prints the URL\n# Paste the URL into the remote browser within ~60s (nonce TTL)\n```\n\n**Do not expose the daemon directly to the public internet.** The cookie is a shared-HMAC speed bump, not real public-internet auth. If you absolutely must, put it behind something with real authentication (Cloudflare Access, Tailscale Funnel with auth in front, etc.).\n\n---\n\n## Upgrading from v0.4.x\n\n```sh\nnpm update -g nlm-memory\n```\n\nOld installs have `NLM_HOOK_MODE=shadow` hardcoded in `~/.claude/settings.json` — shadow mode is silent, so re-run `nlm hook install` to switch to live recall injection. 
Permissions and `NLM_MCP_TOKEN` self-heal on the next `nlm start`.\n\n---\n\n## vs Alternatives\n\n| | **nlm-memory** | mem0 | Letta / MemGPT | Built-in (`CLAUDE.md`) |\n|---|---|---|---|---|\n| **Unit of memory** | Whole session + extracted markers | Atomic facts | Graph nodes + edges | Static file |\n| **Cross-runtime** | 9 adapters, one corpus | Per-app SDK integration | Per-app SDK integration | Per-runtime config |\n| **Editable timeline** | Sessions can be superseded, retired, aborted | Append-only fact log | Graph edits | Manual file edits |\n| **R@5 baseline** | 97.2% on 14-month corpus | published figures vary | published figures vary | n/a |\n| **External deps** | SQLite + Ollama (local) | Postgres or Qdrant | Postgres | none |\n| **Hosted offering** | none — local only | yes | yes | n/a |\n| **Account required** | none | yes (cloud tier) | yes | none |\n| **Telemetry** | none | yes | yes | none |\n| **License** | Apache 2.0 | Apache 2.0 | Apache 2.0 | — |\n\nThe defining property is the editable timeline. mem0 and Letta append; NLM lets you reach back and mark a session as superseded by a newer one, retire one as no longer relevant, or flag one as aborted mid-flight. The next recall surfaces the corrected version, not the stale one. 
A claim from 6 months ago can be patched today.\n\n---\n\n## Docs\n\n- [docs/supersedence.md](docs/supersedence.md) — the editable timeline: statuses, what gets recorded when, how supersedence flows through recall\n- [docs/hooks.md](docs/hooks.md) — full hook lifecycle, modes, selection logic, pointer-block format, logging surface, daily liveness canary\n- [docs/methodology-recall-baseline.md](docs/methodology-recall-baseline.md) — how R@5 = 97.2% is measured + how to run LongMemEval-S on your own machine\n- [SECURITY.md](SECURITY.md) — threat model + responsible disclosure\n\n---\n\n## Development\n\n```sh\ngit clone https://github.com/pbmagnet4/nlm-memory\ncd nlm-memory\nnpm install\nnpm run build          # compile dist/ (build output; not tracked in git)\nnpm run dev            # hot-reload daemon\nnpm run ui:dev         # hot-reload UI at localhost:5173 (proxies /api to :3940)\nnpm test               # 726 tests across 73 files\nnpm run typecheck\n```\n\nArchitecture: hexagonal. `src/core/` knows about ports (interfaces), not adapters. `src/cli/nlm.ts` is the composition root — the only file that wires concrete implementations (`SqliteSessionStore`, `OllamaClient`, `Hono`, `StdioServerTransport`). Adapters in `src/core/adapters/` are one-way: they parse runtime-specific session formats into NLM's canonical shape; nothing in the runtime sees NLM.\n\n`dist/` is built by the `prepare` script (which npm runs on a local `npm install`, when installing from git, and before `npm publish`) and packed into the published tarball via the `files` field. It is not tracked in git.\n\n---\n\n## License\n\nApache 2.0 — see [LICENSE](LICENSE).\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fpbmagnet4%2Fnlm-memory","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fpbmagnet4%2Fnlm-memory","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fpbmagnet4%2Fnlm-memory/lists"}