https://github.com/pbmagnet4/nlm-memory

Local-first non-linear memory OS for AI operators. One index across Claude Code, Codex, Cursor, Windsurf, Hermes, OpenCode, Aider, pi, and more — with an editable timeline.
ai aider claude-code codex cursor deepseek hermes local-first mcp memory ollama opencode recall session-memory sqlite typescript windsurf
Last synced: about 2 months ago
Host: GitHub
URL: https://github.com/pbmagnet4/nlm-memory
Owner: pbmagnet4
License: apache-2.0
Created: 2026-05-19T14:42:57.000Z (2 months ago)
Default Branch: main
Last Pushed: 2026-06-10T02:43:16.000Z (about 2 months ago)
Last Synced: 2026-06-10T03:13:27.630Z (about 2 months ago)
Topics: ai, aider, claude-code, codex, cursor, deepseek, hermes, local-first, mcp, memory, ollama, opencode, recall, session-memory, sqlite, typescript, windsurf
Language: TypeScript
Homepage: https://www.npmjs.com/package/nlm-memory
Size: 2.46 MB
Stars: 4
Watchers: 0
Forks: 0
Open Issues: 0
Metadata Files:
- Readme: README.md
- License: LICENSE
- Security: SECURITY.md

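[Install](#install) · [Quick Start](#quick-start) · [Runtimes](#runtimes) · [How recall works](#how-recall-works) · [Signals](#agent-self-improvement-signals) · [MCP](#mcp-tools) · [REST API](#rest-api) · [Digest](#daily-digest) · [Config](#configuration) · [Security](#security) · [vs Alternatives](#vs-alternatives)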



---

`nlm-memory` is a local-first memory layer for AI coding agents. It indexes every session from Claude Code, Codex, OpenCode, Cursor, Windsurf, Hermes, Aider, and pi into a single searchable store on your machine. Three properties no other memory layer ships together:

1. **Cross-runtime reach.** One index, every adapter.

2. **Editable timeline.** Sessions can be superseded by newer ones; entities can be retired. Patch history retroactively — no other tool lets you do this. See [docs/supersedence.md](docs/supersedence.md).

3. **97.2% R@5 baseline.** On a 14-month corpus, keyword recall surfaces the right session in the top 5 on 97.2% of evaluator queries. No fine-tuning. The labels were generated by DeepSeek V4 — the retrieval algorithm is the same code path you'll run, but expect a lower number with a smaller local classifier. See [docs/methodology-recall-baseline.md](docs/methodology-recall-baseline.md).

Everything stays on your machine by default. No telemetry, no account. The classifier defaults to local (Ollama); if you opt into a cloud classifier (DeepSeek, OpenAI, Anthropic, OpenRouter, or any OpenAI-compatible endpoint), session transcripts are sent to that provider — see [Security](#security) for the exact data-flow.

---

## Install

```sh
npm install -g nlm-memory
nlm setup
```

`nlm setup` is the interactive first-run wizard. It picks your classifier + model, wires the runtimes you actually use, generates an `NLM_MCP_TOKEN`, hardens permissions on `~/.nlm/`, and installs the daemon supervisor for your platform.

| Platform | Daemon | Notes |
|---|---|---|
| **macOS** | LaunchAgent at `~/Library/LaunchAgents/com.github.pbmagnet4.nlm-memory.plist` | Auto-starts on login |
| **Linux** | systemd user unit at `~/.config/systemd/user/nlm.service` | Headless servers: `loginctl enable-linger $USER` so the daemon survives logout |
| **Windows** | Manual `nlm start` for now | Hook + MCP install paths are platform-aware; supervisor lands next release |

Stop or remove: `nlm uninstall`.

---

## Quick Start

After `nlm setup` finishes, open **http://localhost:3940/ui** — the daemon is running. A 30-second sanity check:

```sh
nlm recall "what was that pgvector decision"   # one-shot search from the shell
nlm digest                                      # yesterday's activity at a glance
nlm --version
```

---

## Runtimes

One corpus across every adapter. MCP works against eight of the nine runtimes below; Aider has no native MCP support and is ingest-only. **Automatic context injection via hooks** ships on four (Claude Code, Codex CLI, Hermes Agent, pi.dev). Cursor, Windsurf, and OpenCode accept an optional static rules-file nudge (the `--with-rules` flag on `nlm connect`, for example `nlm connect cursor --with-rules`) that instructs the agent to call `recall_sessions` on history-flavored prompts. Those runtimes don't expose a pre-prompt hook today, so a rules nudge is the closest equivalent. `nlm connect` wires whichever surface the runtime supports:

| Runtime | Connect | Sessions read from | Hooks |
|---|---|---|---|
| **Claude Code** | `nlm connect claude-code` | `~/.claude/projects/**/*.jsonl` | 6 hooks: UserPromptSubmit, SessionStart, SessionEnd, Stop, PreCompact, SubagentStart |
| **Codex CLI** | `nlm connect codex` | `~/.codex/sessions/` | Marketplace plugin (UserPromptSubmit + Stop) |
| **Hermes** (WebUI) | `nlm connect hermes` | `~/.hermes/sessions/` | MCP only (writes the MCP server block to `~/.hermes/config.yaml`) |
| **Hermes Agent** (NousResearch CLI) | `nlm connect hermes-agent` | `~/.hermes/state.db` | 6 hooks: pre_llm_call, post_llm_call, on_session_start/end/finalize/reset (Python plugin in `~/.hermes/plugins/nlm-memory/`) |
| **Cursor** | `nlm connect cursor [--with-rules]` | Cursor IDE chat DB | MCP + optional rules nudge (workspace `.cursor/rules/nlm-recall.mdc`) |
| **Windsurf** | `nlm connect windsurf [--with-rules]` | Windsurf user dir | MCP + optional rules nudge (`~/.codeium/windsurf/memories/global_rules.md`) |
| **OpenCode** | `nlm connect opencode [--with-rules]` | `~/.local/share/opencode/` | MCP + optional rules nudge (`~/.config/opencode/AGENTS.md`) |
| **Aider** | adapter active (ingest only) | `AIDER_CHAT_HISTORY_FILE` | No native MCP, no hooks; sessions are read into the corpus but Aider cannot call recall back |
| **pi.dev** | `nlm setup` (auto) or `nlm connect pi` | `~/.pi/agent/sessions/**/*.jsonl` | 1 hook: input (prepend via transform) |

`nlm disconnect` followed by the runtime name reverses any of the above.

---

## How recall works

Two delivery paths. They share the same index.

### 1. Hooks — automatic context injection

Hooks fire on user input and prepend a pointer block of likely-relevant prior sessions to the model's context. Four runtimes ship hooks today: Claude Code (six-hook lifecycle), Codex CLI (UserPromptSubmit + Stop via the marketplace plugin), Hermes Agent (six parallel hooks), and pi.dev (one `input` hook via [nlm/](nlm/README.md)). Cursor, Windsurf, and OpenCode don't expose a pre-prompt hook today, so the `--with-rules` install path drops a static rules snippet that asks the agent to call `recall_sessions` itself on history-flavored prompts. The snippet, the full lifecycle, modes, logging surface, and the daily liveness canary are documented in [docs/hooks.md](docs/hooks.md).

**Claude Code** — six hooks written to `~/.claude/settings.json` via `nlm connect claude-code`:

| Event | What NLM does | Mode |
|---|---|---|
| **UserPromptSubmit** | Score the prompt, silently prepend pointer block listing 0–3 most likely-relevant prior sessions | live by default |
| **SessionStart** | Cold-start agents (cron, background) hit this; same pointer-block delivery without a user prompt, plus a "Known failure modes for this repo" block when signals data exists (see Signals below) | live by default |
| **SessionEnd** | Delete the per-conversation memo on session close so state files don't accumulate | always on |
| **Stop** | Scan the model's response for citations of surfaced session IDs → updates `useful_hit_rate` and builds the reranker training substrate | always on |
| **PreCompact** | Flush the per-conversation surfaced-IDs memo so post-compaction recalls aren't gated | always on |
| **SubagentStart** | Record parent→subagent links so threads stay coherent across dispatches | always on |

**Hermes Agent** — six hooks installed to `~/.hermes/plugins/nlm-memory/` via `nlm connect hermes-agent`. All calls are fire-and-forget except `pre_llm_call`, which returns a context string for injection:

| Hook | What NLM does | Blocks turn? |
|---|---|---|
| **pre_llm_call** | POST prompt to daemon → inject pointer block of relevant prior sessions into context | yes (returns context string) |
| **post_llm_call** | POST assistant response to daemon → citation scan, `useful_hit_rate` update | no |
| **on_session_start** | Signal session open → daemon initialises per-session memo | no |
| **on_session_end** | Signal session close → daemon flushes memo | no |
| **on_session_finalize** | Signal transcript finalised → triggers async classifier run | no |
| **on_session_reset** | Signal session reset → daemon clears per-session state | no |

**pi.dev** — one `input` hook registered via `nlm connect pi`. Pi's extension API only exposes `input`, so transcript ingestion is handled separately by the passive adapter scanning `~/.pi/agent/sessions/`:

| Hook | What NLM does | Blocks turn? |
|---|---|---|
| **input** | Score user message → if relevant sessions found, prepend pointer block to the prompt text via `{ action: "transform" }` | yes (returns transformed text) |

All three hook sets above fail open: any daemon error yields a clean exit and never blocks the model. Switch Claude Code hooks to **shadow** mode (log-only, no injection) with `NLM_HOOK_MODE=shadow`.
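
The fail-open contract can be pictured as a small wrapper around the daemon call. This is a minimal TypeScript sketch, not the project's hook code: `failOpen`, the `recall` callback, and the 500 ms budget are hypothetical names and values chosen for illustration.

```typescript
// Hypothetical sketch of a fail-open wrapper; not nlm-memory's actual hook code.
// `recall` stands in for whatever call the hook makes to the daemon.
async function failOpen(
  recall: () => Promise<string>,
  timeoutMs = 500, // assumed budget, not a documented default
): Promise<string> {
  let timer: ReturnType<typeof setTimeout> | undefined;
  const timeout = new Promise<string>((_, reject) => {
    timer = setTimeout(() => reject(new Error("recall timed out")), timeoutMs);
  });
  try {
    // Whichever settles first wins: the daemon's answer or the timeout.
    return await Promise.race([recall(), timeout]);
  } catch {
    // Any failure (daemon down, bad response, timeout) injects nothing.
    return "";
  } finally {
    if (timer !== undefined) clearTimeout(timer);
  }
}
```

An empty string means the prompt goes through unchanged, which is why a broken hook is invisible to the model and needs the daily digest canary to be noticed.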

### 2. MCP — explicit tools any agent can call

Container-hosted agents (Hermes WebUI, Codex CLI, etc.) hit the Streamable-HTTP `POST /mcp` endpoint with `Authorization: Bearer ${NLM_MCP_TOKEN}`. Stdio MCP is also supported for Claude Code via `~/.mcp.json`.

---

## Agent self-improvement signals

NLM can ingest structured feedback events (`nlm.signal`) from any tool in your agent stack -- quality gates, eval runners, code reviewers, test harnesses -- and surface the aggregated failure patterns back to the agent at session start. Over time the agent learns which steps tend to fail for a given repo and model, without any external service.

### Payload contract

```jsonc
{
  "v": 1,
  "kind": "gate" | "eval" | "review" | "test",  // required
  "outcome": "pass" | "fail" | "fix" | "exhausted",  // required
  "producer": "quality-gate",   // defaults to "unknown"
  "model": "qwen3-coder",       // defaults to "unknown"
  "repo": "/path/or/name",      // defaults to "unknown"
  "detail": { "step": "types", "files": ["a.ts"], "attempt": 2 },
  "session": "",
  "ts": "2026-06-09T18:00:00.000Z"  // defaults to now
}
```

`kind` and `outcome` are the only required fields; invalid values are rejected with `400`. `install_scope` is stamped server-side -- do not send it.
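
For producers written in TypeScript, the contract above can be mirrored as a type plus a validator. The sketch below is illustrative only: `parseSignal`, `NlmSignal`, and the choice to treat empty strings as missing are assumptions, and the daemon's real validation may differ.

```typescript
// Illustrative validator for the payload contract above; the daemon's real
// validation code may differ. Only `kind` and `outcome` are required.
const SIGNAL_KINDS = ["gate", "eval", "review", "test"] as const;
const SIGNAL_OUTCOMES = ["pass", "fail", "fix", "exhausted"] as const;

type SignalKind = (typeof SIGNAL_KINDS)[number];
type SignalOutcome = (typeof SIGNAL_OUTCOMES)[number];

interface NlmSignal {
  v: 1;
  kind: SignalKind;
  outcome: SignalOutcome;
  producer: string;
  model: string;
  repo: string;
  detail?: Record<string, unknown>;
  session?: string;
  ts: string;
}

type ParseResult =
  | { ok: true; signal: NlmSignal }
  | { ok: false; status: 400; error: string };

function parseSignal(input: unknown, now: Date = new Date()): ParseResult {
  if (typeof input !== "object" || input === null) {
    return { ok: false, status: 400, error: "payload must be an object" };
  }
  const raw = input as Record<string, unknown>;
  const kind = SIGNAL_KINDS.find((k) => k === raw.kind);
  const outcome = SIGNAL_OUTCOMES.find((o) => o === raw.outcome);
  if (kind === undefined) return { ok: false, status: 400, error: "invalid kind" };
  if (outcome === undefined) return { ok: false, status: 400, error: "invalid outcome" };

  // Assumption: an empty string is treated the same as a missing field.
  const str = (value: unknown, fallback: string): string =>
    typeof value === "string" && value.length > 0 ? value : fallback;

  return {
    ok: true,
    signal: {
      v: 1,
      kind,
      outcome,
      producer: str(raw.producer, "unknown"),
      model: str(raw.model, "unknown"),
      repo: str(raw.repo, "unknown"),
      detail:
        typeof raw.detail === "object" && raw.detail !== null
          ? (raw.detail as Record<string, unknown>)
          : undefined,
      session: typeof raw.session === "string" ? raw.session : undefined,
      ts: str(raw.ts, now.toISOString()),
    },
  };
}
```

Unknown fields such as `install_scope` are dropped by this sketch rather than rejected, which matches the advice not to send them.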

### Transports

**HTTP (any producer)**

```js
// Example values; substitute your own model and repo identifiers.
const model = "qwen3-coder";
const repo = "/path/or/name";

await fetch("http://localhost:3940/api/signal", {
  method: "POST",
  headers: { "content-type": "application/json" },
  body: JSON.stringify({
    kind: "gate", producer: "my-tool", outcome: "fail",
    model, repo, detail: { step: "types" }, ts: new Date().toISOString(),
  }),
});
```

Rides the standard `/api/*` loopback gate. When the daemon runs with `NLM_UI_AUTH=cookie`, send `Authorization: Bearer $NLM_MCP_TOKEN`.

**Session-embedded (pi.dev)**

Pi producers call `pi.appendEntry("nlm.signal", payload)` inside an extension. This writes:

```json
{ "type": "custom", "customType": "nlm.signal", "data": { ...payload } }
```

to the session `.jsonl`. NLM's pi adapter recognises the `nlm.signal` customType and the scheduler drains it during normal ingest -- no HTTP call required.
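
To make the session-embedded transport concrete, here is a rough sketch of how a reader could pull those entries back out of a session file. It relies only on the entry shape shown above; `extractSignals` is a hypothetical helper, not the pi adapter's actual code.

```typescript
// Sketch only: pulls `nlm.signal` payloads out of a pi session `.jsonl` body.
// The entry shape follows the example above; everything else is assumed.
function extractSignals(jsonl: string): unknown[] {
  const payloads: unknown[] = [];
  for (const line of jsonl.split("\n")) {
    const trimmed = line.trim();
    if (trimmed === "") continue;
    let entry: unknown;
    try {
      entry = JSON.parse(trimmed);
    } catch {
      continue; // tolerate a partially written last line
    }
    if (typeof entry !== "object" || entry === null) continue;
    const record = entry as Record<string, unknown>;
    if (record.type === "custom" && record.customType === "nlm.signal") {
      payloads.push(record.data);
    }
  }
  return payloads;
}
```

Skipping unparseable lines keeps the reader safe against a session file that is still being appended to.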

### Failure-mode recall

NLM aggregates signals into failure modes per `(repo, model)` pair. A mode surfaces when its fail-rate reaches 20% or higher over at least 10 events in a trailing 14-day window. At session start the Claude Code `SessionStart` hook injects a "Known failure modes for this repo" block into the agent prompt automatically (repo-scoped). Any harness can fetch the same data directly:

```
GET /api/signals/failure-modes?repo=&model=
```
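
The surfacing rule described above (fail-rate of at least 20% over at least 10 events in a trailing 14-day window) can be written down as a small predicate. This is a sketch under stated assumptions: it evaluates one already-grouped bucket of events, and it counts only `fail` outcomes toward the fail-rate, which the text does not specify.

```typescript
// Hypothetical sketch of the surfacing threshold; not the daemon's aggregator.
interface SignalEvent {
  outcome: "pass" | "fail" | "fix" | "exhausted";
  ts: string; // ISO-8601 timestamp
}

const WINDOW_DAYS = 14;
const MIN_EVENTS = 10;
const MIN_FAIL_RATE = 0.2;

function surfacesAsFailureMode(events: SignalEvent[], now: Date): boolean {
  const cutoff = now.getTime() - WINDOW_DAYS * 24 * 60 * 60 * 1000;
  // Keep only events inside the trailing window; unparseable dates drop out.
  const recent = events.filter((e) => {
    const t = Date.parse(e.ts);
    return t >= cutoff && t <= now.getTime();
  });
  if (recent.length < MIN_EVENTS) return false;
  // Assumption: only `fail` counts toward the fail-rate.
  const failures = recent.filter((e) => e.outcome === "fail").length;
  return failures / recent.length >= MIN_FAIL_RATE;
}
```

The minimum-event floor keeps a single failed run in a quiet repo from being reported as a pattern.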

### Inspection

```sh
nlm improve   # prints failure modes + recommendations for the current repo
```

The UI **Recall** page also has a failure-modes panel. NLM surfaces findings and recommendations only -- it never acts on them.

### Configuration and privacy

| Var | Default | What |
|---|---|---|
| `NLM_SIGNALS_ENABLED` | `1` (on) | Set to `0` to disable signal ingest entirely |
| `NLM_SIGNAL_RETENTION_DAYS` | `90` | Raw signals older than this are pruned |

Signals are local-only. They are stamped with a per-install ID from `~/.nlm/install-id` and never leave the machine.

### Reference producer

The pi `quality-gate` extension (in the `pi-sandbox` repo) is a ~10-line integration that emits `nlm.signal` per gate step and again on retry exhaustion. It is the canonical example of the session-embedded transport.

---

## MCP Tools

| Tool | What it does |
|---|---|
| `recall_sessions` | Hybrid keyword+semantic search across the full session corpus. Returns label, started_at, snippet, match score. |
| `get_session` | Full body of one session by ID. Includes enriched `supersedes` / `supersededBy` links (id + label + summary) so chasing corrected facts doesn't need a second round-trip. |
| `recall_facts` | Search structured facts: decisions, open questions, project state. Filterable by entity and kind. |
| `get_fact_history` | Full version history of one fact: how a decision evolved over time. |
| `cite_session` | Mark a session as explicitly referenced. Drives the `useful_hit_rate` metric and the future learned reranker. |
| `mark_superseded` | Retroactively retire a stale session and point it at the newer one that replaces it. This is the editable-timeline write path; see [docs/supersedence.md](docs/supersedence.md). |

**When to call `cite_session`:** Call it when a surfaced session actually changes what you say: you referenced it explicitly, it corrected a decision, or you called `get_session` to read the full body. Do not call it for sessions you scanned and discarded. The Stop hook auto-detects citation when a session ID appears verbatim in your response; `cite_session` covers the deliberate case where the session influenced your reasoning without being quoted directly. Both paths feed the same signal loop.
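
The automatic half of citation tracking (the Stop hook spotting a session ID that appears verbatim in the response) amounts to a substring scan. A minimal sketch follows; `findCitedSessions` and the `sess_…` ID format in the usage lines are made up for illustration and say nothing about the real ID scheme.

```typescript
// Hypothetical sketch of the Stop-hook citation scan: a surfaced session
// counts as cited when its ID appears verbatim in the model's response.
function findCitedSessions(response: string, surfacedIds: string[]): string[] {
  return surfacedIds.filter((id) => id.length > 0 && response.includes(id));
}

// Usage with invented IDs: only the session the response names is returned.
const exampleCited = findCitedSessions(
  "As decided in sess_a1, we keep pgvector.",
  ["sess_a1", "sess_b2"],
);
```

A scan like this cannot see influence that is never quoted, which is the gap the explicit `cite_session` tool is described as covering.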

---

## REST API

Daemon binds `127.0.0.1:3940` (override with `NLM_PORT`). Selected endpoints:

| Method | Path | Auth | Purpose |
|---|---|---|---|
| GET | `/api/health` | Host-only | Liveness probe; returns `{version, status, service}` |
| GET | `/api/recall` | Bearer/Origin | Hybrid recall: `?q=`, `?mode=keyword\|semantic\|hybrid`, `?limit=` |
| GET | `/api/recall/stats` | Bearer/Origin | 7-day stats: total, hit_rate, useful_hit_rate, top queries |
| GET | `/api/recall/recent` | Bearer/Origin | Last N recall events for live tail/telemetry |
| GET | `/api/recall/cite-stats` | Bearer/Origin | Citation rate over `?days=` |
| GET | `/api/session/:id` | Bearer/Origin | Full session body + supersedence links |
| GET | `/api/recall/facts` | Bearer/Origin | Structured fact search |
| GET | `/api/facts/history` | Bearer/Origin | Version chain for one fact |
| GET | `/api/dataset` | Bearer/Origin | Full session list for the UI dataset view |
| GET | `/api/live/recent-writes` | Bearer/Origin | Live tail of ingested sessions |
| GET | `/api/data/backup` | Bearer/Origin | Streaming SQLite snapshot download |
| POST | `/api/data/restore` | Bearer/Origin | Stage a snapshot for apply-on-restart |
| POST | `/api/hook/pre-compact` | Bearer/Origin | Hook endpoint; flushes the surfaced-IDs memo |
| ALL | `/mcp` | Bearer required | Streamable-HTTP MCP transport for container agents |

`/api/*` is gated by three layers: 127.0.0.1 Host check (defeats DNS rebinding), Origin check when the browser sends one (defeats cross-origin drive-by), Bearer fallback when Origin is absent (server-to-server clients).
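
The three layers can be read as a short decision function. The sketch below is an interpretation of the sentence above, not the daemon's middleware: `allowApiRequest` and `GateInput` are hypothetical names, accepting `localhost` alongside `127.0.0.1` is an assumption, and the plain string comparison stands in for the constant-time comparison described under Security.

```typescript
// Hypothetical sketch of the /api/* gate; the real middleware may differ.
interface GateInput {
  host?: string;          // Host header
  origin?: string;        // Origin header, if the browser sent one
  authorization?: string; // Authorization header
}

function allowApiRequest(req: GateInput, port: number, token: string): boolean {
  // Layer 1: Host must name the loopback listener (defeats DNS rebinding).
  const allowedHosts = [`127.0.0.1:${port}`, `localhost:${port}`];
  if (req.host === undefined || !allowedHosts.some((h) => h === req.host)) {
    return false;
  }

  // Layer 2: when a browser sends Origin, it must be the daemon's own origin.
  if (req.origin !== undefined) {
    return allowedHosts.some((h) => req.origin === `http://${h}`);
  }

  // Layer 3: no Origin means a non-browser client, which must present the bearer.
  // Simplified: a real check should use a constant-time comparison.
  return req.authorization === `Bearer ${token}`;
}
```

The ordering matters: the Host check runs first so a rebinding attempt is rejected even when it carries a plausible Origin or token.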

---

## Daily digest

Once-a-day summary of yesterday's activity:

```sh
nlm digest                  # print to stdout
nlm digest --telegram       # post to Telegram (TELEGRAM_BOT_TOKEN + TELEGRAM_CHAT_ID)
```

Reports 24h real-traffic (probes filtered), 7d hit_rate + useful_hit_rate, top 5 queries, and a **`WARN hook silent`** alert when Claude Code ran yesterday but no live hook fires were logged. That alert is the canary for post-install drift — node upgrades, `settings.json` hand-edits, and `dist/` moves silently break the hook while Claude Code keeps working. Setup-time smoke tests can't catch this; only the daily correlation can.

Wire to cron for a morning push:

```cron
0 7 * * *  nlm digest --telegram >> ~/.nlm/logs/digest.log 2>&1
```

When the daemon is unreachable, `--telegram` still fires — posts a "daemon unreachable" alert instead of failing silently.

---

## What's inside the UI

Open `http://localhost:3940/ui` after the daemon starts.

| Page | What it shows |
|---|---|
| **Live** | Sessions being written in real time, recent reads, recent decisions |
| **Pulse** | System health: coherence, runtimes, stale entities, recent sessions |
| **River** | Full session timeline with density controls + superseded-lane visualization |
| **Thread** | Per-entity conversation history with runtime filters and ←/→ navigation |
| **Search** | Keyword, semantic, or hybrid recall with match snippets and field-origin tags |
| **Recall** | Adoption telemetry: useful_hit_rate, source breakdown, query log |
| **Settings** | Sources, providers, classifier, data backup/restore |

---

## Pipeline

What happens when an AI runtime writes a session and you later recall it:

```
ingest:  runtime transcript (jsonl/sqlite)
   -> adapter parses runtime-specific format
   -> classifier (Ollama local by default; DeepSeek / OpenAI / Anthropic / OpenRouter / OpenAI-compatible if you opt in) extracts label + entities + decisions + open questions
   -> embedder (nomic-embed-text via Ollama) computes 768-dim vector
   -> SQLite canonical store + FTS5 keyword index + sqlite-vec ANN index

recall: prompt / query
   -> tokenize + match scoring (label x3, entity-exact x4, decision x2, summary x1, phrase-bonus +5)
   -> hybrid: BM25-style keyword + vector cosine, fused by score
   -> select-top-N gate (per-fire cap 3, per-conversation cap 10)
   -> pointer block prepended to model context (hooks) or returned as tool result (MCP)
```
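
To illustrate two of the recall steps, the sketch below applies the field weights from the pipeline (label x3, entity-exact x4, decision x2, summary x1, phrase bonus +5) and the two caps (3 per fire, 10 per conversation). Everything beyond those numbers is assumed: the tokenizer, the `IndexedSession` shape, the rule that the phrase bonus needs a multi-word query, and both function names are hypothetical, and the vector half of hybrid recall is left out.

```typescript
// Hypothetical sketch of keyword scoring and the top-N gate; not the real scorer.
interface IndexedSession {
  id: string;
  label: string;
  entities: string[];
  decisions: string[];
  summary: string;
}

const tokenize = (text: string): string[] =>
  text.toLowerCase().split(/[^a-z0-9]+/).filter((t) => t.length > 0);

function keywordScore(query: string, s: IndexedSession): number {
  const labelTerms = new Set(tokenize(s.label));
  const entityTerms = new Set(s.entities.map((e) => e.toLowerCase()));
  const decisionTerms = new Set(tokenize(s.decisions.join(" ")));
  const summaryTerms = new Set(tokenize(s.summary));
  let score = 0;
  for (const term of tokenize(query)) {
    if (labelTerms.has(term)) score += 3;    // label x3
    if (entityTerms.has(term)) score += 4;   // entity-exact x4
    if (decisionTerms.has(term)) score += 2; // decision x2
    if (summaryTerms.has(term)) score += 1;  // summary x1
  }
  // Assumed phrase rule: a multi-word query found verbatim earns the +5 bonus.
  const phrase = query.trim().toLowerCase();
  const haystack = `${s.label} ${s.summary}`.toLowerCase();
  if (phrase.includes(" ") && haystack.includes(phrase)) score += 5;
  return score;
}

function selectTopN(
  scored: { id: string; score: number }[],
  alreadySurfaced: Set<string>, // per-conversation memo of surfaced IDs
  perFire = 3,
  perConversation = 10,
): string[] {
  const budget = Math.min(perFire, perConversation - alreadySurfaced.size);
  if (budget <= 0) return [];
  return scored
    .filter((h) => h.score > 0 && !alreadySurfaced.has(h.id))
    .sort((a, b) => b.score - a.score)
    .slice(0, budget)
    .map((h) => h.id);
}
```

In this reading, the per-conversation memo is what the PreCompact hook flushes, so that recalls after compaction are not gated by IDs surfaced earlier.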

---

## Configuration

### Environment variables

| Var | Default | What |
|---|---|---|
| `NLM_PORT` | `3940` | Daemon bind port (loopback only) |
| `NLM_DB_PATH` | `~/.nlm/canonical.sqlite` | SQLite canonical store location |
| `NLM_HOOK_MODE` | `live` | `live` injects pointer block; `shadow` logs without injecting |
| `NLM_HOOK_LOG` | `~/.nlm/hook-log.jsonl` | Hook fire log; powers digest's liveness alert |
| `NLM_USEFUL_HIT_LOG` | `~/.nlm/useful-hit-log.jsonl` | Citation/useful-hit ledger |
| `NLM_QUERY_LOG` | `~/.nlm/query-log.jsonl` | Recall query telemetry |
| `NLM_CITATION_LOG` | `~/.nlm/citation-log.jsonl` | Stop-hook citation events |
| `NLM_MISS_LOG` | `~/.nlm/miss-log.jsonl` | Stop-hook miss events: sessions the agent explicitly fetched via `get_session`/`cite_session` that the hook never surfaced. Reviewed via `nlm misses`. |
| `NLM_MISS_LOG_ENABLED` | `true` | Set to `0` to disable miss-log emission entirely. |
| `NLM_MCP_TOKEN` | auto-generated | 256-bit bearer for `/api/*` (non-browser) and `/mcp` |
| `NLM_MCP_CONFIG` | `~/.mcp.json` | Path the `connect`/`disconnect` commands modify |
| `NLM_CLASSIFIER` | `ollama` | `ollama` (local, default), `deepseek`, `openai`, `anthropic`, `openrouter`, or `openai-compatible` |
| `NLM_CLASSIFIER_MODEL` | `qwen3:4b-instruct-2507-q4_K_M` | Model id for the chosen provider. See [classifier bench](reports/classifier-comparison/2026-06-02-deepseek-v4-vs-qwen3.md) for why this is the recommended local default. |
| `NLM_OLLAMA_URL` | `http://localhost:11434` | Override Ollama endpoint |
| `NLM_ADAPTERS` | all | Comma-separated allowlist of adapters to enable |
| `DEEPSEEK_API_KEY` | (unset) | Required only when classifier=deepseek |
| `NLM_DISABLE_UPDATE_CHECK` | (unset) | Set to `1` to disable the daily npm-registry update check |
| `NLM_RECALL_DECAY_HALF_LIFE_DAYS` | `180` | Half-life of the recency multiplier applied to recall scores. Older sessions score lower; defaults to 6 months. Set to `0` to disable recency weighting entirely. |
| `NLM_RECALL_DECAY_FLOOR` | `0.25` | Lower bound on the recency multiplier: even ancient sessions retain at least 25% of their raw score so a perfect-match old session can still surface. |
| `NLM_RECALL_REWRITE_DEFAULT` | `true` | Default value for the MCP `recall_sessions` `rewrite` parameter. When true, the service runs an LLM rewrite on vague natural-language queries before search. The HTTP hook caller bypasses rewrite regardless (hot-path protection). |
| `NLM_RECALL_REWRITE_TIMEOUT_MS` | `5000` | Per-call timeout for the rewrite LLM. Separate from the classifier timeout. |
| `NLM_FACT_CORROBORATION_BOOST_CAP` | `2.0` | Maximum multiplicative boost applied to fact recall scores based on how many sessions corroborate the same `(subject, predicate, value)`. Log-scale: 1 corroboration is 1.0×, 10 is 2.0× (capped). Set to `1.0` to disable the boost; the count is still returned on each hit. |
| `NLM_HOOK_INJECT_FACTS` | `true` | Whether to attach high-confidence facts about top-hit entities to the pointer block injected by the hook. Set to `0` to disable globally. |
| `NLM_HOOK_FACT_LIMIT` | `5` | Maximum number of facts in the "Known facts" section of the pointer block. |
| `NLM_HOOK_FACT_MIN_CORROBORATION` | `2` | Minimum number of sessions that must have asserted a fact before it qualifies for hook injection. Set to `1` to include single-source facts. |
| `NLM_HOOK_FACT_MIN_CONFIDENCE` | `0.7` | Minimum classifier confidence for a fact to qualify for hook injection. |
| `TELEGRAM_BOT_TOKEN` / `TELEGRAM_CHAT_ID` | (unset) | Required for `nlm digest --telegram` |
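
The recency and corroboration knobs are easier to reason about as formulas. The sketch below is one reading of the table, not the project's scoring code: it assumes exponential half-life decay clamped at the floor, and a `1 + log10(n)` boost, which is the simplest curve that gives 1.0× at one corroboration and 2.0× at ten. Both function names are hypothetical.

```typescript
// Hypothetical formulas consistent with the documented defaults and endpoints.
function recencyMultiplier(
  ageDays: number,
  halfLifeDays = 180, // NLM_RECALL_DECAY_HALF_LIFE_DAYS
  floor = 0.25,       // NLM_RECALL_DECAY_FLOOR
): number {
  if (halfLifeDays <= 0) return 1; // 0 disables recency weighting
  // Assumed shape: the multiplier halves every `halfLifeDays`.
  const decayed = Math.pow(0.5, Math.max(0, ageDays) / halfLifeDays);
  return Math.max(floor, decayed);
}

function corroborationBoost(
  sessionCount: number,
  cap = 2.0, // NLM_FACT_CORROBORATION_BOOST_CAP
): number {
  // Assumed log-scale curve: 1 session -> 1.0x, 10 sessions -> 2.0x.
  const raw = 1 + Math.log10(Math.max(1, sessionCount));
  return Math.min(cap, raw);
}
```

Under these assumptions a session reaches the default floor after two half-lives (360 days), and a cap of `1.0` collapses the boost to a constant 1.0×.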

### Changing the classifier from the UI

The CLI env-vars above are one path; the running UI is the other. **Settings → Providers** is a full CRUD list of LLM endpoints (Ollama, DeepSeek, OpenAI, Anthropic, OpenRouter, or any OpenAI-compatible endpoint such as LM Studio, llama.cpp, vLLM, text-generation-webui). Click **Add provider**, point it at your local server, hit **Save & test**. Then in **Settings → Classifier**, pick that provider and model. No env-var editing required, and the DeepSeek row can be disabled or deleted from the same screen.

Adapter source paths can be overridden individually: `NLM_CLAUDE_PROJECTS_PATH`, `NLM_CODEX_CONFIG`, `NLM_CURSOR_DB_PATH`, `NLM_HERMES_SESSIONS_PATH`, `NLM_HERMES_AGENT_DB_PATH`, `NLM_WINDSURF_USER_DIR`, `OPENCODE_DB_PATH`, `PI_SESSIONS_PATH`, `AIDER_CHAT_HISTORY_FILE`.

### Config file

`~/.nlm/.env` — autoloaded by every CLI command. Mode `0600`, owned by you, never readable by other users. The setup wizard writes the initial keys; you can edit it directly.

### Ports

| Port | Process | Bind | Override |
|---|---|---|---|
| `3940` | Daemon HTTP API + MCP | `127.0.0.1` only | `NLM_PORT` |
| `11434` | Ollama (embedding + local classifier) | localhost | `NLM_OLLAMA_URL` |

---

## Security

NLM is local-first by design, but "local-first" is not "local-only" — read this section before picking a classifier.

**Daemon hardening (always on):**

- Binds to `127.0.0.1` only — never `0.0.0.0`

- Enforces Host + Origin checks on `/api/*` to defeat DNS rebinding and cross-origin drive-by

- Generates a 256-bit `NLM_MCP_TOKEN` on first run, persists to `~/.nlm/.env` (mode `0600`); non-browser clients authenticate with `Authorization: Bearer ${NLM_MCP_TOKEN}` compared with `timingSafeEqual`

- Recursively enforces `0700` on `~/.nlm/` and `0600` on its contents on every start

- Optional opt-in UI cookie auth (`NLM_UI_AUTH=cookie`) with HMAC-derived cookie value and nonce-based bootstrap (token never appears in a URL)
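
For readers who want to see what a 256-bit token and a `timingSafeEqual` comparison look like in Node, here is a small sketch. It uses only `node:crypto` calls that exist (`randomBytes`, `createHash`, `timingSafeEqual`); the hex encoding, the SHA-256 pre-hash, and the function names are assumptions, not the daemon's code.

```typescript
import { createHash, randomBytes, timingSafeEqual } from "node:crypto";

// 32 random bytes = 256 bits. Hex encoding is an assumption for illustration.
function generateToken(): string {
  return randomBytes(32).toString("hex");
}

// Hypothetical bearer check using a constant-time comparison.
function bearerMatches(authorization: string | undefined, token: string): boolean {
  const prefix = "Bearer ";
  if (authorization === undefined || !authorization.startsWith(prefix)) return false;
  const presented = authorization.slice(prefix.length);
  // timingSafeEqual throws when lengths differ, so compare fixed-size digests.
  const a = createHash("sha256").update(presented).digest();
  const b = createHash("sha256").update(token).digest();
  return timingSafeEqual(a, b);
}
```

Hashing both sides first is one common way to satisfy the equal-length requirement without leaking the token length through an early return.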

**Outbound network traffic — exhaustive list:**

| Destination | When | What data leaves |
|---|---|---|
| Configured classifier endpoint | Every new session is classified | Up to ~30K chars of the session transcript (prompts, responses, code snippets, and anything else in the transcript). If the classifier is **Ollama (default)** the destination is `localhost:11434` and nothing leaves the machine. If you opted into DeepSeek / OpenAI / Anthropic / OpenRouter / any OpenAI-compatible endpoint, the transcript is POSTed to that vendor. |
| Ollama `localhost:11434` | Every new session | 768-dim embedding request (local) |
| `registry.npmjs.org` | Once per 24h | Anonymous `GET /nlm-memory/latest` for update notifications. Cached at `~/.nlm/update-check.json`. Disable with `NLM_DISABLE_UPDATE_CHECK=1`. |
| `api.telegram.org` | Only when `nlm digest --telegram` is invoked | Digest content |
| AI runtime transcript files | Continuous | Read-only filesystem reads |

No analytics SDK. No crash reporter. No vendor ping beyond the first four rows above; the last row is local filesystem reads, not network traffic.

**Honest caveats — known limitations:**

- **Cloud-classifier data egress.** A "cloud" classifier (DeepSeek, OpenAI, Anthropic, OpenRouter) by definition sees your session content. Anything pasted into a transcript — API keys, client names, internal URLs — is sent to that vendor under their data-use terms. The setup wizard warns you before you pick one. The default is Ollama for this reason.

- **Provider API keys are stored in plaintext in SQLite today** (`providers.api_key`, in `~/.nlm/canonical.sqlite`). The file is `0600`, in a `0700` directory, owned by your user — so any process running as your user can read it. OS-keychain migration is on the roadmap; until then, treat the SQLite file like a `.env` file.

- **The classifier is fed untrusted indexed content.** Sessions written by AI runtimes can contain prompt-injection attempts. The classifier output is structured (label, entities, decisions, open questions) and never executed, but if you wire NLM to an agent that *acts* on classifier output, model that as untrusted input.

- **The hook fails open.** Any error in the recall hook yields a clean exit so it can't block your model. This means a silently-broken hook is possible — the daily digest's `WARN hook silent` canary is the detection path.

Report vulnerabilities via [SECURITY.md](SECURITY.md).

### Remote access

The daemon binds to `127.0.0.1`. If you want to reach the UI from another device — phone, second laptop — don't change the bind. Put a tunnel in front instead.

**Tailscale (recommended for personal use).** Run once on the daemon host:

```sh
tailscale serve --bg http://localhost:3940
```

Then visit `https://..ts.net/ui/` from any tailnet device. Tailscale Serve rewrites the upstream `Host` header to `localhost:3940`, so the loopback check passes without any nlm-memory config. WireGuard + your tailnet ACLs are the auth layer — for a single-user tailnet this is strictly stronger than `NLM_UI_AUTH=cookie`, so leave that off.

**If you do enable `NLM_UI_AUTH=cookie`** (defense in depth, or you've added untrusted devices to your tailnet), bootstrapping a cookie from a remote device needs one extra step. `nlm ui` only opens a browser on the daemon host; for the remote browser:

```sh
ssh  'nlm ui --print'   # mints a nonce, prints the URL
# Paste the URL into the remote browser within ~60s (nonce TTL)
```

**Do not expose the daemon directly to the public internet.** The cookie is a shared-HMAC speed bump, not real public-internet auth. If you absolutely must, put it behind something with real authentication (Cloudflare Access, Tailscale Funnel with auth in front, etc.).

---

## Upgrading from v0.4.x

```sh
npm update -g nlm-memory
```

Old installs have `NLM_HOOK_MODE=shadow` hardcoded in `~/.claude/settings.json` — shadow mode is silent, so re-run `nlm hook install` to switch to live recall injection. Permissions and `NLM_MCP_TOKEN` self-heal on the next `nlm start`.

---

## vs Alternatives

| | **nlm-memory** | mem0 | Letta / MemGPT | Built-in (`CLAUDE.md`) |
|---|---|---|---|---|
| **Unit of memory** | Whole session + extracted markers | Atomic facts | Graph nodes + edges | Static file |
| **Cross-runtime** | 9 adapters, one corpus | Per-app SDK integration | Per-app SDK integration | Per-runtime config |
| **Editable timeline** | Sessions can be superseded, retired, aborted | Append-only fact log | Graph edits | Manual file edits |
| **R@5 baseline** | 97.2% on 14mo corpus | published figures vary | published figures vary | n/a |
| **External deps** | SQLite + Ollama (local) | Postgres or Qdrant | Postgres | none |
| **Hosted offering** | none (local only) | yes | yes | n/a |
| **Account required** | none | yes (cloud tier) | yes | none |
| **Telemetry** | none | yes | yes | none |
| **License** | Apache 2.0 | Apache 2.0 | Apache 2.0 | n/a |

The defining property is the editable timeline. mem0 and Letta append; NLM lets you reach back and mark a session as superseded by a newer one, retire one as no-longer-relevant, or flag one as aborted-mid-flight. The next recall surfaces the corrected version, not the stale one. A claim from 6 months ago can be patched today.
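
One way to picture how a recall can land on the corrected session is to follow supersedence links until they run out. The sketch below is an illustration only: `TimelineSession` reduces the `supersededBy` link to a bare ID, and `resolveCurrent` is a hypothetical helper rather than the project's recall path (see docs/supersedence.md for the real behavior).

```typescript
// Hypothetical sketch: walk supersededBy links to the newest session.
interface TimelineSession {
  id: string;
  supersededBy?: string; // ID of the newer session, if any
}

function resolveCurrent(id: string, sessions: Map<string, TimelineSession>): string {
  const seen = new Set<string>();
  let current = id;
  // The `seen` set stops the walk if the links ever form a cycle.
  while (!seen.has(current)) {
    seen.add(current);
    const next = sessions.get(current)?.supersededBy;
    if (next === undefined || !sessions.has(next)) break;
    current = next;
  }
  return current;
}
```

With links `a → b → c`, resolving `a` yields `c`, so a claim recorded in `a` is answered from the session that replaced it.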

---

## Docs

- [docs/supersedence.md](docs/supersedence.md) — the editable timeline: statuses, what gets recorded when, how supersedence flows through recall

- [docs/hooks.md](docs/hooks.md) — full hook lifecycle, modes, selection logic, pointer-block format, logging surface, daily liveness canary

- [docs/methodology-recall-baseline.md](docs/methodology-recall-baseline.md) — how R@5 = 97.2% is measured + how to run LongMemEval-S on your own machine

- [SECURITY.md](SECURITY.md) — threat model + responsible disclosure

---

## Development

```sh
git clone https://github.com/pbmagnet4/nlm-memory
cd nlm-memory
npm install
npm run build          # compile dist/ (build output; not tracked in git)
npm run dev            # hot-reload daemon
npm run ui:dev         # hot-reload UI at localhost:5173 (proxies /api to :3940)
npm test               # 726 tests across 73 files
npm run typecheck
```

Architecture: hexagonal. `src/core/` knows about ports (interfaces), not adapters. `src/cli/nlm.ts` is the composition root — the only file that wires concrete implementations (`SqliteSessionStore`, `OllamaClient`, `Hono`, `StdioServerTransport`). Adapters in `src/core/adapters/` are one-way: they parse runtime-specific session formats into NLM's canonical shape; nothing in the runtime sees NLM.

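`dist/` is built by the `prepare` script, which npm runs on a local `npm install`, when installing from git, and before `npm publish` or `npm pack`. The built output is packed into the published tarball via the `files` field, so installs from the registry receive it prebuilt. `dist/` is not tracked in git.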

---

## License

Apache 2.0 — see [LICENSE](LICENSE).