https://github.com/lyr1cs/rein

Multi-source cross-validated memory MCP server for AI agents (Rust, jieba CJK, local-first, AGPL)
Host: GitHub
URL: https://github.com/lyr1cs/rein
Owner: lyr1cs
License: other
Created: 2026-03-24T10:11:47.000Z (4 months ago)
Default Branch: master
Last Pushed: 2026-06-10T08:31:28.000Z (about 1 month ago)
Last Synced: 2026-06-10T10:15:57.865Z (about 1 month ago)
Topics: agpl, ai-agents, chinese-nlp, claude-code, codex, embeddings, gemini, jieba, knowledge-graph, local-first, mcp, memory, rust, sqlite, vector-search
Language: Rust
Homepage: https://github.com/lyr1cs/rein/releases/latest
Size: 3.86 MB
Stars: 8
Watchers: 0
Forks: 0
Open Issues: 0
Metadata Files:
- Readme: README.md
- Changelog: CHANGELOG.md
- Contributing: CONTRIBUTING.md
- License: LICENSE
- Code of conduct: CODE_OF_CONDUCT.md
- Codeowners: .github/CODEOWNERS
- Security: SECURITY.md
- Agents: AGENTS.md

# rein

> Multi-source cross-validated memory for AI agents



  English | 中文



---

## English

rein is a self-adaptive memory system for AI coding agents. It stores, recalls, and manages memories across sessions with embedding-based semantic dedup, data-driven decay (Kaplan-Meier survival curves), and a fully closed self-learning loop that replaces fixed parameters with learned values.

**Current release: `v1.0.0`** (2026-05-31) — the 1.0 freeze. Built on the v0.38 schema-versioning foundation, v1.0 commits to a stable surface: the **baseline schema is frozen** behind the forward-migration framework (regression-tested), the **40 MCP tool arg-schemas are pinned**, the **MSRV is pinned to Rust 1.86** (CI-gated), config gains a **`config_version`** key + load-time downgrade guard, and a **`/v1/*` REST versioned alias** (header-token authed; the GUI session cookie stays `Path=/api`) ships alongside `/api`. The headline feature is **#A5 durable triple persistence** — the framework's first real forward migration (the `memory_triples` fact table at schema version 2), populated behind the default-off `[dedup].persist_triples` flag; the default recall/dedup path is bit-identical to v0.38. MSRV-honesty fix: replaced three `is_multiple_of` (Rust 1.87) uses with `%`. 1470 lib tests pass; clippy + fmt clean; agent-team audit (PII / correctness / hygiene / docs) clean.

**Recent releases (`v0.33` → `v0.35`)** — the eval-gate harness went from foundation to full: the dedup / admission / latency gates moved from `NoData` stubs to working scorers with committed baselines and 20-fixture corpora per gate (v0.33.0/.1, v0.35.0). Trust & Measurement Phase 3 landed its first slice — `repair_advice` + `judge_drift_alert_total` (v0.35.0); claude.ai remote-MCP polish added a sliding session cookie + metadata JSON (v0.35.0); and the bearer-auth migration progressed from a `rein doctor` WARN on the legacy loopback bool (v0.34.0) to its load-time removal (v0.35.0).

**Trust & Measurement Phase 2 (`v0.32.0`)** — the `eval::gates` module: `GateScorecard` / `Gate` trait / `compare_scorecards` (8-rule classification + paired McNemar non-inferiority), the recall gate over a hermetic 20-fixture corpus, the `rein-eval gate {baseline,run,compare,status}` CLI, `rein_trust_measurement` reading real scorecards, and `rein doctor check_eval_gates`.

**Earlier hardening arc (v0.30.x → v0.31.x, 8 releases, 30 codex audit rounds)** — built-in OAuth 2.0 provider for Claude Cowork / claude.ai / mobile remote MCP (DCR + PKCE S256 + RFC 9728/8414, `[server].auth` policy, SQLite-backed clients/grants/signing keys), JSON-RPC envelope on `/mcp` 4xx responses for claude.ai UI surfaceability, recall-launch warmup fragility fixes + 23-round codex audit, OAuth security A-H1 (kid strict match) / A-H2 (migration schema gate) / A-H3 (30s SHA-256 bearer cache + 60s debounced `mark_client_used`), recovery-path tail (`TantivyFts::open_existing` / symlink chain cycle detection / stale `.rebuilding` TTL recovery / `atomic_write_string` chown preservation), and build-path hygiene (`CARGO_ENCODED_RUSTFLAGS` with `--remap-path-prefix=$HOME=user`). Tagged in git (`cargo install --git ... --tag v0.30.0..v0.31.3`).

**Earlier (v0.28.x distribution arc)** — `[mcp_servers.]` Codex 0.129 compat (v0.28.15/16), rmcp 1.6 host-guard bridge (v0.28.13), `--locked install` footgun docs (v0.28.14), `rein_feedback` MCP `inputSchema` hotfix (v0.28.12), second-pass audit hardening on v0.28.7 (v0.28.8, 17 codex review rounds), `[ars.acceleration]` + runtime LLM judge + Trust & Measurement default-on (v0.28.6).

For the full GitHub-ready manual, see [docs/manual/README.md](docs/manual/README.md). Reference tables live under [docs/reference/](docs/reference/).

### Features

| Feature | Description |
|---------|-------------|
| **40 MCP tools** | core memory ops, knowledge graph, temporal recall, adaptive maintenance, ARS feedback (Cap A mirror, Cap B synthesis, Cap C archival summary), runtime LLM judge enqueue, ARS acceleration release-gate inspection, and Trust & Measurement reporting. All authored once via `#[op]` macro (v0.21+) and exposed through CLI / MCP / REST simultaneously. |
| **Unified operation registry** | One `#[op]` declaration drives CLI / MCP / REST surfaces (v0.21, A1). Inventory-based dispatch; zero hand-maintained lists. |
| **Neural Wiki GUI** | React + Tailwind web dashboard with Brain View, Adaptive Engine, Knowledge Graph, Timeline, and more |
| **Self-adaptive engine** | M1-M6: all learning loops closed — data drives fusion weights, decay curves, dedup thresholds, and tier boundaries |
| **Counterfactual alpha learning** | Replays past recalls to find optimal CC fusion weights — global, per-query-type, and per-cluster; bucket confidence accumulates across learning windows as a decay-weighted effective sample size (M2, v1.1) |
| **Per-cluster survival decay** | Kaplan-Meier curves replace fixed Ebbinghaus per-cluster; global prior bridges cold-start (M3) |
| **HDBSCAN clustering** | Pure Rust semantic clustering with sampling for large datasets; churn-gated recluster cadence keeps cluster-scoped learning alive between runs (M4, v1.1) |
| **Hot/Warm/Cold tiering** | Streaming quantile estimator + cold_archive migration (M5) |
| **Adaptive dedup thresholds** | Per-cluster P90 similarity thresholds (SemDeDup-inspired, M6/A1) |
| **Provenance-preserving dedup** | Merges preserve temporal anchors and unique details instead of hard-deleting |
| **Embedding semantic dedup** | Catches paraphrases Jaccard misses, runs in GC slow channel (zero hot-path cost) |
| **Temporal knowledge graph** | Memoir / Concept / ConceptLink with 9 relation types, revision history, episode nodes, temporal validity windows, BFS traversal (skips expired links) |
| **Autonomous retrieval routing** | Rule-based query classifier routes to 6 strategies: Episodic / Temporal / Preference / ExactKeyword / Semantic / Exploratory (zero LLM calls) |
| **Query expansion** | LLM rewrites query into 2-3 variants (Gemini Flash Lite / OMLX); multi-query results merged before fusion |
| **LLM reranker** | Optional Gemini / OMLX reordering of top-N candidates via score-envelope-preserving permutation (only the LLM's order is used — existing scores are redistributed, zero scale constants, v1.1); strong-signal bypass skips LLM when confidence is already high |
| **Maximal Marginal Relevance** | MMR post-rerank diversity pass — balances relevance and variety in final result set |
| **OMLX local embedding** | Optional local embedding backend via EmbedderKind enum dispatch (Google / OMLX) |
| **Dual-layer decay** | LTM / STM layers with KM survival curves (data-driven) or Ebbinghaus (cold-start) |
| **Dual-path search** | FTS (Tantivy BM25 → FTS5 fallback) + Vector (HNSW cache → API embed) → RRF/CC fusion |
| **Multi-source cross-validation** | 3 sources (local, hook-extracted, Supermemory) with confidence scoring |
| **RRF / CC fusion** | Reciprocal Rank Fusion or Convex Combination (Bruch 2023), with learned alpha weights |
| **Multi-factor admission** | A-MAC 2026 inspired: llm_conf + novelty + type_prior + recency scoring |
| **Semantic chunking** | Heading / paragraph / sentence splitting with metadata-prefixed embeddings |
| **Tantivy + FTS5 text search** | Tantivy BM25 side index with SQLite FTS5 fallback; CJK lexical handling is covered by jieba-rs plus character bigrams |
| **Hybrid CJK dedup tokenization** | jieba-rs word segmentation plus character bigrams for Chinese/Japanese/Korean lexical dedup |
| **Cluster-aware admission** | admission threshold and novelty scoring incorporate cluster strength, cluster novelty, and cold-start blending |
| **Evidence second-stage rerank** | low-confidence / single-source recall results can be boosted by matching evidence content |
| **Survival-driven STM promotion** | STM→LTM promotion uses cluster survival curves when available |
| **ANN fallback for unclustered dedup** | large `cluster_id=None` buckets generate vector-neighbor candidates before pairwise dedup |
| **Adaptive cluster decisions UI** | Adaptive page surfaces per-cluster dedup/admission/promotion decisions |
| **Supermemory v4 API** | Hybrid search via `api.supermemory.ai/v4/search` for cross-validation |
| **Zero local models** | No GPU required by default; optional OMLX local backend |
| **~2-5 MB footprint** | Single SQLite file with FTS5 + sqlite-vec |
| **gemini-embedding-001** | Default Google embedding model, 3072 dimensions; benchmark claims are documented as dated provider references |
| **20+ CLI commands** | Everything the MCP tools do, plus init, config, migrate, hooks, recent, gc, organize, upgrade |
| **Auto-configure** | `rein init` detects and configures 8 MCP clients automatically |
| **Remote access** | HTTP / SSE transport with bearer token authentication |

### Installation

Three install paths depending on your client:

| Client                                                              | Recommended path                                                      |
| ------------------------------------------------------------------- | --------------------------------------------------------------------- |
| Claude Desktop **Chat tab** on macOS Apple Silicon                  | [DXT one-click](#install-on-claude-desktop-dxt--macos-apple-silicon)  |
| Claude Code (CLI)                                                   | [Plugin marketplace](#install-via-claude-code-plugin-marketplace)     |
| Claude **Cowork** / claude.ai / mobile (cloud-routed remote MCP)    | [Remote MCP custom connector](#install-via-remote-mcp-cowork-claudeai-mobile) |
| Anything else, or you want full control                             | [From source](#from-source)                                           |

#### Install on Claude Desktop (DXT — macOS Apple Silicon)

rein ships as a Claude Desktop Extension (`.mcpb`). One-click install,
no Rust toolchain required.

1. Download `rein-v0.37.0.mcpb` from
   [the v0.37.0 release](https://github.com/lyr1cs/rein/releases/download/v0.37.0/rein-v0.37.0.mcpb).
2. Clear macOS quarantine (one-time, the build is unsigned):

   ```bash
   xattr -d com.apple.quarantine ~/Downloads/rein-v0.37.0.mcpb
   ```

3. Double-click the file. Claude Desktop opens its install dialog.
4. Fill in `Gemini API Key` (required); leave `Memory database path` and
   `Supermemory API Key` blank to use defaults.
5. Click Install. Claude Desktop spawns `rein serve` over stdio. ~40
   `rein_*` tools appear in your next chat.

For step-by-step including upgrade, uninstall, and troubleshooting, see
[docs/manual/02-installation.md → Claude Desktop](docs/manual/02-installation.md#claude-desktop-one-click-via-dxt).
For maintainers, the build pipeline is documented in
[docs/guides/dxt-build.md](docs/guides/dxt-build.md).

> **Other platforms** (Intel Mac / Linux / Windows): the DXT bundle is
> macOS Apple Silicon only. Use the
> [Claude Code plugin marketplace](#install-via-claude-code-plugin-marketplace)
> path or `cargo install` instead.

#### Install via Claude Code plugin marketplace

```text

/plugin marketplace add lyr1cs/rein

/plugin install rein@rein

```

The plugin registers the rein MCP server entry in Claude Code. You still
need the `rein` binary on your `PATH`:

```bash

# Latest master

cargo install --git https://github.com/lyr1cs/rein --locked rein

# Or pin to a specific release tag (recommended for reproducible installs)

cargo install --git https://github.com/lyr1cs/rein --tag v0.37.0 --locked rein

```

> **`--locked` is required.** Without it, `cargo install --git` ignores
> the committed `Cargo.lock` and re-resolves every transitive dependency
> to the latest semver-compatible version on crates.io, which can pull
> in C/SIMD code that requires a newer host toolchain than your machine
> ships. See
> [docs/manual/02-installation.md → Remote Install](docs/manual/02-installation.md#remote-install-pinned-tag).
>
> This installs the default non-GUI binary. For the built-in OAuth
> remote-connector flow, use the GUI-enabled source install in
> [Recipe E](docs/manual/02b-remote-mcp-deployment.md#recipe-e-built-in-oauth-provider-recommended-for-v030)
> so the owner approval pages are embedded.

Then set `GEMINI_API_KEY` in your shell environment or `~/.rein/config.toml`.

See [docs/manual/02-installation.md](docs/manual/02-installation.md) for
full configuration.

#### Install via Remote MCP (Cowork, claude.ai, mobile)

Claude Cowork (the agentic-work tab inside Claude Desktop), claude.ai
web, and the Claude mobile apps route MCP traffic through Anthropic's
cloud, **not** through your local stdio. They cannot use the DXT or
plugin marketplace paths above. They need a public HTTPS endpoint that
Anthropic's servers can reach.

rein already implements the **Streamable HTTP** transport (current MCP
standard) at `/mcp` on its built-in HTTP server. Three prerequisites
before exposing it:

1. Set `[server].auth = "public"` in `~/.rein/config.toml` (loopback bind,
   non-discoverable public hostname will forward through) **OR**
   set `REIN_HTTP_TOKEN` to a strong secret and have your reverse
   proxy inject `Authorization: Bearer …` upstream. The other policies
   are `"loopback_only"` (strict local), `"bearer_required"` (token
   required), and `"oauth"` (built-in OAuth provider, recommended for
   multi-user remote MCP).
2. Add the public hostname to `[server].allowed_hosts` in
   `~/.rein/config.toml`, e.g.,
   `allowed_hosts = ["rein.your-domain.com"]`. Without this, rein
   returns `403 Host header is not allowed` for any request whose
   `Host:` is not localhost / 127.0.0.1 / ::1, and your tunnel will
   appear broken even when up.

3. Start the listener:

   ```bash

   rein serve --sse        # listens on 127.0.0.1:8680/mcp

   ```

Without one of the auth postures in step 1, `rein serve --sse` exits
immediately with `REIN_HTTP_TOKEN must be set`. Without step 2 the
listener starts but rejects every tunnel request.
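
The Host check from step 2 can be pictured with a small sketch. This is an illustration under assumptions, not rein's actual code: `strip_port` and `host_allowed` are hypothetical names, and the port handling and case-insensitive matching are guesses at reasonable behavior. Only the loopback names and the `allowed_hosts` list come from the text above.

```rust
/// Drop an optional ":port" suffix and IPv6 brackets from a Host header value.
fn strip_port(host: &str) -> &str {
    // Bracketed IPv6 literal, e.g. "[::1]:8680" -> "::1".
    if let Some(rest) = host.strip_prefix('[') {
        return rest.split(']').next().unwrap_or(rest);
    }
    // Strip ":port" only when the remaining name has no colon (hostname or IPv4).
    match host.rsplit_once(':') {
        Some((name, port))
            if !name.contains(':') && port.chars().all(|c| c.is_ascii_digit()) =>
        {
            name
        }
        _ => host,
    }
}

/// Hypothetical allowlist check: loopback names always pass,
/// anything else must appear in `allowed_hosts`.
fn host_allowed(host_header: &str, allowed_hosts: &[&str]) -> bool {
    let host = strip_port(host_header.trim()).to_ascii_lowercase();
    matches!(host.as_str(), "localhost" | "127.0.0.1" | "::1")
        || allowed_hosts.iter().any(|h| h.eq_ignore_ascii_case(&host))
}
```

Under these assumptions, a tunnel request carrying `Host: rein.your-domain.com` is rejected until that hostname is listed, which matches the "tunnel appears broken even when up" symptom described in step 2.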

Expose the listener publicly via one of the following options (the
simpler ones need neither Tailscale nor a domain):

- **Cloudflare Quick Tunnel** — `cloudflared tunnel --url http://localhost:8680`.
  **No account, no domain.** Random `*.trycloudflare.com` URL,
  ephemeral. Lowest friction; works on most networks (use
  `--protocol http2` if your network blocks QUIC).
- **ngrok** — free account, no domain. URL ephemeral on free tier;
  reserved domains on paid. Tends to work on networks where Quick
  Tunnel can't.
- **Tailscale Funnel** — free account, no domain. Permanent
  `*.ts.net` URL. Most network-tolerant of the no-domain options.
- **Cloudflare Tunnel + own domain** — permanent URL under your
  domain, optional Cloudflare Access OIDC for real edge auth.
- **Caddy / nginx + Let's Encrypt** — self-hosted on your own VPS.

Then in Claude: **Customize → Connectors → "+" → Add custom connector**,
paste your URL (e.g., `https://rein.your-domain.com/mcp`), optionally
add OAuth credentials.

Step-by-step recipes for each tunnel option, the Claude UI configuration
flow, and authentication tradeoffs are in
[docs/manual/02b-remote-mcp-deployment.md](docs/manual/02b-remote-mcp-deployment.md).

> **Note**: custom remote connectors require a Pro / Max / Team /
> Enterprise Claude account. Free-plan users can use the local-stdio
> paths (DXT, plugin marketplace) but not Cowork.

#### From source

```bash

git clone https://github.com/lyr1cs/rein.git

cd rein

# Standard build (CLI + MCP server only)

cargo install --path crates/rein --locked

# Full build with Neural Wiki GUI (recommended)

cd crates/rein/gui && npm ci && npm run build && cd ../../..

cargo install --path crates/rein --locked --features gui

```

Or use the install script:

```bash

./scripts/install.sh

```

The install script builds the embedded GUI by default when `npm` is available. Set `REIN_INSTALL_GUI=0` for a CLI-only install.

#### Prerequisites

- Rust toolchain (1.86+, the MSRV pinned as of v1.0.0)

- Node.js + npm (for GUI builds or the default install script path)

- A Gemini API key (free tier: 1500 req/day)

#### GUI Service Management

```bash

# Start GUI server in background (listens on :8680)

rein gui on

# Stop GUI server

rein gui off

# Or run in foreground with MCP + GUI

rein serve --gui

# Open in browser

open http://localhost:8680

```

The GUI requires building with `--features gui`. Without it, the `gui` subcommand is available but serves no embedded assets.

### Quick Start

```bash

# 1. Set your API key

export GEMINI_API_KEY="your-key-here"

# 2. Auto-configure all detected MCP clients

rein init

# 3. Start the MCP server (usually done by your client)

rein serve

```

### CLI Reference

| Command | Description | Example |
|---------|-------------|---------|
| `serve` | Start MCP server (stdio, SSE, proxy, or GUI) | `rein serve [--compact] [--sse] [--proxy] [--gui]` |
| `store` | Store a memory | `rein store -t debug -c "OOM fix" -I high -k oom,memory` |
| `recall` | Search memories | `rein recall "connection pool" -t debug -l 5` |
| `forget` | Delete a memory by ID | `rein forget 01J...` |
| `update` | Update memory content | `rein update 01J... -c "new content" -I critical` |
| `topics` | List all topics | `rein topics` |
| `stats` | Show store statistics | `rein stats` |
| `health` | Check topic health | `rein health [topic]` |
| `consolidate` | Merge one or many topics into consolidated memories | `rein consolidate --pattern 'rmcp*' --merge-variants --dry-run` |
| `dedup` | Scan / remove duplicates, optionally across topic variants | `rein dedup [--dry-run] [--merge-variants]` |
| `cleanup` | One-click consolidation + dedup + adaptive refresh | `rein cleanup [topic] [--pattern 'rmcp*'] [--all] [--dry-run]` |
| `migrate` | Import from QMD / reindex | `rein migrate [--from-qmd path] [--reindex]` |
| `init` | Auto-configure MCP clients | `rein init [--dry-run]` |
| `config` | Show current configuration | `rein config` |
| `recent` | Show most recent memories | `rein recent [-l 20]` |
| `canonicals` | Show canonical memories | `rein canonicals [-l 20]` |
| `evidence` | Show evidence snapshots for a canonical memory | `rein evidence <ID> [-l 20]` |
| `dedup-log` | Show recent dedup decisions | `rein dedup-log [--canonical ID] [-l 20]` |
| `gc` | Garbage collect weak STM memories | `rein gc [--dry-run]` |
| `organize` | Auto-link related memories | `rein organize` |
| `dedup-concepts` | Merge duplicate concepts (case/separator variants) | `rein dedup-concepts` |
| `resummerize` | Run LLM-driven canonical recompression (v0.23) | `rein resummerize [--dry-run] [--canonical-id ID]` |
| `upgrade` | Upgrade old memories to knowledge graph | `rein upgrade [--topic X] [--dry-run]` |
| `hook session-start` | Optional Codex project context injection | `rein hook session-start` |
| `hook pre` | Codex deny-only PreToolUse guardrails | `rein hook pre` |
| `hook permission` | Codex deny-only PermissionRequest guardrails | `rein hook permission` |
| `hook post` | Extract facts from tool output | `rein hook post` |
| `hook compact` | Save context before compaction | `rein hook compact` |
| `hook prompt` | Optional Codex UserPromptSubmit memory context injection | `rein hook prompt` |
| `hook stop` | Full knowledge extraction on session end | `rein hook stop` |
| `worker memory` | Drain the async memory queue | `rein worker memory` |
| `worker dedup-queue` | Drain queued store-time dedup jobs | `rein worker dedup-queue` |
| `worker cleanup-queue` | Drain queued cleanup jobs | `rein worker cleanup-queue` |
| `dashboard` | Show service status, metrics, memory stats | `rein dashboard` |
| `gui on/off` | Start/stop GUI server in background | `rein gui on` |
| `proxy on/off` | Start/stop proxy in background | `rein proxy on` |

### How Cleanup Works (Provenance-Preserving)

rein's cleanup pipeline is **provenance-preserving**: it never hard-deletes information. The process has three stages:

1. **Consolidation** — Groups topic variants (e.g., `Docker Deployment` / `docker-deployment`) and merges all memories within each group into a single high-quality canonical memory. Source memories become evidence records in the `memory_evidence` table, preserving their original content, timestamps, and keywords.

2. **Dedup** — Scans for content-level duplicates within each topic group using lexical similarity (Jaccard + containment) and optionally embedding cosine similarity. Matches above threshold are merged into the winner; the loser's unique lines are appended with provenance markers (`[merged from  on ]`) and the loser is recorded as evidence.

3. **Adaptive refresh** — After consolidation and dedup, the adaptive engine (M1-M6) runs: HDBSCAN re-clusters, survival curves rebuild, tier boundaries update, and alpha/threshold learning processes new events.

Every merge decision is logged in the `dedup_decisions` append-only ledger with winner/loser IDs, scores, relation type, confidence, and operator. This is rein's equivalent of Git's reflog — you can always trace how a canonical memory was formed.

```bash

# Preview what cleanup would do (safe)

rein cleanup --all --dry-run

# Run cleanup on a specific topic

rein cleanup "docker-deployment"

# Full store cleanup

rein cleanup --all

# Run cleanup through the worker entrypoint

rein worker cleanup --all

```

`consolidate` keeps the old `rein consolidate <topic> -s "summary"` flow, but also supports:

- `--topics a,b,c` to batch a named topic set

- `--pattern 'rmcp*'` to batch by glob

- `--all` to process every topic

- `--merge-variants` to group case/space/hyphen variants such as `Docker Deployment` / `docker-deployment`

- omitting `--summary` to let rein auto-generate a consolidated memory, using the configured LLM when available and a local fallback otherwise

Batch consolidation fans out LLM synthesis asynchronously and in parallel, then commits SQLite writes sequentially. Cleanup actions also emit adaptive feedback and refresh M1-M6 state after the batch completes.

Cleanup is now scoped-first:

- `rein cleanup X`, `rein cleanup --topics ...`, or `rein cleanup --pattern ...` only deduplicates the selected groups

- destructive full-store cleanup requires `rein cleanup --all`

- `rein cleanup --dry-run` previews the scope

- background-style cleanup is handled by `rein worker cleanup ...`, `rein worker cleanup-queue`, and the cleanup queue worker

Store-time gray-zone dedup now also uses a dedicated async queue:

- hot-path store creates the new memory without blocking on remote LLM verdicts

- a `dedup-queue` worker later resolves gray-zone pairs with structured LLM verdicts

- you can drain it manually with `rein worker dedup-queue`

Recall is now evidence-aware:

- canonical memories are ranked with `support_count` and `source_diversity`

- recall output includes lightweight `evidence_preview`

- `rein evidence <ID>` or `/api/memories/:id` expands the full evidence list

- lower-confidence / lower-corroboration results can use evidence second-stage rerank

Adaptive learning now sees richer canonical signals:

- reranker learning uses support / diversity features

- alpha optimization uses KG / episode / support / diversity-aware candidate scoring

- Adaptive GUI surfaces cluster-level dedup / admission / promotion decisions

CJK dedup now uses a hybrid lexical strategy:

- `jieba-rs` adds Chinese word segmentation

- character bigrams remain enabled as a fallback for CJK and mixed technical text

- both token streams are combined before Jaccard / containment scoring
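
The hybrid strategy above can be sketched as follows. This is a minimal illustration, not rein's implementation: `hybrid_tokens`, `jaccard`, and `containment` are hypothetical names, the word list is passed in by the caller as a stand-in for jieba-rs output, and containment is assumed here to mean the overlap divided by the smaller token set.

```rust
use std::collections::HashSet;

/// Combine character bigrams with caller-supplied word tokens.
/// `words` stands in for the output of a segmenter such as jieba-rs (assumption).
fn hybrid_tokens(text: &str, words: &[&str]) -> HashSet<String> {
    let chars: Vec<char> = text.chars().filter(|c| !c.is_whitespace()).collect();
    let mut tokens: HashSet<String> = chars
        .windows(2)
        .map(|w| w.iter().collect::<String>())
        .collect();
    tokens.extend(words.iter().map(|w| w.to_string()));
    tokens
}

/// |A ∩ B| / |A ∪ B|, or 0.0 when both sets are empty.
fn jaccard(a: &HashSet<String>, b: &HashSet<String>) -> f64 {
    let union = a.union(b).count();
    if union == 0 {
        return 0.0;
    }
    a.intersection(b).count() as f64 / union as f64
}

/// |A ∩ B| / min(|A|, |B|): how much of the smaller set the larger one covers.
fn containment(a: &HashSet<String>, b: &HashSet<String>) -> f64 {
    let smaller = a.len().min(b.len());
    if smaller == 0 {
        return 0.0;
    }
    a.intersection(b).count() as f64 / smaller as f64
}
```

With bigrams alone, `内存泄漏修复` and `修复内存泄漏` share four of six distinct bigrams, so reordered phrases still score as near-duplicates even before any word tokens are added.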

More detailed docs:

- `docs/guides/canonical-read-model.md`

- `docs/guides/evidence-aware-recall.md`

- `docs/reference/adaptive-learning-signals.md`

Audit / handoff commit chain:

- `8b9e747`

- `b358100`

- `b861a4f`

- `1b0765a`

- `45de919`

- `d92170a`

- `d7200b3`

Operator inspection commands:

- `rein canonicals` shows canonical memories and their support/merge counters

- `rein evidence <ID>` shows absorbed evidence snapshots

- `rein dedup-log` shows the recent dedup ledger

### MCP Tools

When running as an MCP server (`rein serve`), rein exposes 40 production MCP
tools through the operation inventory. The authoritative list is maintained in
[docs/reference/mcp-tools.md](docs/reference/mcp-tools.md), grouped as:

- Core memory: store, recall, update, forget, recent, topics, canonicals,
  evidence, stats, and health.
- Maintenance: GC, dedup, concept dedup, organize, consolidate, cleanup,
  resummerize, and archive summary refresh.
- Knowledge graph and temporal: memoir tools, concept state, concept summary
  refresh, timeline, and concept history.
- Adaptive, session, ARS, and judge: feedback, adaptive status, session ingest,
  synthesis judge, and concept-summary judge.

#### Knowledge Graph Relation Types

`part_of`, `depends_on`, `related_to`, `contradicts`, `refines`, `alternative_to`, `caused_by`, `instance_of`, `superseded_by`

### LLM Extraction (v0.3)

rein uses an LLM (Gemini 3.1 Flash Lite or local models via OMLX) for structured memory extraction. The hook system automatically builds a knowledge graph from coding sessions.

**Architecture:**

- `hook_post` — local pattern extraction (crash safety net) + buffer to session file

- `hook_compact` — record compact context for async extraction

- `hook_stop` — queue full session distillation: memories + concepts + links + episode summary

- `hook_session_start` / `hook_prompt` — optional Codex additionalContext injection from Rein's working surfaces

- `hook_pre_tool_use` / `hook_permission_request` — deny-only Codex guardrails for obviously destructive shell commands

**Upgrade old memories:**

```bash

rein upgrade --dry-run    # preview

rein upgrade              # convert all old memories to knowledge graph

rein upgrade --topic debug  # convert specific topic only

```

**Configuration:**

```toml

[extract]

provider = "google"    # or "omlx" or "none"

[extract.google]

model = "gemini-3.1-flash-lite-preview"

max_input_chars = 0    # 0 = no truncation (1M token model)

[extract.omlx]

endpoint = "http://localhost:11434/v1"  # Ollama, LM Studio, vLLM, etc.

model = "default"

max_input_chars = 16000

```

### Self-Learning Quality System (v0.3.0)

rein automatically learns which memories are useful and which are noise, without human parameter tuning.

**How it works:**

1. LLM assigns `quality_confidence` (0-1) at extraction time — zero extra API cost

2. System tracks recall-then-access patterns to classify memories as "good" (used) or "bad" (recalled but unused)

3. Feature weights auto-adjust from data: utility, novelty, connectivity, recency

4. Adaptive admission threshold rises when recent quality is low, relaxes when high

5. GC prunes low-quality concepts whose source memories are recalled 5+ times but never accessed

**No manual tuning needed** — cold-starts with LLM judgment, data gradually takes over.

Based on: ICLR 2026 Admission Control, PropMem (Prosus), FActScore, MACLA Bayesian posteriors.

### Canonical-First Recall

rein now treats canonical memories as the default read model:

- store-time dedup tries to merge gray-zone writes into an existing canonical when evidence already exists

- admission/novelty scoring uses the current canonical view, not raw topic fragments

- working-set and always-on surfaces are refreshed from persisted canonical memories

- recall returns canonical memories by default, with `evidence_preview` for absorbed observations

- detail endpoints and GUI panels expand the full supporting evidence on demand

For API compatibility, `GET /api/memories/:id` returns the legacy top-level memory fields and also includes:

- `memory`: the canonical memory payload

- `evidence`: supporting evidence snapshots

### Temporal Knowledge Graph (v0.4.0)

rein now tracks **when** knowledge changes, not just what the current state is. Inspired by Zep/Graphiti 2025.

**Capabilities:**

- **Concept revision history** — every `refine_concept` auto-snapshots the old state before overwriting

- **Episode nodes** — each session creates an Episode linking to concepts and memories touched

- **Temporal link validity** — ConceptLink has `valid_from`/`valid_until` windows; expired links are skipped in BFS

- **Contradiction detection** — when a new definition differs significantly (sim < 0.3), old outgoing links are expired

- **Temporal recall** — `rein_recall` supports `from`/`to` date params for time-range filtering

- **Timeline view** — `rein_timeline` shows chronological events (episodes, concept changes, memory creation)

- **Concept history** — `rein_concept_history` shows how a concept's definition evolved over time
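
Temporal link validity can be illustrated with a breadth-first traversal that ignores links outside their validity window. The sketch below is hypothetical: the `Link` struct, integer concept IDs, Unix-second timestamps, and the `max_depth` parameter are assumptions for illustration. Only the `valid_from` / `valid_until` field names and the "skip expired links" behavior come from the list above.

```rust
use std::collections::{HashMap, HashSet, VecDeque};

/// Hypothetical link record; timestamps are Unix seconds (assumed representation).
struct Link {
    to: u32,
    valid_from: i64,
    valid_until: Option<i64>, // None = no expiry recorded
}

impl Link {
    /// A link is live when `now` falls inside [valid_from, valid_until).
    fn is_valid_at(&self, now: i64) -> bool {
        self.valid_from <= now && self.valid_until.map_or(true, |end| now < end)
    }
}

/// BFS from `start`, following only links valid at `now`, up to `max_depth` hops.
fn bfs_valid(
    graph: &HashMap<u32, Vec<Link>>,
    start: u32,
    now: i64,
    max_depth: usize,
) -> Vec<u32> {
    let mut seen = HashSet::from([start]);
    let mut queue = VecDeque::from([(start, 0usize)]);
    let mut order = Vec::new();
    while let Some((node, depth)) = queue.pop_front() {
        order.push(node);
        if depth == max_depth {
            continue;
        }
        for link in graph.get(&node).into_iter().flatten() {
            // Expired or not-yet-valid links are skipped entirely.
            if link.is_valid_at(now) && seen.insert(link.to) {
                queue.push_back((link.to, depth + 1));
            }
        }
    }
    order
}
```

Passing an earlier `now` replays the graph as it looked at that time, which is one way a query like "what did I know before March?" could be answered.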

**Example queries enabled:**

- "What changed last week?" → `rein_timeline --from 2026-03-19 --to 2026-03-26`

- "When did concept X change?" → `rein_concept_history --memoir rust --name ownership`

- "What did I know about Y before March?" → `rein_recall "Y" --to 2026-03-01`

### Autonomous Retrieval Routing (v0.4.0)

rein automatically classifies queries and routes them to the optimal search strategy — no configuration needed.

| Query Type | Example | Strategy |
|------------|---------|----------|
| **Temporal** | "when did the API change?" | BM25 bias (alpha=0.7), auto-inject time bounds |
| **ExactKeyword** | "SqliteStore", "fn recall" | Heavy BM25 (alpha=0.85) |
| **Semantic** | "memory management strategies" | Vector dominant (alpha=0.3) |
| **Exploratory** | "what do I know about rein?" | Balanced (alpha=0.5), 2x result limit |

Classification is rule-based (zero LLM calls, sub-microsecond). MCP responses include `[route: type]` prefix for transparency. TA-Mem 2026 and MemR3 2025 are tracked as related memory-retrieval background, not as implemented retrieval controllers.
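
To make the alpha values in the table concrete, here is a minimal sketch of convex-combination (CC) fusion. It assumes alpha is the weight on the BM25 side (consistent with "Heavy BM25 (alpha=0.85)") and that both score lists are min-max normalized first; `min_max` and `cc_fuse` are hypothetical names, and rein's actual normalization may differ.

```rust
/// Min-max normalize scores into [0, 1]; a constant or empty list maps to zeros.
fn min_max(scores: &[f64]) -> Vec<f64> {
    let lo = scores.iter().cloned().fold(f64::INFINITY, f64::min);
    let hi = scores.iter().cloned().fold(f64::NEG_INFINITY, f64::max);
    if hi - lo <= f64::EPSILON {
        return vec![0.0; scores.len()];
    }
    scores.iter().map(|s| (s - lo) / (hi - lo)).collect()
}

/// fused = alpha * bm25 + (1 - alpha) * vector, per candidate.
/// Both slices must score the same candidates in the same order.
fn cc_fuse(bm25: &[f64], vector: &[f64], alpha: f64) -> Vec<f64> {
    assert_eq!(bm25.len(), vector.len(), "score lists must align");
    let (b, v) = (min_max(bm25), min_max(vector));
    b.iter()
        .zip(&v)
        .map(|(b, v)| alpha * b + (1.0 - alpha) * v)
        .collect()
}
```

For BM25 scores `[10.0, 5.0, 0.0]` and vector scores `[0.2, 0.9, 0.4]`, the first candidate ranks highest at alpha=0.85 while the second ranks highest at alpha=0.3, which is the effect of routing a query to a keyword-heavy or vector-dominant strategy.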

### Adaptive Engine (v0.6.0+)

rein's core philosophy is to minimize fixed parameters through data-driven adaptation. Bootstrap defaults still exist for cold start and safety, but the adaptive engine moves fusion, decay, tiering, and threshold behavior toward observed feedback in the slow channel.

**Pipeline: M4 → A1 → M3 → M5 → M2 → M6**

| Module | What it learns | How |
|--------|---------------|-----|
| **M1** Event Sourcing | *(foundation)* | Append-only feedback log + per-consumer offsets |
| **M2** Alpha Optimizer | CC fusion weights — global, per-query-type, **and per-cluster** | Counterfactual replay; hierarchical Bayesian shrinkage; `apply_max_step` damping |
| **M3** Survival Analysis | Per-cluster decay curves + **global cold-start prior** | Kaplan-Meier estimator; global prior (capped at blend-zone) for new clusters |
| **M4** HDBSCAN Clustering | Semantic neighborhoods | Pure Rust HDBSCAN (dendrogram → condensed tree → EOMBST); centroid reassignment on recluster |
| **M5** Tiering | Hot/Warm/Cold boundaries | Streaming quantile estimator (P25/P75) + cold_archive migration |
| **M6** Threshold Explorer | Dedup thresholds | Randomized threshold exploration + comparative outcome rates + co-recall signal |
| **A1** Per-cluster dedup thresholds | Similarity cutoffs per cluster | P90 of intra-cluster pairwise similarity; full pipeline (store, batch, vec dedup) |

**Also:**

- **Embedding-based semantic dedup** in GC slow channel (catches paraphrases Jaccard misses)

- **Provenance-preserving merge** — temporal anchors and unique details never lost

- **Snapshot CAS** — adaptive state saved with read-merge-write on version conflict
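The Kaplan-Meier estimator named for M3 can be illustrated with a short Python sketch. This is the textbook product-limit estimator, not rein's Rust implementation. Treating "a memory stopped being useful" as the event and "still in use when observed" as censoring is an assumption made for illustration, and `kaplan_meier` is a hypothetical name.

```python
def kaplan_meier(durations, observed):
    """Product-limit survival estimate.

    durations: time until the event or until observation stopped.
    observed:  True if the event happened, False if the sample is censored.
    Returns [(t, S(t))] for each distinct time at which an event occurred.
    """
    # Count events and censored samples at each distinct time.
    counts = {}
    for t, happened in zip(durations, observed):
        events, censored = counts.get(t, (0, 0))
        counts[t] = (events + 1, censored) if happened else (events, censored + 1)

    at_risk = len(durations)
    survival = 1.0
    curve = []
    for t in sorted(counts):
        events, censored = counts[t]
        if events:
            # S(t) = S(t-) * (1 - d_t / n_t)
            survival *= 1.0 - events / at_risk
            curve.append((t, survival))
        # Everyone seen at time t leaves the risk set afterwards.
        at_risk -= events + censored
    return curve


# Five memories in one cluster, durations in days; two are censored.
curve = kaplan_meier([1, 2, 2, 3, 4], [True, True, False, True, False])
```

Censored samples shrink the risk set without lowering the curve, which is why the estimator can use memories that are still alive instead of discarding them.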

### Recent releases

The v0.21 → v0.38.0 arc rebuilt rein around six axes: a unified operation registry, an adaptive read-side synthesis (ARS) stack with feedback-driven gates, secure remote MCP deployment for Claude clients, a reproducible eval-gate harness for Trust & Measurement, an algorithm + performance pass on the recall and dedup hot paths, and a schema-versioning foundation for safe forward migrations.

| Version | Theme | Highlights |

|---|---|---|

| **v1.0.0** (2026-05-31) | The 1.0 freeze + #A5 triple persistence | Stable-surface freeze: baseline schema frozen behind the migration framework (`baseline_schema_is_frozen`), 40 MCP tool arg-schemas pinned (`mcp_tool_arg_surface_is_frozen`), MSRV pinned to Rust 1.86 (CI `msrv` job; three `is_multiple_of`/1.87 uses replaced with `%`), `config_version` key + load-time downgrade guard, and a `/v1/*` REST versioned alias (header-token authed; GUI session cookie stays `Path=/api`). Headline feature: **#A5 durable triple persistence** — first real forward migration (`memory_triples` at schema v2) behind the default-off `[dedup].persist_triples` flag; default path bit-identical. 96 config fields documented. **1470 lib tests / 0 fail; clippy + fmt clean; agent-team audit clean (PII / correctness / hygiene / docs).** |

| **v0.38.0** (2026-05-30) | Schema-versioning foundation | A single global `PRAGMA user_version` counter + a fail-loud, atomic forward-migration framework (`BASELINE_SCHEMA_VERSION` + `Migration{version,name,up}` + ascending `MIGRATIONS`, empty at landing) replace the additive-only probe-then-ALTER bring-up — unlocking rename/type-change/drop migrations (prerequisite for triple persistence, the fact-layer refactor, and a v1.0 schema freeze). Every legacy `ADD COLUMN` reachable from bring-up fails loud on real errors (tolerating only the benign duplicate-column race); downgrade guard + resurrection-safety gating + in-lock double-checked migration apply. No new MCP tools; default recall/algorithm path bit-identical. **1710 tests / 0 fail / 7 ignored.** |

| **v0.37.0** (2026-05-30) | Algorithm + hooks | **#A18** explicit negative feedback: `rein_feedback` `helpful: false` trains the M2 alpha optimizer + multi-dimensional shadow weights as a parameter-free symmetric negative sample (accessed memories ranked up, explicitly-unhelpful ones ranked down); recall held non-inferior on the eval gate. **Hooks** ingestion de-dup: identical content surfaced by multiple agents / hook sources is collapsed at the queue. **#C2** dedup-threshold sweep re-confirmed; production threshold held pending live-traffic calibration. **1702 tests / 0 fail / 7 ignored.** |

| **v0.36.0** (2026-05-29) | Algorithm + performance pass | **#P1** recall strong-signal fast-path: skip the KG + Supermemory fallback channels when a dominant BM25 hit survives every drop-filter and the local index alone satisfies the requested limit — deterministic, recall held non-inferior on the eval gate. **#C2** `rein-eval gate sweep --gate dedup`: data-driven precision/recall/F1 threshold sweep reporting a merge-safe (precision = 1.0) optimum. **#C3** admission gray-zone corpus 6 → 10 fixtures, ≥ 0.07 edge margin, drift-guard test. **#ablation** `rein-eval ablate`: multi-arm bootstrap-CI ablation with paired deltas + significance, reproducible (seeded PRNG). **1692 tests / 0 fail / 7 ignored.** |

| **v0.33 → v0.35** (2026-05-28) | Eval-gate harness: foundation → full | The dedup / admission / latency gates moved from `NoData` stubs to working scorers with committed baselines + 20-fixture corpora per gate (v0.33.0/.1, v0.35.0). Trust & Measurement Phase 3 first slice: `repair_advice` + `judge_drift_alert_total` (v0.35.0). claude.ai remote-MCP polish: sliding session cookie + metadata JSON (v0.35.0). Bearer-auth migration: `rein doctor` WARN on the legacy loopback bool (v0.34.0) → load-time removal (v0.35.0). |

| **v0.32.0** (2026-05-18) | Trust & Measurement Phase 2 — eval-gate harness | New `eval::gates` module: `GateScorecard` / `Gate` trait / `compare_scorecards` 8-rule pipeline (presence / schema / identity / kind / freshness / stub / duplicate-id / strict id-set + paired McNemar non-inferiority). Recall gate ships full-impl over a 20-fixture corpus seeded into hermetic in-memory `SqliteStore` per fixture (deterministic, no live LLM/embedding). The dedup / admission / latency gates ship as `NoData` stubs here (taken to full in v0.33–v0.35). New `rein-eval gate {baseline,run,compare,status}` CLI subcommand; `rein_trust_measurement` reads real scorecards; `rein doctor` surfaces stale baselines / Bail / mis-wired / corrupt as WARN. Reference baseline at `docs/eval-baselines/recall.json` (score = 1.000). **1647 tests / 0 fail / 5 ignored.** |

| **v0.31.4** (2026-05-17) | Attribution metadata | All maintainer attribution (Cargo authors, LICENSE / README / CONTRIBUTING copyright, DXT / plugin / marketplace manifests, git config `user.name`) switched to the maintainer's GitHub handle. Metadata-only patch, no source / behavior / test change. |

| **v0.31.0 → v0.31.3** (2026-05-16/17) | OAuth security + recovery-paths + build-path hygiene | 4 releases. **v0.31.0**: OAuth A-H1 JWT `kid` strict match (closes forgery-via-rotation overlap), A-H2 `migrate_oauth_tables` schema-version gate + crash-recovery NULL cleanup, A-H3 `verify_bearer` 30s SHA-256-keyed cache + 60s-debounced `mark_client_used` + pool-backed fast path + 5-site cache invalidation. **v0.31.1**: D1 atomic gen-recheck-inside-cache-lock + D2 bearer cache key DB-identity scoping. **v0.31.2**: D1 `TantivyFts::open_existing` non-creating recall variant + D3 symlink chain cycle detection (256-hop ceiling) + D4 stale Tantivy `.rebuilding` marker TTL recovery + D5 `atomic_write_string` chown preservation. **v0.31.3**: `scripts/build-dxt.sh` `CARGO_ENCODED_RUSTFLAGS` with `--remap-path-prefix=$HOME=user` so `file!()` macro expansions no longer bake builder paths into release binaries. |

| **v0.30.0 → v0.30.2** (2026-05-10/12) | Built-in OAuth provider + recall-launch warmup | **v0.30.0**: explicit `[server].auth` policy (`loopback_only` / `bearer_required` / `oauth` / `public`) and a single-user OAuth provider for Claude Cowork / claude.ai / mobile remote MCP — Authorization Server metadata, Dynamic Client Registration, Authorization Code + PKCE S256, refresh rotation, revocation, SQLite-backed clients/grants/signing keys, owner approval GUI, Connectors management UI, OAuth GC, doctor checks. Validated end-to-end via Cloudflare Quick Tunnel. **v0.30.1**: every `/mcp` 4xx ships as a JSON-RPC 2.0 error envelope so claude.ai surfaces real rejection reasons instead of opaque "An unknown error occurred"; `rein doctor` `oauth_provider` WARN on `auth_policy = "oauth"` + clients registered + no active grants. **v0.30.2**: P0 recall launch-failure fix — warmup B1 (sync Tantivy rebuild blocking recall) / B2 (destructive `remove_dir_all`) / B4 (cold-start unconditional rebuild) / B5 (`.rebuilding` lifecycle without recovery) chain producing user-visible "rein won't start" despite process-level boot succeeding. 23 codex audit rounds; 5 corner-case deferrals to v0.31.x. |

| **v0.28.8** (2026-05-04) | v0.28.7 follow-up audit | **17 codex review rounds** (R1–R17) reaching 2-consecutive-clean saturation. **15 P2 + 1 P3** closed; 0 P1 throughout. Headline: **M-8 cluster-bucket alignment** — learn-time bucket resolution now prefers memory-id-remap against current `memory_clusters` (R13 fix for the M4-then-M2 pipeline order that invalidated `cluster_version_at_recall` for every event in the common path). **L6 fallback preservation** — `learned_shadow_fusion` LRU restricted to cluster-scoped buckets (`{query_type}:{cluster_id}` shape via `is_cluster_scoped_bucket` predicate), so the `global` + per-query-type fallback chain stays intact under high cardinality. **`ars_parameter_policy` schema robustness** — schema_version peek before typed deserialize (R8 fix for `Corrupt` mis-classification on future schemas), CAS predicate uses schema-aware COALESCE default (R8), `>` rather than `!=` for future-schema preservation (R15), and `repair_corrupt_parameter_policy` wraps load+delete in `BEGIN IMMEDIATE` (R10 race fix). **M-1 persistence-side** — 4 new per-surface `ars_effective_scalars` keys (`judge_sample_rate_{cold_start,warm}_{synthesis,concept_summary}`) with one-time legacy fallback so the per-surface split lands without breaking downgrade compat. **M-5 / M-6** rollback static threshold anchoring + outer simplex↔legacy blend by `runtime_adoption_weight`. Plus L1 `sanitize_bootstrap_priors` cap, L4 auth-policy regression locks for `/api/trust-measurement` + `/api/ars-acceleration-gate`, L5 doctor recovery, L7 release-gate test coverage. **1462 tests / 0 fail / 3 ignored / 0 clippy / 0 fmt.** Default-OFF behavior bit-identical to v0.28.7. |

| **v0.28.7** (2026-05-02) | v0.28 audit hardening | Closes 4 HIGH + 4 MED items from the 2026-05-02 v0.28 audit. **H0** reverts `[ars.llm_judge]` + `[ars.llm_judge.nightly_cron]` defaults from `true` (v0.28.6) back to `false` in code AND embedded `default.toml` per the v0.28 charter Non-Goal "Do not make LLM judge default-on" — runtime LLM judge stays opt-in until v0.29 surface-policy gating. `[ars.acceleration]` stays `true`. **H1** `bootstrap_priors_from_replay` replay consumer guarded against the placeholder `signal_hint` producer (real producer deferred to v0.29) — consumer never advances against an empty source. **H2** `apply_local_fixes` performs a drift-triggered canary→shadow rollback: when `judge_calibration_state.judge_drift_alert*` is positive while the policy is in Canary, doctor refreshes the row to flip back to Shadow with `runtime_adoption_weight = 0`. **H3** shadow `route_context` buckets isolated in a separate `CONCEPT_SUMMARY_BY_CLUSTER_SHADOW_CAP = 4096` LRU; recall via the shadow path cannot evict production cache entries. **M-1** `JudgeSurface` threaded through 5 helpers + handlers for per-surface drift visibility (Synthesis vs ConceptSummary). **M-2** `bootstrap_priors_from_replay` watermark cutoff uses state watermark (D3 replay-idempotence). **M-9** `DrainStats` per-reason counters + `tracing::warn` on dropped cap + doctor `judge_call_ledger` saturation check. **M-4** docs-only. 1419 tests / 0 fail / 3 ignored / 3 `codex review --uncommitted` rounds. M-1 persistence-side residual + LOW/NIT items deferred to v0.29. |

| **v0.28.6** (2026-05-02) | ARS default-on + Trust & Measurement | Enables `[ars.acceleration]`, runtime LLM judge, and nightly calibration by default while keeping runtime adoption fail-closed behind `ars_parameter_policy`; adds scoped adoption weights for recall fusion/query/cluster and scalar surfaces, keeps SignalHint feedback active outside shadow mode, exposes scoped weights in release-gate output, and adds `rein_trust_measurement` / `rein trust-measurement` / `/api/trust-measurement`. |

| **v0.28.5** (2026-05-01) | Gradual ARS runtime adoption | Adds `runtime_adoption_weight` to `ars_parameter_policy`, moves the adoption weight by at most 0.05 per durable snapshot, and gates recall fusion, synthesis/concept gates, judge sample rates, LLM feedback decay, and SignalHint-derived useful-rate priors through that weight. |

| **v0.28.4** (2026-05-01) | ARS acceleration full pass | Wires SignalHint/bootstrap priors into useful-rate formulas, persists smoothed dynamic scalars, splits judge drift by surface, makes judge input caps configurable, folds Cap A GUI feedback into real recall-context buckets while preserving synthetic judge alignment, adds a read-only release/eval gate, and adds shadow GP+EI fusion proposals. |

| **v0.28.3** (2026-05-01) | ARS dynamic scalar expansion | Extends policy-gated dynamic adoption beyond recall fusion: synthesis/concept cold-start and useful-rate thresholds can move from static values toward calibrated feedback, judge sample rates adapt under the same policy gate, shadow judge jobs carry deterministic `signal_hint` evidence, and shadow replay evaluates blended simplex candidates instead of one-hot-only weights. |

| **v0.28.2** (2026-05-01) | ARS dynamic parameter policy | Adds `ars_parameter_policy` metadata activation, trust-weighted static-to-learned fusion adoption, κ/drift-gated LLM judge `weight_decay_rate`, `/api/adaptive` policy status, and `rein doctor` policy health checks. |

| **v0.28.1** (2026-04-30) | ARS recall canary activation | Persists replay-learned global/query-type/cluster six-dimensional fusion weights in `AdaptiveState.learned_shadow_fusion`. Defaults remain `enabled = false`, `shadow_only = true`; setting `enabled = true` plus `shadow_only = false` lets recall rescore live-filtered candidates with learned BM25/vector/KG/episode/support/diversity weights. |

| **v0.28.0** (2026-04-30) | ARS acceleration groundwork | Default-off, shadow-first acceleration controller. `[ars.acceleration].enabled = false` by default; `/api/adaptive` exposes `ars_acceleration.shadow_fusion_replay` with bounded `enabled`, `shadow_only`, `status`, `replay_limit`, `eligible_samples`, `min_samples`, `global`, `by_query_type`, and `by_cluster` preview fields. Production recall scoring and ARS behavior were unchanged in this release. |

| **v0.27.6** (2026-04-30) | Codex hook parity + deployment hardening | Adds Codex `session-start`, `pre`, and `permission` hook commands alongside existing `post`, `compact`, `prompt`, and `stop`; emits official `hookSpecificOutput.additionalContext` for opted-in session/prompt context; applies conservative deny-only shell guardrails; teaches `rein init` and `rein doctor` to configure and validate all six Codex events. Deployed to Mac mini with launchd `zsh -l -c` wrappers and Homebrew Rust toolchain. |

| **v0.27.5** (2026-04-29) | R10-residual cleanup | Cold archive too-large backoff (`last_too_large_at` + claim_batch ORDER BY); Cap A 4096-bucket LRU eviction; cron `cron_claims` pre-LLM dedup with claim_token ownership + 5-min stale takeover + post-claim TOCTOU re-check + post-emit-crash reaper. **10 codex review rounds saturated (R6 + R10 fully clean).** 1035 lib tests / 0 clippy / 0 fmt. |

| **v0.27.4** (2026-04-29) | audit-team remediation | 5-agent disjoint-slice fan-out closed 1 CRIT + 8 HIGH + 9 MED + 5 LOW from a v0.27.3 audit, then 10 codex rounds drove P1 to 0. Headline: **C1** `[server,proxy].allow_unauthenticated_loopback` default flipped `true → false`; **E2** M5 strip post-COMMIT side-index discipline; **D1+D2** SHA-256-prefix synthetic `cluster_id` for Cap A bucket alignment. 1265 tests. |

| **v0.27.3** (2026-04-28) | full-audit remediation | Closes the v0.27.0/.1/.2 implementation audit. Released to GitHub. |

| **v0.27.2** (2026-04-27) | judge ledger / cache reaper | `judge_call_ledger` daily-cap reservation shared across runtime + cron (R9-K1); judge cache reaper; `judge_model_override` extractor swap; doctor judge checks. |

| **v0.27.1** (2026-04-27) | E direction — runtime LLM judge | Opt-in: `[ars.llm_judge].enabled` defaults to `false` and must be set to `true`. Hooks at synthesis (Cap B) and concept-summary (Cap A) mint time so MCP-only deployments still produce adaptive feedback without GUI dwell/click. **7-invariant judge contract J1-J7** (stamp-time payload, atomic `reserve_call`, worker-pull, cache rehydration). New MCP tools `rein_judge_synthesis` + `rein_judge_concept_summary`. `[llm]` config inheritance with `provider = "inherit"` sentinel. |

| **v0.27.0** (2026-04-26) | Cap A mirror feedback + fact-layer dedup | `rein_feedback_concept_summary` mirrors Cap B's loop onto concept living-summary. Triple extraction + N-memory merge + temporal supersede direction. |

| **v0.26.2** (2026-04-26) | 32-bug security + correctness hotfix | 8 HIGH + 8 MEDIUM from a user-driven Codex audit on v0.26.1, plus 16 audit-cycle additions across 11 follow-up codex review rounds. Auth default-deny via `http_request_needs_auth(method, path, gui_enabled)`. Recall correctness with status-aware SQL filters + canonical-first preservation of superseded rows. `apply_evolution` side-index discipline. Backend↔GUI synthesis bucket round-trip. `update()` archival lifecycle clears archival_summary cols on semantic content change. 1002 tests. |

| **v0.26.1** (2026-04-25) | D direction wiring fix + cold_archive eval | v0.26.0 hardcoded `query_type = "Semantic"` made the per-cluster gate dead code for 5 of 6 query types; fixed by routing real `QueryType::synthesis_bucket_label()` through MCP/CLI/REST. `[ars].synthesis_cold_start_n` config (default 10). `rein-eval cold_archive {baseline,run,compare}` subcommand. |

| **v0.26.0** (2026-04-25) | ARS Cap C + D direction full vertical | Cap C cold-tier archival summary (`rein_archive_summary_refresh` MCP tool, slow-channel worker with 5-way CAS + 3-invariant lossless contract). D direction event-sourced loop: `SynthesisInteraction` event → `synthesis_feedback` M1 consumer → per-query adaptive synthesis-decision gate (`decide_synthesize`) surfaced via REST/MCP/GUI. |

| **v0.25.x** (2026-04-24/25) | ARS Cap B + Synthesis Lab | Opt-in recall-time LLM narrative synthesis: `rein_recall` extended with `synthesize=true` (no new MCP tool added). `rein-eval synthesis` McNemar harness. Synthesis Lab GUI page (`/synthesis-lab`) with editable evidence + dwell/click telemetry. v0.25.2 hybrid hit-checker (Snowball Porter2 stem + Gemini cosine fallback). v0.25.3 LLM-judged hit checker (`REIN_EVAL_JUDGE=llm`). |

| **v0.24.0** (2026-04-24) | ARS Cap A — concept living-summary | Per-concept rolling LLM summary refreshed via L3 adaptive policy (revision_p75 + age_p50) + L4 concurrent CAS. Cross-cutting peek+commit refactor across 5 consumer offsets. New MCP tools `rein_concept_state` + `rein_concept_summary_refresh`. 819 tests. |

| **v0.23.0** (2026-04-23) | Resummerize + 7-invariant Lossless Compression Contract | LLM-driven canonical recompression at the 10 KB `MergeInto` cap (replaces v0.21 keep-tail truncation). Atomic `apply_resummerize` with 5-way CAS + 3-strike exhaustion fuse + 5-minute stale-claim takeover. Paired `rein-eval` McNemar non-inferiority test. 750 tests. |

| **v0.22.0** (2026-04-22) | KG pool + service wiring + try_get fast-path | 675 tests / 7 codex audit rounds. |

| **v0.21.0** (2026-04-20) | A1 Operation Registry | `#[op]` proc-macro: each operation authored **once** in source, dispatched via `inventory` to thin CLI / MCP / REST adapters. Eliminated three parallel hand-maintained registries. 625 tests. |

v0.32.0 keeps the v0.28.7 / v0.28.8 default surface unchanged: only `[ars.acceleration]` ships default-on (still fail-closed — learned parameters do not affect runtime until a healthy `ars_parameter_policy` promotes a canary with positive scoped adoption weights). The runtime LLM judge (`[ars.llm_judge]`) and its `nightly_cron` remain default-off per the v0.28 charter Non-Goal — operators must explicitly opt in (incurs LLM API spend). ARS content-generation features (`[ars].concept_summary_enabled`, `recall_synthesis_enabled`, `cold_archive_enabled`) and `[resummerize].enabled` remain operator-controlled. The new `[server].auth` policy added in v0.30.0 stays the explicit gate for HTTP/SSE exposure — `loopback_only` is the default, and remote-MCP deployments require an explicit choice between `bearer_required`, `oauth`, or `public`.

### Architecture Diagrams

#### Memory Storage Flow

```mermaid

flowchart TD

    A[Input text / tool output] --> B[hook_post or rein_store]

    B --> C[LLM Extraction\nGemini Flash Lite / OMLX]

    C -->|LLM unavailable| D[Rule-based fallback\ntopic · summary · keywords · importance]

    C --> D2[postprocess\ndate detection · preference tagging]

    D --> D2

    D2 --> E{store_with_dedup\nBEGIN IMMEDIATE}

    E -->|sim ≥ cluster_threshold A1| F[Provenance-preserving merge\nloser → evidence record]

    E -->|sim in gray-zone| G[LLM dedup verdict\nasync dedup-queue]

    E -->|new memory| H[INSERT memories]

    H --> I[auto_link\nbidirectional related_ids]

    I --> J[evolve\nknowledge evolution]

    J --> K[HNSW + Tantivy index update\nfire-and-forget]

    K --> L[needs_vec_dedup flag\nfor GC slow-channel embedding dedup]

    F --> M[dedup_decisions ledger]

    G --> M

```

#### Recall Pipeline

```mermaid

flowchart TD

    Q[Query] --> CL[Query Classifier\n6 strategies · rule-based · 0 LLM calls]

    CL -->|strategy + alpha| EX[Query Expansion\nGemini / OMLX → 2-3 variants]

    EX --> P1[Channel 1: Tantivy BM25\nlocal · <1ms]

    EX --> P2[Channel 2: HNSW vector\nlocal ~5ms / Gemini API ~255ms]

    EX --> P3[Channel 3: KG FTS + BFS\nconcept land-and-expand]

    P1 --> FU[RRF / CC Fusion\nlearned alpha M2]

    P2 --> FU

    P3 --> FU

    FU --> TF[M5 Tier Filter\nCold excluded for non-Exploratory]

    TF --> SW[Strength Weighting\nper-cluster KM curve M3 → global prior → Ebbinghaus]

    SW --> RF[Multi-feature Rerank\n8 features · learned weights]

    RF -->|optional| LR[LLM Reranker\nGemini / OMLX · strong-signal bypass]

    RF --> CC[Canonical-first collapse\nevidence_preview attached]

    LR --> CC

    CC --> CV[Cross-validate\nSupermemory + auto-memory files]

    CV --> RES[Final results\nconfidence 95%/85%/62% by source count]

```

#### Compression (PreCompact Hook)

```mermaid

flowchart TD

    T[PreCompact trigger\nContext window approaching limit] --> HC[hook_compact\nrecord compact context]

    HC --> SB[Read session buffer\naccumulated tool outputs + turns]

    SB --> LE[LLM extraction\nmemories + concepts + links]

    LE --> WQ[Async memory queue\n~/.rein/memory_queue_.jsonl]

    WQ --> BW[Background worker\nrein worker memory]

    BW --> SD[store_with_dedup\nper-memory dedup + merge]

    SD --> EP[Episode node created\nsession → concept_ids + memory_ids]

    EP --> TL[ConceptLink temporal validity updated\nvalid_from / valid_until]

    TL --> CL[Session buffer cleared\nready for next context window]

    style T fill:#f96,color:#000

    style EP fill:#6af,color:#000

```

---

### Configuration

rein loads configuration with the following priority (highest wins):

1. Environment variables

2. TOML config file (`$REIN_CONFIG` or `~/.config/rein/config.toml`)

3. Compiled-in defaults
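The precedence order above can be sketched in a few lines of Python. This is an illustration under stated assumptions, not rein's loader: `resolve`, `config_path`, and the `ENV_OVERRIDES` mapping are hypothetical, the file layer is passed in as an already-parsed dictionary, and only the two SSE settings are modelled (their defaults are the ones documented in this README).

```python
import copy
from pathlib import Path

# Compiled-in defaults (lowest priority); values as documented in this README.
DEFAULTS = {"server": {"sse_bind": "127.0.0.1", "sse_port": 8680}}

# Hypothetical mapping from environment variable to (section, key, parser).
ENV_OVERRIDES = {
    "REIN_SSE_BIND": ("server", "sse_bind", str),
    "REIN_SSE_PORT": ("server", "sse_port", int),
}


def config_path(env):
    """Use $REIN_CONFIG when set, else ~/.config/rein/config.toml."""
    override = env.get("REIN_CONFIG")
    if override:
        return Path(override)
    return Path.home() / ".config" / "rein" / "config.toml"


def resolve(env, file_cfg):
    """Layer defaults < TOML file < environment variables."""
    cfg = copy.deepcopy(DEFAULTS)
    for section, values in file_cfg.items():
        cfg.setdefault(section, {}).update(values)
    for var, (section, key, parse) in ENV_OVERRIDES.items():
        if var in env:
            cfg[section][key] = parse(env[var])
    return cfg


cfg = resolve(
    env={"REIN_SSE_PORT": "9000"},
    file_cfg={"server": {"sse_bind": "0.0.0.0", "sse_port": 8700}},
)
# The port comes from the environment, the bind address from the file.
```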

#### Environment Variables

| Variable | Description |
|----------|-------------|
| `GEMINI_API_KEY` | Google Gemini API key for embeddings |
| `SUPERMEMORY_CC_API_KEY` | Supermemory API key for cross-validation |
| `REIN_HTTP_TOKEN` | Bearer token for non-localhost HTTP/SSE access |
| `REIN_DB` | Override database path |
| `REIN_CONFIG` | Override config file path |
| `REIN_LOG` | Log level filter (e.g. `debug`, `info`, `warn`) |
| `REIN_PROXY_BIND` | Override proxy bind address |
| `REIN_PROXY_PORT` | Override proxy port |
| `REIN_SSE_BIND` | Override SSE/HTTP bind address (default `127.0.0.1`) |
| `REIN_SSE_PORT` | Override SSE/HTTP port (default `8680`) |
| `REIN_PROXY_TOKEN` | Bearer token for non-localhost proxy access |

#### config.toml

```toml

[database]

path = "auto"                          # "auto" = ~/.rein/memories.db

[embedding]

provider = "google"    # or "omlx" or "none"

dimensions = 3072

[embedding.google]

model = "gemini-embedding-001"

[embedding.omlx]

endpoint = "http://localhost:8000/v1"

model = "default"

[search]

rrf_k = 60.0

rrf_fts_weight = 0.3

rrf_vec_weight = 0.7

fusion_method = "rrf"      # or "cc" (Convex Combination, Bruch 2023)

cc_alpha = 0.5             # CC blend: alpha * sparse + (1-alpha) * dense

dedup_similarity = 0.70    # uses max(jaccard, containment) similarity

dedup_time_window_days = 7

[chunking]

max_tokens = 512

overlap_percent = 10

metadata_prefix = true

[sync]

supermemory_enabled = true

auto_memory_enabled = true

auto_memory_glob = "~/.claude/projects/*/memory/**/*.md"

[decay]

base_lambda = 0.06

ltm_beta = 0.8

stm_beta = 1.2

interval_hours = 24

prune_threshold = 0.05

stm_to_ltm_access_count = 5

[server]

compact = false

sse_enabled = false

sse_port = 8680

sse_bind = "127.0.0.1"

```
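To make the `[search]` fusion settings concrete, here is a minimal Python sketch of the two methods. The parameter defaults mirror the config above. Applying `rrf_fts_weight` and `rrf_vec_weight` as per-channel multipliers, and min-max normalizing scores before the convex combination, are assumptions; rein's actual scoring may differ, and `rrf_fuse` and `cc_fuse` are hypothetical names.

```python
def rrf_fuse(fts_ranked, vec_ranked, k=60.0, fts_weight=0.3, vec_weight=0.7):
    """Weighted Reciprocal Rank Fusion over two ranked lists of ids."""
    scores = {}
    for weight, ranked in ((fts_weight, fts_ranked), (vec_weight, vec_ranked)):
        for rank, doc_id in enumerate(ranked, start=1):
            scores[doc_id] = scores.get(doc_id, 0.0) + weight / (k + rank)
    # Highest fused score first; ties broken by id for determinism.
    return sorted(scores, key=lambda d: (-scores[d], d))


def _min_max(scores):
    """Rescale raw scores to [0, 1] so the channels are comparable."""
    if not scores:
        return {}
    lo, hi = min(scores.values()), max(scores.values())
    if hi == lo:
        return {doc_id: 1.0 for doc_id in scores}
    return {doc_id: (s - lo) / (hi - lo) for doc_id, s in scores.items()}


def cc_fuse(sparse, dense, alpha=0.5):
    """Convex combination: alpha * sparse + (1 - alpha) * dense."""
    s, d = _min_max(sparse), _min_max(dense)
    fused = {
        doc_id: alpha * s.get(doc_id, 0.0) + (1 - alpha) * d.get(doc_id, 0.0)
        for doc_id in s.keys() | d.keys()
    }
    return sorted(fused, key=lambda doc_id: (-fused[doc_id], doc_id))


rrf_order = rrf_fuse(["a", "b", "c"], ["c", "a"])
cc_order = cc_fuse({"a": 10.0, "b": 5.0}, {"b": 0.9, "c": 0.1}, alpha=0.3)
```

RRF only looks at ranks, so it needs no score normalization. CC works on the scores themselves, which is what lets a learned `alpha` shift weight between the sparse and dense channels.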

### Database

The database is stored at `~/.rein/memories.db` by default. rein auto-migrates from the old location if needed.

Override with the `REIN_DB` environment variable or the `[database] path` config key.

### Hook Setup for Claude Code

Add the following to your Claude Code `settings.json` to enable automatic memory extraction:

```json

{

  "hooks": {

    "PostToolUse": [

      {

        "matcher": "",

        "hooks": [

          { "type": "command", "command": "rein hook post", "timeout": 10 }

        ]

      }

    ],

    "PreCompact": [

      {

        "matcher": "",

        "hooks": [

          { "type": "command", "command": "rein hook compact", "timeout": 10 }

        ]

      }

    ],

    "Stop": [

      {

        "matcher": "",

        "hooks": [

          { "type": "command", "command": "rein hook stop", "timeout": 30 }

        ]

      }

    ]

  }

}

```

**Hook behavior:**

- `PostToolUse` -- local pattern extraction (crash safety net) + buffers for session-end batch processing

- `PreCompact` -- records compact context for the async memory pipeline

- `Stop` -- queues full knowledge extraction: memories + concepts + links + episode summary via async worker

### Hook Setup for Codex CLI

Codex CLI hooks require `hooks = true` (Codex 0.129+) and either `~/.codex/hooks.json` or inline `[hooks]` tables in `~/.codex/config.toml`.

`rein init` now configures the Codex MCP entry and installs the Rein hooks:

```toml

[features]

hooks = true

```

```json

{

  "hooks": {

    "SessionStart": [

      {

        "matcher": "*",

        "hooks": [

          { "type": "command", "command": "REIN_AGENT_LABEL=codex rein hook session-start", "timeout": 5 }

        ]

      }

    ],

    "PreToolUse": [

      {

        "matcher": "*",

        "hooks": [

          { "type": "command", "command": "REIN_AGENT_LABEL=codex rein hook pre", "timeout": 5 }

        ]

      }

    ],

    "PermissionRequest": [

      {

        "matcher": "*",

        "hooks": [

          { "type": "command", "command": "REIN_AGENT_LABEL=codex rein hook permission", "timeout": 5 }

        ]

      }

    ],

    "PostToolUse": [

      {

        "matcher": "*",

        "hooks": [

          { "type": "command", "command": "REIN_AGENT_LABEL=codex rein hook post", "timeout": 10 }

        ]

      }

    ],

    "UserPromptSubmit": [

      {

        "hooks": [

          { "type": "command", "command": "REIN_AGENT_LABEL=codex rein hook prompt", "timeout": 5 }

        ]

      }

    ],

    "Stop": [

      {

        "hooks": [

          { "type": "command", "command": "REIN_AGENT_LABEL=codex rein hook stop", "timeout": 30 }

        ]

      }

    ]

  }

}

```

The Codex hook payload differs from Claude Code's payload. Rein understands the official Codex fields (`hook_event_name`, `tool_input`, `tool_response`, `prompt`, `last_assistant_message`, and `transcript_path`). `PostToolUse` and `Stop` feed the same async memory pipeline used by Claude Code hooks. `PreToolUse` and `PermissionRequest` are deny-only guardrails. `SessionStart` and `UserPromptSubmit` can emit official Codex `additionalContext` JSON when explicitly enabled:

```toml

[hooks.codex]

inject_prompt_context = true

inject_session_context = true

max_additional_context_chars = 4000

```

### Remote Access via HTTP/SSE

Start rein with SSE transport for remote or multi-client access:

```bash

rein serve --sse

```

By default, the server binds to `127.0.0.1:8680`.

To bind to a non-localhost address, you **must** set the `REIN_HTTP_TOKEN` environment variable for bearer token authentication:

```bash

export REIN_HTTP_TOKEN="your-secret-token"

```

Configure bind address and port in `config.toml`:

```toml

[server]

sse_enabled = true

sse_port = 8680

sse_bind = "0.0.0.0"    # requires REIN_HTTP_TOKEN

```
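The bind rule above amounts to a startup check: a loopback bind needs no token, anything else does. The Python sketch below illustrates that rule only. `is_loopback` and `check_bind` are hypothetical names, and the sketch does not model rein's `[server].auth` policy.

```python
import ipaddress


def is_loopback(bind: str) -> bool:
    """True for localhost, 127.0.0.0/8 and ::1; False for anything else."""
    if bind == "localhost":
        return True
    try:
        return ipaddress.ip_address(bind).is_loopback
    except ValueError:
        # Not an IP literal (for example a hostname): treat as non-loopback.
        return False


def check_bind(bind: str, env: dict) -> None:
    """Refuse a non-loopback bind unless REIN_HTTP_TOKEN is set."""
    if is_loopback(bind):
        return
    if not env.get("REIN_HTTP_TOKEN"):
        raise RuntimeError(
            f"binding to {bind} requires REIN_HTTP_TOKEN to be set"
        )


check_bind("127.0.0.1", {})                                      # allowed
check_bind("0.0.0.0", {"REIN_HTTP_TOKEN": "your-secret-token"})  # allowed
```

Calling `check_bind("0.0.0.0", {})` raises, which matches the comment on `sse_bind` in the config fragment above.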

### Transparent Proxy (v0.10.0)

rein can run as a transparent HTTP proxy that records LLM conversations without modifying requests. This works with any agent that supports base URL override.

#### Quick Start

```bash

# 1. Start the proxy (background)

rein serve --proxy &

# 2. Use with your agent

ANTHROPIC_BASE_URL=http://127.0.0.1:8690 claude       # Claude Code

codex -c 'model_providers.rein_proxy={ name = "Rein Proxy", base_url = "http://127.0.0.1:8690/v1", env_key = "OPENAI_API_KEY", wire_api = "responses", supports_websockets = false, env_http_headers = { "x-rein-token" = "REIN_PROXY_TOKEN" } }' -c 'model_provider="rein_proxy"'

```

#### Shell Aliases (recommended)

Add to `~/.zshrc` or `~/.bashrc` for convenience:

```bash

alias rein-proxy="rein serve --proxy &"

claudep() { REIN_PROXY_ACTIVE=1 ANTHROPIC_BASE_URL=http://127.0.0.1:8690 ANTHROPIC_CUSTOM_HEADERS="x-rein-token: ${REIN_PROXY_TOKEN:-}" claude "$@"; }

codexp() { REIN_PROXY_ACTIVE=1 codex -c 'model_providers.rein_proxy={ name = "Rein Proxy", base_url = "http://127.0.0.1:8690/v1", env_key = "OPENAI_API_KEY", wire_api = "responses", supports_websockets = false, env_http_headers = { "x-rein-token" = "REIN_PROXY_TOKEN" } }' -c 'model_provider="rein_proxy"' "$@"; }

codexsubp() { REIN_PROXY_ACTIVE=1 codex -c 'model_providers.rein_sub_proxy={ name = "Rein Subscription Proxy", base_url = "http://127.0.0.1:8690", requires_openai_auth = true, wire_api = "responses", supports_websockets = false }' -c 'model_provider="rein_sub_proxy"' -c 'chatgpt_base_url="http://127.0.0.1:8690/backend-api"' "$@"; }

codexsubpws() { REIN_PROXY_ACTIVE=1 codex -c 'model_providers.rein_sub_proxy_ws={ name = "Rein Subscription Proxy WS", base_url = "http://127.0.0.1:8690", requires_openai_auth = true, wire_api = "responses", supports_websockets = true }' -c 'model_provider="rein_sub_proxy_ws"' -c 'chatgpt_base_url="http://127.0.0.1:8690/backend-api"' "$@"; }

```

Then: `rein-proxy` to start, `claudep`, `codexp`, `codexsubp`, or `codexsubpws` to use. For ChatGPT-login Codex, `codexsubp` remains the recommended loopback entrypoint; smoke it with `./scripts/smoke_codexsubp.sh`. For the websocket-enabled path, use `codexsubpws` or `./scripts/smoke_codexsubp_ws.sh`.

The `codexsubp`/`codexsubpws` provider overrides are generated from `scripts/codexsubp_provider.toml.tmpl`, which is the single source of truth for `requires_openai_auth = true`.

#### Codex CLI Config (alternative)

Configure Codex CLI permanently in `~/.codex/config.toml` using a custom provider:

```toml

# Top-level keys must come before any [table] header; otherwise TOML
# parses them as members of that table.
model_provider = "rein_proxy"

[model_providers.rein_proxy]
name = "Rein Proxy"
base_url = "http://127.0.0.1:8690/v1"
env_key = "OPENAI_API_KEY"
wire_api = "responses"
supports_websockets = false
env_http_headers = { "x-rein-token" = "REIN_PROXY_TOKEN" }

```

This routes all Codex calls through the rein proxy by default, so the proxy must be running whenever Codex is used.

#### Supported Agents

| Agent | Configuration | Format |
|-------|--------------|--------|
| **Claude Code** | `ANTHROPIC_BASE_URL=http://127.0.0.1:8690` | Anthropic `/v1/messages` |
| **Codex CLI** | `codexp` shell function or custom `model_provider` in `~/.codex/config.toml` | OpenAI `/responses` |
| **Codex CLI (ChatGPT login)** | `codexsubp` shell function or `./scripts/smoke_codexsubp.sh` for smoke testing | ChatGPT first-party (`/responses`, `/models`, `/responses/compact`, `/memories/trace_summarize`, `/wham/*`, `/connectors/*`) |
| **Codex CLI (ChatGPT login, experimental WS-first)** | `codexsubpws` shell function or `./scripts/smoke_codexsubp_ws.sh` | Same first-party routes, but starts with websocket transport and relies on local `426` fallback when needed |
| **Cursor** | Settings > Override OpenAI Base URL | OpenAI `/v1/chat/completions` |
| **Windsurf** | Settings > Custom API Endpoint | OpenAI `/v1/chat/completions` |
| **Any OpenAI-compatible** | `OPENAI_BASE_URL=http://127.0.0.1:8690` | OpenAI `/v1/chat/completions` |

> **Note:** Codex subscription/OAuth login proxying is not the same as the API-key Responses API proxy above. For API-key Codex, keep using `codexp`. For ChatGPT-login Codex, `codexsubp` is still the recommended loopback entrypoint today: it keeps `requires_openai_auth = true`, points `chatgpt_base_url` at the local rein proxy, and disables websocket transport so the first-party backend stays on the local record-only path. rein now also has an experimental websocket-enabled path (`codexsubpws` / `smoke_codexsubp_ws.sh`) that starts with websocket transport and relies on local `426 Upgrade Required` fallback when upstream websocket is unavailable.

For ChatGPT-login Codex on loopback, `codexsubp` is the practical path today. It uses a custom provider with `requires_openai_auth = true` so Codex still uses ChatGPT login, but the provider itself points to the local rein proxy and disables websocket transport. `chatgpt_base_url` is also pointed at the local proxy so helper/discovery traffic (`/wham/*`, `/connectors/*`, `/v1/agent/register`, etc.) follows the same path. This keeps the subscription-login flow working over HTTP while the broader websocket and matrix automation work is hardened.

Even when a client attempts websocket upgrade directly, rein only upgrades the structured-text `/responses` path; non-`/responses` first-party routes stay on ordinary HTTP and retain their `artifact-mirror-only` behavior.

#### How it works

- Proxy intercepts `/v1/messages` (Anthropic), `/v1/chat/completions` (OpenAI), `/responses` / `/v1/responses` (Codex / OpenAI Responses API), transparently forwards `/backend-api/codex/*` (Codex first-party backend), and routes ChatGPT helper/discovery paths such as `/wham/*`, `/connectors/*`, `/v1/agent/register`, `/authenticate_app_v2`, and `/codex/safety/arc` to the ChatGPT backend root

- Requests are forwarded **unmodified** (record-only, no injection)

- Assistant responses are asynchronously extracted and stored as memories on the standard public path and on first-party Codex `/responses`; other first-party routes stay `artifact-mirror-only` and are mirrored as raw artifacts without structured extraction

- SSE streaming is passed through byte-for-byte with zero latency impact

- Extraction runs on a dedicated blocking thread that keeps a resident `SqliteStore`

- Other endpoints (e.g. `/v1/models`) are passed through unmodified
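
The routing rules in the list above can be pictured as a small path classifier. The sketch below is illustrative Python, not rein's Rust implementation: the names `Route` and `classify`, the upstream labels, and the exact matching rules (for example treating `/backend-api/codex/responses` as the first-party `/responses` route) are assumptions made for this example. Only the paths and the extraction policy come from the list.

```python
from typing import NamedTuple


class Route(NamedTuple):
    upstream: str   # label for the upstream family (labels are illustrative)
    extract: bool   # True = structured extraction, False = mirror/pass-through only


# Helper/discovery paths that the list above routes to the ChatGPT backend root.
CHATGPT_HELPER_PREFIXES = ("/wham/", "/connectors/")
CHATGPT_HELPER_EXACT = ("/v1/agent/register", "/authenticate_app_v2", "/codex/safety/arc")


def classify(path: str) -> Route:
    """Map a request path to an upstream and an extraction policy (hypothetical helper)."""
    if path == "/v1/messages":
        return Route("anthropic", True)
    if path == "/v1/chat/completions":
        return Route("openai", True)
    if path in ("/responses", "/v1/responses"):
        return Route("responses", True)
    if path.startswith("/backend-api/codex/"):
        # First-party Codex: only the /responses route is extracted (assumed exact match);
        # every other first-party route is artifact-mirror-only.
        return Route("codex", path == "/backend-api/codex/responses")
    if path.startswith(CHATGPT_HELPER_PREFIXES) or path in CHATGPT_HELPER_EXACT:
        return Route("chatgpt", False)
    # Anything else (for example /v1/models) is passed through unmodified.
    return Route("passthrough", False)
```

In every branch the request itself is forwarded unchanged; the `extract` flag only decides what happens to the response after it has been relayed.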

#### Configuration

```toml

[proxy]

port = 8690

bind = "127.0.0.1"

anthropic_upstream = "https://api.anthropic.com"

openai_upstream = "https://api.openai.com"

chatgpt_upstream = "https://chatgpt.com/backend-api"

codex_upstream = "https://chatgpt.com/backend-api"

extract_enabled = true    # record memories from responses

store_min_chars = 220     # skip short responses

store_min_score = 3       # quality threshold for extraction

```
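
As a rough illustration of how the three extraction settings above could combine, here is a minimal Python sketch. The dataclass, the `should_store` helper, and the inclusive comparisons are assumptions for this example; how rein actually computes the quality score is not described here, so the score is taken as an input.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ProxyExtractConfig:
    extract_enabled: bool = True   # record memories from responses
    store_min_chars: int = 220     # skip short responses
    store_min_score: int = 3       # quality threshold for extraction


def should_store(text: str, score: int,
                 cfg: ProxyExtractConfig = ProxyExtractConfig()) -> bool:
    """Return True when a response passes all three gates (hypothetical helper)."""
    if not cfg.extract_enabled:
        return False
    if len(text) < cfg.store_min_chars:
        return False  # too short to be worth storing
    return score >= cfg.store_min_score  # assumed to be an inclusive threshold
```

For example, `should_store("x" * 219, 5)` is rejected on length alone, regardless of its score.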

**Security:** Non-localhost binds require `REIN_PROXY_TOKEN`. Auth headers are forwarded opaquely and never logged.

### Async Memory Pipeline (v0.10.0)

Memory extraction is now fully asynchronous. Hooks queue jobs to a file-based queue, and a background worker processes them with LLM extraction, dedup, and persistence.

```bash

# Manually drain the queue (usually automatic via spawn)

rein worker memory

```

**Architecture:**

- `hook_post` / `hook_compact` / `hook_stop` queue jobs to `~/.rein/memory_queue_.jsonl`

- Background worker (`rein worker memory`) processes jobs with exponential backoff and dead-lettering

- Cross-session dedup via fingerprint + content similarity

- **Working set** — project-scoped memory surface updated on each extraction

- **Always-on index** — stable, high-quality summaries for project-level context

**Configuration:**

```toml

[async_memory]

max_retries = 3

base_backoff_ms = 2000

max_jobs_per_run = 32

batch_size = 8

spawn_cooldown_ms = 1500

max_working_set_items = 40

max_always_on_items = 24

```
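
To make the retry settings concrete, the sketch below shows one way `max_retries` and `base_backoff_ms` could drive exponential backoff with dead-lettering. It is a Python illustration under stated assumptions: the doubling factor and the `next_delay_ms` helper are hypothetical, and the worker's real schedule may differ.

```python
from typing import Optional

MAX_RETRIES = 3          # [async_memory].max_retries
BASE_BACKOFF_MS = 2000   # [async_memory].base_backoff_ms


def next_delay_ms(attempt: int) -> Optional[int]:
    """Delay before retry number `attempt` (1-based), or None to dead-letter the job.

    ASSUMPTION: the delay doubles on every retry.
    """
    if attempt < 1:
        raise ValueError("attempt is 1-based")
    if attempt > MAX_RETRIES:
        return None  # retries exhausted: the job goes to the dead-letter path
    return BASE_BACKOFF_MS * 2 ** (attempt - 1)
```

Under these assumptions a failing job waits 2000 ms, 4000 ms, and 8000 ms before its three retries and is dead-lettered after that.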

### Neural Wiki GUI (v0.11.0)

rein includes a built-in web GUI for visual exploration of your memory system. The GUI is embedded in the binary via `rust-embed` — no separate web server needed.

#### Quick Start

```bash

# Build with GUI support

cd crates/rein/gui && npm ci && npm run build && cd ../../..

cargo install --path crates/rein --locked --features gui

# Start the server with GUI enabled (implies --sse)

rein serve --gui

# Open in browser

open http://localhost:8680

```

The GUI is available at `http://localhost:8680/` when running with `--gui`. API endpoints are at `/api/*` and MCP at `/mcp`.

#### Pages

| Page | Description |
|------|-------------|
| **Dashboard** | Overview stats, recent memories with tier badges (Hot/Warm/Cold) |
| **Brain View** | "Neon Neurons" force-directed graph of all memories: tier-colored glowing nodes, search highlight, time slider |
| **Memories** | Card grid with search, topic/tier filters, detail slide-over panel, delete with confirmation |
| **Adaptive Engine** | 6-panel dashboard: learned alpha values, tier distribution, 17-feature reranker weights, event counts, K-M survival curves, cluster stats |
| **Knowledge Graph** | Per-memoir force-directed concept graph with relation-colored edges, concept inspection panel |
| **Timeline** | Date-range filtered chronological view of episodes and memory events |
| **Artifacts** | Session transcript viewer with turn-by-turn styling |
| **Settings** | Polling interval (1-60s), auth token input |

#### Authentication

API endpoints (`/api/*`, `/mcp`) require a bearer token when `REIN_HTTP_TOKEN` is set. The GUI itself is served without auth so the SPA can bootstrap and show a token input dialog. Set the token in the Settings page. The v1.0 `/v1/*` alias mirrors `/api/*` for programmatic clients but accepts **only** a header token (`Authorization: Bearer …` / `x-rein-token`), not the browser session cookie (which is scoped to `Path=/api`); browser/GUI clients keep using `/api`.
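
For programmatic clients, the header-only rule on `/v1/*` might look like the following Python sketch. It only builds the request and does not send it. The helper name `build_v1_request` and the path `/v1/example` are placeholders for illustration, not documented rein endpoints.

```python
import urllib.request


def build_v1_request(base_url: str, path: str, token: str) -> urllib.request.Request:
    """Build (but do not send) a header-token-authenticated request to the /v1 alias."""
    if not path.startswith("/v1/"):
        raise ValueError("programmatic clients should target the /v1/* alias")
    req = urllib.request.Request(base_url.rstrip("/") + path)
    # The /v1 alias accepts only a header token; the GUI session cookie is not sent.
    req.add_header("Authorization", f"Bearer {token}")
    return req


req = build_v1_request("http://localhost:8680", "/v1/example", "your-secret-token")
```

The token value would be the one configured through `REIN_HTTP_TOKEN`.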

#### Configuration

```toml

[server]

gui_enabled = false    # enable GUI (or use --gui flag)

sse_port = 8680        # port for HTTP/SSE/GUI

sse_bind = "127.0.0.1" # bind address

```

#### Development

The frontend source lives in `gui/` (React 18 + TypeScript + Tailwind + Vite).

```bash

cd gui

npm install

npm run dev    # Dev server at localhost:5173, proxies API to localhost:8680

npm run build  # Build to gui/dist/ (embedded by rust-embed at compile time)

```

### Architecture

```mermaid

flowchart TD

    U[User / AI Agent]

    CLI[CLI\n20+ commands]

    MCP[MCP Server\n40 tools · stdio / HTTP / SSE]

    GUI[Neural Wiki GUI\nReact + Tailwind]

    PXY[Proxy\nClaude · Codex subscription · record-only]

    U --> CLI

    U --> MCP

    U --> GUI

    U --> PXY

    CORE[rein core]

    CLI --> CORE

    MCP --> CORE

    GUI -->|inventory-backed REST API| CORE

    PXY -.->|async queue| CORE

    REC[Recall Pipeline\n3-channel + RRF/CC + rerank + canonical-first]

    ST[Store · Dedup · Evolve\nauto-link · provenance-preserving merge]

    HK[Hooks\npost · compact · stop]

    ADP[Adaptive Engine\nM1-M6 + A1]

    KG[Knowledge Graph\nmemoir · concept · episode · temporal links]

    CORE --> REC

    CORE --> ST

    CORE --> HK

    CORE --> ADP

    CORE --> KG

    DB[(SQLite memories.db\nmemories · FTS5 · sqlite-vec)]

    TN[Tantivy BM25 side index]

    US[usearch HNSW side index]

    REC --> DB

    ST --> DB

    HK --> ST

    ADP --> DB

    KG --> DB

    ST -.fire-and-forget.-> TN

    ST -.fire-and-forget.-> US

    REC -.reads.-> TN

    REC -.reads.-> US

    style DB fill:#6af,color:#000

    style CORE fill:#f96,color:#000

```

**Storage is the single source of truth** (`memories.db`): SQLite with FTS5 + sqlite-vec. Tantivy and usearch side indexes are derived, auto-rebuilt, and queried by the recall pipeline — storage writes update them fire-and-forget so hot-path latency stays unaffected.

#### Search Pipeline

Two independent search paths run in parallel, then merge:

**Text path:**

1. **Tantivy BM25** -- full-text search with BM25 ranking (falls back to FTS5 if Tantivy unavailable)

**Vector path:**

2. **Cache check** -- look up the query embedding in the local cache (keyed by model + query)

3. **Embed API** (cache miss only) -- call Google gemini-embedding-001 or OMLX and cache the result

4. **HNSW search** -- O(log n) approximate nearest neighbor via usearch (falls back to sqlite-vec)

**Merge:**

5. **RRF/CC fusion** -- Reciprocal Rank Fusion or Convex Combination merges text + vector results (path quality gating excludes empty paths)

6. **Adaptive scoring** -- Per-cluster Kaplan-Meier survival curves (or Ebbinghaus cold-start fallback) weight final ranking + temporal filtering

7. **Cross-validation** -- compare with Supermemory + auto-memory results, assign confidence
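
The fusion in step 5 can be sketched as weighted Reciprocal Rank Fusion. This is a minimal Python illustration, not rein's implementation: the defaults mirror the `[search]` config values (`rrf_k = 60.0`, `rrf_fts_weight = 0.3`, `rrf_vec_weight = 0.7`), while the function name and the tie-breaking by ID are assumptions.

```python
from collections import defaultdict
from typing import Dict, List


def rrf_fuse(text_ranked: List[str], vec_ranked: List[str],
             k: float = 60.0, w_text: float = 0.3, w_vec: float = 0.7) -> List[str]:
    """Weighted RRF over two ranked ID lists (best first); returns fused IDs, best first."""
    scores: Dict[str, float] = defaultdict(float)
    for weight, ranked in ((w_text, text_ranked), (w_vec, vec_ranked)):
        if not ranked:
            continue  # an empty path contributes nothing to the fusion
        for rank, doc_id in enumerate(ranked, start=1):
            scores[doc_id] += weight / (k + rank)
    # Highest fused score first; ties broken by ID so the order is deterministic.
    return sorted(scores, key=lambda d: (-scores[d], d))
```

A document that appears in both lists accumulates score from each path, which is why `rrf_fuse(["a", "b"], ["b", "c"])` ranks `b` first.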

#### Embedding Backends

rein uses an `EmbedderKind` enum dispatch to support multiple embedding backends:

- **Google** (`gemini-embedding-001`) -- default, 3072 dimensions; provider benchmark details are documented in `docs/reference/bibliography.md`

- **OMLX** -- local embedding via OpenAI-compatible API endpoint

Set `[embedding] provider` to `"google"`, `"omlx"`, or `"none"` in config.

#### Proxy / Endpoint Override

For users in China or behind firewalls, all API endpoints are configurable:

**Direct proxy (Cloudflare Worker, Nginx reverse proxy):**

```toml

[embedding.google]

endpoint = "https://your-gemini-proxy.com"

# Requests: {endpoint}/v1beta/models/gemini-embedding-001:embedContent

[sync]

endpoint = "https://your-supermemory-proxy.com"

```

**OpenRouter or other OpenAI-compatible aggregators:**

```toml

[embedding]

provider = "omlx"

[embedding.omlx]

endpoint = "https://openrouter.ai/api/v1"

model = "google/gemini-embedding-001"

```

This works because the OMLX backend uses the OpenAI `/v1/embeddings` format, which is compatible with OpenRouter, LiteLLM, and similar services.
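
For reference, an OpenAI-style embeddings request carries a JSON body with `model` and `input` fields. The Python sketch below builds such a request without sending it. The helper name, the example key, and the way the URL is joined (appending `/embeddings` to an endpoint that already ends in `/v1`) are assumptions about how a client would call the endpoint, not a description of rein's internals.

```python
import json
import urllib.request


def build_embedding_request(endpoint: str, model: str, text: str,
                            api_key: str) -> urllib.request.Request:
    """Build (but do not send) an OpenAI-style embeddings request."""
    body = json.dumps({"model": model, "input": text}).encode("utf-8")
    return urllib.request.Request(
        endpoint.rstrip("/") + "/embeddings",  # the endpoint is assumed to end in /v1
        data=body,
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {api_key}",
        },
        method="POST",
    )


req = build_embedding_request(
    "https://openrouter.ai/api/v1", "google/gemini-embedding-001",
    "connection pool fix", "sk-example",
)
```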

#### Memory Decay Model

- **Critical** memories never decay (strength = 1.0 forever)

- **STM** (Short-Term Memory): faster decay (beta = 1.2), promoted to LTM via cluster survival curve (fallback: 5 accesses)

- **LTM** (Long-Term Memory): slower decay (beta = 0.8), assigned to high / critical importance

- Access count slows decay: `lambda_eff = lambda / (1 + access_count * 0.2)`
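
The access-count rule above is easy to check numerically. In the Python sketch below, `effective_lambda` follows the stated formula with `base_lambda = 0.06` from the `[decay]` config. The `strength` function is only an assumption added for illustration: the exact functional form of the decay curve is not given here, so a plain exponential scaled by `beta` stands in for it.

```python
import math

BASE_LAMBDA = 0.06     # [decay].base_lambda
ACCESS_DAMPING = 0.2   # the 0.2 factor in the formula above


def effective_lambda(access_count: int, base_lambda: float = BASE_LAMBDA) -> float:
    """lambda_eff = lambda / (1 + access_count * 0.2)."""
    return base_lambda / (1 + access_count * ACCESS_DAMPING)


def strength(days: float, access_count: int, beta: float, critical: bool = False) -> float:
    """Illustrative strength curve. ASSUMPTION: exp(-lambda_eff * beta * days)."""
    if critical:
        return 1.0  # critical memories never decay
    return math.exp(-effective_lambda(access_count) * beta * days)
```

With five accesses the effective rate halves (0.06 to 0.03), and under the assumed curve an STM memory (beta = 1.2) fades faster than an LTM memory (beta = 0.8) of the same age.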

### Supported Clients

`rein init` auto-detects and configures:

- Claude Code

- Claude Desktop

- Cursor

- Windsurf

- VS Code (Copilot)

- Gemini CLI

- Codex

- OpenCode

### Performance Targets

| Metric | Target |
|--------|--------|
| Tantivy BM25 search | < 1 ms |
| HNSW ANN search | < 1 ms |
| FTS5 fallback search | < 1 ms |
| Vector search (cached) | < 1 ms |
| Vector search (API) | < 300 ms |
| Store (with dedup) | < 5 ms |
| Memory footprint | 2-5 MB |
| Binary size (release) | ~13 MB (CLI), ~16 MB (with GUI) |

### Cost Estimate

| Component | Free tier | Cost at scale |
|-----------|-----------|---------------|
| gemini-embedding-001 | 1500 req/day | ~$0.00 |
| Supermemory | Optional | Free tier available |
| SQLite storage | Local | $0.00 |
| **Total** | **$0.00/month** | **< $0.03/month** |

### License

**Copyright (C) 2026 lyr1cs.** All rights reserved except as licensed under AGPL-3.0-or-later.

**AGPL-3.0-or-later** — see [LICENSE](LICENSE).

rein is a server (MCP / REST / GUI). The AGPL §13 network-use clause means: if you run a modified version of rein **as a service that users interact with over a network**, you must provide those users access to the modified source code. Self-hosted personal use, internal-only deployment within your organization, and integrations that talk to rein over its public API (Claude Code, Cursor, IDE plugins, etc.) are all unaffected.

If you need a non-AGPL license for commercial / proprietary use, the project's copyright holder (`lyr1cs`) retains the right to dual-license — open an issue.

---

## 中文 (Chinese)


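### Project Overview

rein is a self-adaptive memory system designed for AI coding agents. It stores, retrieves, and manages memories across sessions, and uses feedback events and slow-path learning to progressively reduce its reliance on fixed parameters.

**Current release: `v1.0.0`** (2026-05-31), the 1.0 freeze. Built on the v0.38 schema-versioning foundation, v1.0 commits to a stable surface: the baseline schema is **frozen** behind the forward-migration framework (guarded by regression tests), the **40 MCP tool arg-schemas are pinned**, the **MSRV is pinned to Rust 1.86** (CI-gated), config gains a **`config_version`** key plus a load-time downgrade guard, and a **`/v1/*` versioned REST alias** (header-token authed; the GUI session cookie stays `Path=/api`) ships alongside `/api`. The headline feature is **#A5 triple persistence**: the framework's first real forward migration (the `memory_triples` fact table at schema version 2), gated by the default-off `[dedup].persist_triples` flag; the default recall/dedup path is bit-identical to v0.38. MSRV-honesty fix: three `is_multiple_of` (Rust 1.87) uses were replaced with `%`. 1470 lib tests pass; clippy + fmt clean; agent-team audit (PII / correctness / hygiene / docs) clean. License: AGPL-3.0-or-later. See [Recent releases](#最近版本) below.

The full English manual is at [docs/manual/README.md](docs/manual/README.md); the citation table and the command/API quick reference are under [docs/reference/](docs/reference/).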

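### Core Features

| Feature | Description |
|------|------|
| **40 MCP tools** | Core memory operations, knowledge graph, temporal recall, adaptive maintenance, ARS feedback (Cap A mirroring, Cap B synthesis, Cap C archive summaries), runtime LLM judge enqueueing, ARS acceleration release-gate checks, and Trust & Measurement reports. Every operation is declared once via the `#[op]` macro (v0.21+) and shared by the CLI, MCP, and REST surfaces. |
| **Adaptive engine** | M1-M6 + A1: event sourcing → counterfactual alpha learning → KM survival curves → HDBSCAN clustering → three-tier classification → threshold exploration |
| **Counterfactual alpha optimization** | Replays historical recalls to learn the optimal CC fusion weights globally, per query type, and **per cluster** (M2) |
| **Per-cluster KM decay + global prior** | Kaplan-Meier survival curves replace the fixed forgetting curve; a global prior curve covers cold-start clusters (M3) |
| **HDBSCAN semantic clustering** | Pure-Rust implementation: dendrogram → condensed tree → EOMBST, with automatic sampling for large datasets (M4) |
| **Hot/Warm/Cold tiering** | Streaming quantile estimator + cold_archive migration (M5) |
| **Adaptive dedup threshold (A1)** | Wired end to end: store / batch / vec dedup all use a per-cluster P90 threshold, with 0.70 as the global fallback |
| **Provenance-preserving dedup** | Merges keep temporal anchors and unique details, so no information is lost |
| **Embedding-based semantic dedup** | Vector similarity catches paraphrases that text similarity misses; runs on the GC slow path |
| **Temporal knowledge graph** | Memoir / Concept / ConceptLink, 9 relation types, revision history, Episode nodes, time windows |
| **Autonomous retrieval routing** | Rule-based classifier with 6 strategies: Episodic / Temporal / Preference / ExactKeyword / Semantic / Exploratory (zero LLM calls) |
| **Query expansion** | An LLM rewrites the query into 2-3 variants (Gemini Flash Lite / OMLX), merged before multi-path fusion |
| **LLM reranking** | Gemini / OMLX re-scores the top-N candidates; bypassed when confidence is high (strong-signal bypass) |
| **Maximal Marginal Relevance (MMR)** | Post-rerank diversity pass that balances relevance against result diversity |
| **OMLX local embeddings** | Optional local embedding backend (Google / OMLX) |
| **Dual-path search** | Tantivy BM25 + HNSW ANN → RRF/CC fusion (learned weights) |
| **Multi-source cross-validation** | 3 sources (local, hook extraction, Supermemory) + confidence scoring |
| **Multi-factor admission control** | A-MAC 2026: llm_conf + novelty + type_prior + recency scoring |
| **Semantic chunking** | Splits by heading / paragraph / sentence and attaches a metadata prefix at embedding time |
| **Tantivy + FTS5 text search** | Tantivy BM25 side index + SQLite FTS5 fallback; the CJK lexical path is covered by jieba-rs + character bigrams |
| **Supermemory v4 API** | Hybrid-search cross-validation via `api.supermemory.ai/v4/search` |
| **Zero local models** | No GPU required by default (optional OMLX local backend) |
| **~2-5 MB footprint** | Single SQLite file + FTS5 + sqlite-vec |
| **gemini-embedding-001** | Default Google embedding model, 3072 dimensions; benchmark claims are attributed to the provider docs and the bibliography |
| **20+ CLI commands** | Everything the MCP tools offer, plus init, config, migrate, hooks, recent, gc, organize, upgrade |
| **Auto-configuration** | `rein init` auto-detects and configures 8 MCP clients |
| **Neural Wiki GUI** | React + Tailwind web dashboard: Brain View, Adaptive Engine, Knowledge Graph, Timeline, and more |
| **Hybrid CJK dedup tokenization** | jieba-rs Chinese segmentation + character bigrams, covering dedup and search for CJK text |
| **Per-cluster admission control** | Admission thresholds and novelty computation are aware of the HDBSCAN cluster context |
| **Evidence second-pass rerank** | Low-confidence / single-source recall results can be boosted when evidence content matches |
| **Survival-curve-driven STM promotion** | STM→LTM promotion uses the cluster survival curve (when available) |
| **Cross-topic embedding dedup** | check_dedup draws candidates from both FTS and embeddings, catching cross-topic semantic duplicates |
| **Chunked session extraction** | Long sessions are split at natural boundaries and deduplicated/merged across chunks instead of being truncated |
| **Context-aware extraction** | Existing memories are injected before extraction so the LLM outputs only incremental knowledge |
| **Automatic topic inference** | The rule-based fallback path infers a topic category from keywords instead of using "auto-extracted" |
| **Remote access** | HTTP / SSE transport with bearer-token authentication |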

### Installation

#### From source

```bash
git clone https://github.com/lyr1cs/rein.git
cd rein

# Standard build (CLI + MCP server)
cargo install --path crates/rein --locked

# Full build (includes the Neural Wiki GUI, recommended)
cd crates/rein/gui && npm ci && npm run build && cd ../../..
cargo install --path crates/rein --locked --features gui
```

Or use the install script:

```bash
./scripts/install.sh
```

#### Prerequisites

- Rust toolchain (1.86+, the MSRV pinned in v1.0)

- Gemini API key (free tier: 1500 requests/day)

#### GUI Service Management

```bash
# Start the GUI service in the background (listens on :8680)
rein gui on

# Stop the GUI service
rein gui off

# Or run MCP + GUI in the foreground
rein serve --gui

# Open in a browser
open http://localhost:8680
```

### Quick Start

```bash
# 1. Set the API key
export GEMINI_API_KEY="your-key-here"

# 2. Auto-configure every detected MCP client
rein init

# 3. Start the MCP server (usually launched automatically by the client)
rein serve
```

### CLI Command Reference

| Command | Description | Example |
|------|------|------|
| `serve` | Start the MCP server (stdio, SSE, or proxy) | `rein serve [--compact] [--sse] [--proxy]` |
| `store` | Store a memory | `rein store -t debug -c "OOM fix" -I high -k oom,memory` |
| `recall` | Search memories | `rein recall "connection pool" -t debug -l 5` |
| `forget` | Delete a memory by ID | `rein forget 01J...` |
| `update` | Update a memory's content | `rein update 01J... -c "new content" -I critical` |
| `topics` | List all topics | `rein topics` |
| `stats` | Show storage statistics | `rein stats` |
| `health` | Check topic health | `rein health [topic]` |
| `consolidate` | Batch-merge one or more topics into condensed memories | `rein consolidate --pattern 'rmcp*' --merge-variants --dry-run` |
| `dedup` | Scan for and remove duplicates, optionally across topic variants | `rein dedup [--dry-run] [--merge-variants]` |
| `cleanup` | One-shot consolidate + dedup + adaptive refresh | `rein cleanup [topic] [--pattern 'rmcp*'] [--all] [--dry-run]` |
| `migrate` | Import from QMD / rebuild indexes | `rein migrate [--from-qmd path] [--reindex]` |
| `init` | Auto-configure MCP clients | `rein init [--dry-run]` |
| `config` | Show the current configuration | `rein config` |
| `canonicals` | List canonical memories | `rein canonicals [-l 20]` |
| `evidence` | Show the evidence snapshot for a canonical | `rein evidence  [-l 20]` |
| `dedup-log` | Show recent dedup decision log entries | `rein dedup-log [--canonical ID] [-l 20]` |
| `hook session-start` | Optionally inject Codex project memory context | `rein hook session-start` |
| `hook pre` | Codex PreToolUse deny-only guardrail | `rein hook pre` |
| `hook permission` | Codex PermissionRequest deny-only guardrail | `rein hook permission` |
| `hook post` | Extract facts from tool output | `rein hook post` |
| `hook compact` | Save context before compaction | `rein hook compact` |
| `hook prompt` | Optionally inject relevant memory context on Codex UserPromptSubmit | `rein hook prompt` |
| `hook stop` | Full knowledge extraction at session end | `rein hook stop` |
| `recent` | Show recent memories | `rein recent [-l 20]` |
| `gc` | Garbage-collect weak STM memories | `rein gc [--dry-run]` |
| `organize` | Auto-link memories | `rein organize` |
| `upgrade` | Upgrade legacy memories into the knowledge graph | `rein upgrade [--topic X] [--dry-run]` |
| `resummerize` | LLM-driven canonical recompression (v0.23) | `rein resummerize [--dry-run] [--canonical-id ID]` |
| `worker memory` | Drain the async memory queue | `rein worker memory` |
| `worker dedup-queue` | Drain the store gray-zone dedup job queue | `rein worker dedup-queue` |
| `worker cleanup-queue` | Drain the cleanup job queue | `rein worker cleanup-queue` |
| `dashboard` | Show service status, metrics, and memory stats | `rein dashboard` |
| `gui on/off` | Start/stop the GUI service in the background | `rein gui on` |
| `proxy on/off` | Start/stop the proxy service in the background | `rein proxy on` |

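### How Cleanup Works (Provenance-Preserving)

rein's cleanup pipeline is **provenance-preserving**: it never hard-deletes information. It runs in three stages:

1. **Consolidation** -- groups topic variants (such as `Docker Deployment` / `docker-deployment`) and merges all memories in each group into one high-quality canonical memory. The original memories are saved as evidence in the `memory_evidence` table, keeping their original content, timestamps, and keywords.

2. **Dedup** -- scans each topic group for content-level duplicates using lexical similarity (Jaccard + containment) and optional embedding cosine similarity. The unique content of the matched "loser" is appended to the "winner" (with the provenance marker `[merged from  on ]`) and then recorded as evidence.

3. **Adaptive refresh** -- once consolidation and dedup finish, the adaptive engine (M1-M6) runs: HDBSCAN re-clustering, survival-curve rebuild, tier-boundary update, and alpha/threshold learning over the new events.

Every merge decision is recorded in the append-only `dedup_decisions` ledger, including winner/loser IDs, score, relation type, confidence, and actor. This is rein's reflog: you can always trace how a canonical memory was formed.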
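
The lexical similarity used in the dedup stage can be sketched in a few lines of Python. This is an illustration only: the whitespace tokenizer is a stand-in (rein's CJK path uses jieba-rs plus character bigrams), containment is assumed to mean overlap divided by the smaller token set, and the 0.70 threshold is the `dedup_similarity` value from `config.toml`.

```python
def tokens(text: str) -> set:
    """Whitespace tokenizer (assumption for this sketch)."""
    return set(text.lower().split())


def dedup_similarity(a: str, b: str) -> float:
    """max(jaccard, containment) over the two token sets."""
    ta, tb = tokens(a), tokens(b)
    if not ta or not tb:
        return 0.0
    inter = len(ta & tb)
    jaccard = inter / len(ta | tb)
    containment = inter / min(len(ta), len(tb))  # overlap relative to the smaller set
    return max(jaccard, containment)


def is_duplicate(a: str, b: str, threshold: float = 0.70) -> bool:
    """True when the pair reaches the dedup threshold (assumed inclusive)."""
    return dedup_similarity(a, b) >= threshold
```

Containment is what lets a short memory that is fully covered by a longer one count as a duplicate even when the Jaccard score alone is low.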

```bash
# Preview the cleanup (safe)
rein cleanup --all --dry-run

# Clean up a specific topic
rein cleanup "docker-deployment"

# Full-database cleanup
rein cleanup --all

# Run cleanup through the worker entrypoint
rein worker cleanup --all
```

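`consolidate` remains compatible with the legacy form `rein consolidate  -s "summary"` and adds:

- `--topics a,b,c`: batch-process an explicit topic list

- `--pattern 'rmcp*'`: batch-match topics by glob

- `--all`: process every topic

- `--merge-variants`: fold topic variants (case, spaces, hyphens, underscores, and so on) together before merging

- Omit `--summary`: rein generates the consolidated memory itself, preferring an LLM when one is available and falling back to local rules otherwise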
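
One way to picture `--merge-variants` is as a normalization step that maps every spelling of a topic to a single key. The Python sketch below is an assumption about the general idea, not rein's actual rule set: `normalize_topic` and `group_variants` are hypothetical helpers that fold case and collapse spaces, hyphens, and underscores.

```python
import re
from collections import defaultdict
from typing import Dict, Iterable, List


def normalize_topic(topic: str) -> str:
    """Lowercase and collapse runs of whitespace, hyphens, and underscores into one hyphen."""
    return re.sub(r"[\s_-]+", "-", topic.strip().lower()).strip("-")


def group_variants(topics: Iterable[str]) -> Dict[str, List[str]]:
    """Group raw topic names by their normalized key, keeping input order."""
    groups: Dict[str, List[str]] = defaultdict(list)
    for topic in topics:
        groups[normalize_topic(topic)].append(topic)
    return dict(groups)
```

With this rule, `Docker Deployment`, `docker-deployment`, and `docker_deployment` all land in the same group and would be consolidated together.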

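Batch consolidate generates each group's LLM summary/content asynchronously and in parallel, but SQLite writes are still committed as sequential transactions. After cleanup finishes, rein also writes adaptive feedback and refreshes the M1-M6 state once.

To run a full-database cleanup entirely from the terminal:

- Use `rein cleanup --all` for the destructive full-database cleanup

- Preview first with `rein cleanup --dry-run`

- Background-style cleanup is handled by `rein worker cleanup ...`, `rein worker cleanup-queue`, and the cleanup queue worker

Gray-zone dedup on the store hot path now also goes through a dedicated async queue:

- New memories are stored normally first, without blocking on a remote LLM

- The background `dedup-queue` worker then makes a structured judgment on each gray-zone pair

- To consume the queue manually, run `rein worker dedup-queue`

Observability commands:

- `rein canonicals` shows canonical memories with their support / merge counts

- `rein evidence ` shows the absorbed evidence snapshots

- `rein dedup-log` shows the recent dedup ledger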

### MCP Tools

When running as an MCP server (`rein serve`), Rein exposes 40 production MCP tools through the operation inventory. The authoritative list is maintained in [docs/reference/mcp-tools.md](docs/reference/mcp-tools.md), grouped as:

- Core memory: store, recall, update, forget, recent, topics, canonicals, evidence, stats, health.

- Maintenance: GC, dedup, concept dedup, organize, consolidate, cleanup, resummerize, archive summary refresh.

- Knowledge graph and temporal: memoir tools, concept state, concept summary refresh, timeline, concept history.

- Adaptive, session, ARS, and judge: feedback, adaptive status, session ingest, synthesis judge, concept-summary judge.

#### Knowledge Graph Relation Types

`part_of`, `depends_on`, `related_to`, `contradicts`, `refines`, `alternative_to`, `caused_by`, `instance_of`, `superseded_by`

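### LLM Extraction Layer (v0.3)

rein uses an LLM (Gemini 3.1 Flash Lite or a local model) for structured memory extraction and builds the knowledge graph automatically.

**Architecture:**

- `hook_post` -- local pattern extraction (crash safety net) + buffering to the session file

- `hook_compact` -- records the compact context and hands it to the async memory worker for distillation

- `hook_stop` -- full knowledge extraction: memories + concepts + relations + session summary (async worker)

- `hook_session_start` / `hook_prompt` -- optionally inject the Rein working surface via Codex additionalContext

- `hook_pre_tool_use` / `hook_permission_request` -- deny-only Codex guardrails that block obviously dangerous shell commands

**Upgrading legacy memories:**

```bash
rein upgrade --dry-run    # preview
rein upgrade              # convert legacy memories into the knowledge graph
```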

**Configuration:**

```toml
[extract]
provider = "google"    # or "omlx" or "none"

[extract.google]
model = "gemini-3.1-flash-lite-preview"
max_input_chars = 0    # 0 = no truncation (1M-token model)

[extract.omlx]
endpoint = "http://localhost:11434/v1"  # Ollama, LM Studio, vLLM, etc.
model = "default"
max_input_chars = 16000
```

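### Self-Learning Quality System (v0.3.0)

rein automatically learns which memories are useful and which are noise, with no manual tuning.

**How it works:**

1. The LLM emits a `quality_confidence` (0-1) at extraction time, at zero extra API cost

2. The system tracks recall → access patterns and classifies "good memories" (used) versus "bad memories" (recalled but never used)

3. Feature weights are learned from data automatically: usage rate, novelty, connectivity, recency

4. Adaptive admission threshold: low recent quality tightens it, high recent quality relaxes it

5. GC removes concepts that are low quality and were recalled 5+ times but never used

**No manual tuning required**: the cold start relies on LLM judgment, and data gradually takes over.

Based on: ICLR 2026 Admission Control, PropMem (Prosus), FActScore, MACLA.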

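### Configuration

rein loads configuration in the following priority order (higher priority overrides lower):

1. Environment variables

2. TOML config file (`$REIN_CONFIG` or `~/.config/rein/config.toml`)

3. Compile-time defaults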
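
The precedence order above amounts to a three-layer lookup. The Python sketch below illustrates it for two server settings. The `resolve` helper and the flat key names are assumptions for this example; the environment variable names and default values are the ones documented for `REIN_SSE_PORT` and `REIN_SSE_BIND`.

```python
import os
from typing import Any, Dict, Mapping, Optional

DEFAULTS: Dict[str, Any] = {"sse_port": 8680, "sse_bind": "127.0.0.1"}
ENV_NAMES = {"sse_port": "REIN_SSE_PORT", "sse_bind": "REIN_SSE_BIND"}


def resolve(key: str, file_cfg: Mapping[str, Any],
            env: Optional[Mapping[str, str]] = None) -> Any:
    """Environment variable first, then the TOML file value, then the built-in default."""
    env = os.environ if env is None else env
    raw = env.get(ENV_NAMES[key])
    if raw is not None:
        # Environment values are strings; coerce them to the default's type.
        return type(DEFAULTS[key])(raw)
    if key in file_cfg:
        return file_cfg[key]
    return DEFAULTS[key]
```

For example, with `sse_port = 9000` in the config file and `REIN_SSE_PORT=9100` exported, the lookup returns 9100.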

#### Environment Variables

| Variable | Description |
|------|------|
| `GEMINI_API_KEY` | Google Gemini API key (for embeddings) |
| `SUPERMEMORY_CC_API_KEY` | Supermemory API key (for cross-validation) |
| `REIN_HTTP_TOKEN` | Bearer token for non-localhost HTTP/SSE access |
| `REIN_DB` | Override the database path |
| `REIN_CONFIG` | Override the config file path |
| `REIN_LOG` | Log level filter (e.g. `debug`, `info`, `warn`) |
| `REIN_PROXY_BIND` | Override the proxy bind address |
| `REIN_PROXY_PORT` | Override the proxy port |
| `REIN_SSE_BIND` | Override the SSE/HTTP bind address (default `127.0.0.1`) |
| `REIN_SSE_PORT` | Override the SSE/HTTP port (default `8680`) |
| `REIN_PROXY_TOKEN` | Bearer token for a non-localhost proxy |

#### config.toml

```toml

[database]

path = "auto"                          # "auto" = ~/.rein/memories.db

[embedding]

provider = "google"    # or "omlx" or "none"

dimensions = 3072

[embedding.google]

model = "gemini-embedding-001"

[embedding.omlx]

endpoint = "http://localhost:8000/v1"

model = "default"

[search]

rrf_k = 60.0

rrf_fts_weight = 0.3

rrf_vec_weight = 0.7

dedup_similarity = 0.70    # uses max(jaccard, containment) similarity

dedup_time_window_days = 7

[chunking]

max_tokens = 512

overlap_percent = 10

metadata_prefix = true

[sync]

supermemory_enabled = true

auto_memory_enabled = true

auto_memory_glob = "~/.claude/projects/*/memory/**/*.md"

[decay]

base_lambda = 0.06

ltm_beta = 0.8

stm_beta = 1.2

interval_hours = 24

prune_threshold = 0.05

stm_to_ltm_access_count = 5

[server]

compact = false

sse_enabled = false

sse_port = 8680

sse_bind = "127.0.0.1"

```

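### Database

The database is stored at `~/.rein/memories.db` by default. rein automatically migrates data from legacy locations.

Override the path with the `REIN_DB` environment variable or the `[database] path` config key.

### Claude Code Hook Setup

Add the following to Claude Code's `settings.json` to enable automatic memory extraction: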

```json

{

  "hooks": {

    "PostToolUse": [

      {

        "matcher": "",

        "hooks": [

          { "type": "command", "command": "rein hook post", "timeout": 10 }

        ]

      }

    ],

    "PreCompact": [

      {

        "matcher": "",

        "hooks": [

          { "type": "command", "command": "rein hook compact", "timeout": 10 }

        ]

      }

    ],

    "Stop": [

      {

        "matcher": "",

        "hooks": [

          { "type": "command", "command": "rein hook stop", "timeout": 30 }

        ]

      }

    ]

  }

}

```

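**Hook behavior:**

- `PostToolUse` -- local pattern extraction (crash safety net) + buffering to the session file

- `PreCompact` -- records important context and hands it to the async memory worker

- `Stop` -- full knowledge extraction: memories + concepts + relations + session summary (via the async worker)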

### Codex CLI Hook Setup

Codex CLI requires `hooks = true` (Codex 0.129+), with hooks declared in `~/.codex/hooks.json` or in the `[hooks]` table of `~/.codex/config.toml`. `rein init` configures the Codex MCP entry and installs the following hooks:

- `SessionStart` -> `REIN_AGENT_LABEL=codex rein hook session-start`

- `PreToolUse` -> `REIN_AGENT_LABEL=codex rein hook pre`

- `PermissionRequest` -> `REIN_AGENT_LABEL=codex rein hook permission`

- `PostToolUse` -> `REIN_AGENT_LABEL=codex rein hook post`

- `UserPromptSubmit` -> `REIN_AGENT_LABEL=codex rein hook prompt`

- `Stop` -> `REIN_AGENT_LABEL=codex rein hook stop`

The Codex hook payload is not identical to Claude Code's. Rein recognizes `hook_event_name`, `tool_input`, `tool_response`, `prompt`, `last_assistant_message`, and `transcript_path`. `PostToolUse` and `Stop` feed the same async memory pipeline; `PreToolUse` and `PermissionRequest` are deny-only guardrails. When explicitly enabled, `SessionStart` and `UserPromptSubmit` can emit Codex's official `additionalContext` JSON:

```toml

[hooks.codex]

inject_prompt_context = true

inject_session_context = true

max_additional_context_chars = 4000

```

### Remote Access over HTTP/SSE

Start the SSE transport to support remote or multi-client access:

```bash
rein serve --sse
```

The default bind address is `127.0.0.1:8680`.

To bind to a non-localhost address, you **must** set the `REIN_HTTP_TOKEN` environment variable to enable bearer-token authentication:

```bash
export REIN_HTTP_TOKEN="your-secret-token"
```

Configure the bind address and port in `config.toml`:

```toml
[server]
sse_enabled = true
sse_port = 8680
sse_bind = "0.0.0.0"    # requires REIN_HTTP_TOKEN
```

### Transparent Proxy (v0.10.0)

rein can run as a transparent HTTP proxy that records LLM conversations without modifying requests. It works with any agent that allows a custom base URL.

#### Quick Start

```bash
# 1. Start the proxy (in the background)
rein serve --proxy &

# 2. Use it with your agent
ANTHROPIC_BASE_URL=http://127.0.0.1:8690 claude       # Claude Code

codex -c 'model_providers.rein_proxy={ name = "Rein Proxy", base_url = "http://127.0.0.1:8690/v1", env_key = "OPENAI_API_KEY", wire_api = "responses", supports_websockets = false, env_http_headers = { "x-rein-token" = "REIN_PROXY_TOKEN" } }' -c 'model_provider="rein_proxy"'
```

#### Shell Aliases (recommended)

Add to `~/.zshrc` or `~/.bashrc`:

```bash

alias rein-proxy="rein serve --proxy &"

claudep() { REIN_PROXY_ACTIVE=1 ANTHROPIC_BASE_URL=http://127.0.0.1:8690 ANTHROPIC_CUSTOM_HEADERS="x-rein-token: ${REIN_PROXY_TOKEN:-}" claude "$@"; }

codexp() { REIN_PROXY_ACTIVE=1 codex -c 'model_providers.rein_proxy={ name = "Rein Proxy", base_url = "http://127.0.0.1:8690/v1", env_key = "OPENAI_API_KEY", wire_api = "responses", supports_websockets = false, env_http_headers = { "x-rein-token" = "REIN_PROXY_TOKEN" } }' -c 'model_provider="rein_proxy"' "$@"; }

codexsubp() { REIN_PROXY_ACTIVE=1 codex -c 'model_providers.rein_sub_proxy={ name = "Rein Subscription Proxy", base_url = "http://127.0.0.1:8690", requires_openai_auth = true, wire_api = "responses", supports_websockets = false }' -c 'model_provider="rein_sub_proxy"' -c 'chatgpt_base_url="http://127.0.0.1:8690/backend-api"' "$@"; }

codexsubpws() { REIN_PROXY_ACTIVE=1 codex -c 'model_providers.rein_sub_proxy_ws={ name = "Rein Subscription Proxy WS", base_url = "http://127.0.0.1:8690", requires_openai_auth = true, wire_api = "responses", supports_websockets = true }' -c 'model_provider="rein_sub_proxy_ws"' -c 'chatgpt_base_url="http://127.0.0.1:8690/backend-api"' "$@"; }

```

Then: `rein-proxy` starts the proxy, and `claudep`, `codexp`, `codexsubp`, or `codexsubpws` use it. For ChatGPT-login Codex, `codexsubp` remains the recommended loopback entrypoint; for a regression smoke test, run `./scripts/smoke_codexsubp.sh`. To verify the websocket-first path, run the experimental `./scripts/smoke_codexsubp_ws.sh`.

The provider overrides for `codexsubp` / `codexsubpws` are both generated from `scripts/codexsubp_provider.toml.tmpl`; this template is the single source of truth for `requires_openai_auth = true`.

#### Codex CLI Config (alternative)

You can also configure Codex CLI permanently in `~/.codex/config.toml` using a custom provider:

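```toml
# `model_provider` is a top-level key, so it must come before any [table] header;
# placed after the header it would be parsed as a key of that table.
model_provider = "rein_proxy"

[model_providers.rein_proxy]
name = "Rein Proxy"
base_url = "http://127.0.0.1:8690/v1"
env_key = "OPENAI_API_KEY"
wire_api = "responses"
supports_websockets = false
env_http_headers = { "x-rein-token" = "REIN_PROXY_TOKEN" }
```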

This makes all Codex calls go through the rein proxy by default (the proxy must be started first).

#### Supported Agents

| Agent | Configuration | API format |
|-------|---------|----------|
| **Claude Code** | `ANTHROPIC_BASE_URL=http://127.0.0.1:8690` | Anthropic `/v1/messages` |
| **Codex CLI** | `codexp` shell function or a custom `model_provider` in `~/.codex/config.toml` | OpenAI `/responses` |
| **Codex CLI (ChatGPT login)** | `codexsubp` shell function, or `./scripts/smoke_codexsubp.sh` for smoke testing | ChatGPT first-party (`/responses`, `/models`, `/responses/compact`, `/memories/trace_summarize`, `/wham/*`, `/connectors/*`) |
| **Codex CLI (ChatGPT login, experimental WS-first)** | `codexsubpws` shell function, or `./scripts/smoke_codexsubp_ws.sh` | Same first-party routes, but tries websocket first and relies on the local `426` fallback when needed |
| **Cursor** | Settings > Override OpenAI Base URL | OpenAI `/v1/chat/completions` |
| **Windsurf** | Settings > Custom API Endpoint | OpenAI `/v1/chat/completions` |
| **Any OpenAI-compatible tool** | `OPENAI_BASE_URL=http://127.0.0.1:8690` | OpenAI `/v1/chat/completions` |

> **Note:** Proxying Codex subscription/OAuth login is not the same implementation as the API-key Responses API proxy above. API-key Codex keeps using `codexp`; for ChatGPT-login Codex, `codexsubp` is still the recommended entrypoint. It keeps `requires_openai_auth = true`, points `chatgpt_base_url` at the local rein proxy, and explicitly disables websocket transport, so the first-party backend, the helper/discovery paths, and the `/responses` recording path all stay on loopback. rein now also provides the experimental `codexsubpws` / `smoke_codexsubp_ws.sh`, which keeps websocket transport and relies on a local `426 Upgrade Required` fallback when the upstream websocket is unavailable. The remaining work is hardening and automation rather than filling in basic functionality.

For ChatGPT-login Codex on loopback, the most practical entrypoint today is `codexsubp`. It uses a custom provider with `requires_openai_auth = true`, so Codex still uses the ChatGPT login, but the provider itself points at the local rein proxy and explicitly disables websocket transport; `chatgpt_base_url` is also pointed at the local proxy so that model API and helper/discovery requests go through the proxy together. This path sidesteps the current upstream websocket 403/Cloudflare issue while pinning the subscription login to the local record-only route. Non-`/responses` first-party paths stay `artifact-mirror-only`: they are transparently forwarded and mirrored as raw artifacts, without structured extraction.

Even when a client initiates a websocket upgrade itself, rein only upgrades the structured-text `/responses` path; other first-party paths stay on ordinary HTTP and keep their `artifact-mirror-only` behavior.