https://github.com/bolnet/agent-memory

Embedded memory for AI agents. SQLite + pgvector + Neo4j. Sub-5ms retrieval.
agents ai claude-code llm mcp memory neo4j pgvector sqlite
Last synced: about 2 months ago
Host: GitHub
URL: https://github.com/bolnet/agent-memory
Owner: bolnet
License: apache-2.0
Created: 2026-03-07T19:55:30.000Z (3 months ago)
Default Branch: main
Last Pushed: 2026-03-25T20:39:03.000Z (about 2 months ago)
Last Synced: 2026-03-26T03:03:45.852Z (about 2 months ago)
Topics: agents, ai, claude-code, llm, mcp, memory, neo4j, pgvector, sqlite
Language: Python
Homepage: https://bolnet.github.io/agent-memory/
Size: 2.53 MB
Stars: 2
Watchers: 0
Forks: 1
Open Issues: 0
Metadata Files:
- Readme: README.md
- License: LICENSE
README

  Zero-config memory for AI agents. No Docker. No API keys. Just install and go.


---

## The Problem

AI agents forget everything between sessions. Every new conversation starts from zero — no memory of what you built yesterday, what decisions you made, or what your project even does.

Built-in memory solutions (like Claude Code's `MEMORY.md`) store flat files that load entirely into the context window on every message. There is no search, no ranking, and no contradiction handling. As your project grows, those files become a wall of text that burns tokens without helping.

## What Memwright Does

Memwright gives AI agents persistent, searchable memory that stays out of the context window until needed:

- **Ranked retrieval** — 3-layer search (tags + entity graph + vector similarity) returns only the most relevant memories

- **Token budgets** — Set a ceiling (e.g. 2,000 tokens). Memwright fits the best memories within that budget

- **Contradiction handling** — "User works at Google" automatically supersedes "User works at Meta"

- **Namespace isolation** — Multi-agent systems get isolated memory partitions per agent, user, or project

- **Zero config** — `poetry add memwright`, add one JSON block, done

---

## Table of Contents

- [Quick Start](#quick-start)

- [Architecture](#architecture)

- [How It Works](#how-it-works)

- [MCP Tools Reference](#mcp-tools-reference)

- [Retrieval Pipeline](#retrieval-pipeline)

- [Python API](#python-api)

- [Multi-Agent Support](#multi-agent-support)

- [Cloud Backends](#cloud-backends)

- [Cloud Deployment](#cloud-deployment)

- [Embedding Providers](#embedding-providers)

- [CLI Reference](#cli-reference)

- [Configuration](#configuration)

- [Testing](#testing)

- [Benchmarks](#benchmarks)

- [Compatibility](#compatibility)

- [Uninstall](#uninstall)

---

## Quick Start

### Step 1: Install

Choose one method. The package name is `memwright` on PyPI.

```bash
# Option A: uv (recommended on macOS)
uv tool install memwright

# Option B: pipx
pipx install memwright

# Option C: pip (in a venv or with --user)
pip install memwright

# Option D: poetry (add to an existing project)
poetry add memwright
```

> **First run downloads ~90MB** for the local embedding model (all-MiniLM-L6-v2). This happens once and is cached.

### Step 2: Connect to Claude Code

```bash
claude mcp add memory -- memwright mcp
```

Restart Claude Code. Approve the MCP server once. Done — Claude now has 8 memory tools.

**Alternative: manual MCP config.** Add to `~/.claude/.mcp.json` (global) or `.mcp.json` (per-project):

```json
{
  "mcpServers": {
    "memory": {
      "command": "memwright",
      "args": ["mcp"]
    }
  }
}
```
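
If a `.mcp.json` already lists other servers, the entry can be merged in rather than pasted by hand. The following is a minimal sketch using only the standard library. `add_memory_server` is a hypothetical helper, not part of Memwright, and it assumes the file (if present) is valid JSON with the `mcpServers` layout shown above.

```python
import json
from pathlib import Path

# The server entry from the JSON block above.
MEMORY_SERVER = {"command": "memwright", "args": ["mcp"]}


def add_memory_server(config_path: Path) -> dict:
    """Merge the "memory" entry into an MCP config file, keeping other servers."""
    config = {}
    if config_path.exists():
        config = json.loads(config_path.read_text(encoding="utf-8"))
    servers = config.setdefault("mcpServers", {})
    servers["memory"] = dict(MEMORY_SERVER)
    config_path.parent.mkdir(parents=True, exist_ok=True)
    config_path.write_text(json.dumps(config, indent=2) + "\n", encoding="utf-8")
    return config
```

Calling `add_memory_server(Path(".mcp.json"))` would update the per-project file; passing `Path.home() / ".claude" / ".mcp.json"` would target the global one.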

### Step 3: Verify

```bash
memwright doctor ~/.memwright
```

All 4 components should report healthy:

```
Overall: ALL HEALTHY
  [OK] SQLiteStore (0.2ms, 0 memories, 4,096 bytes)
  [OK] ChromaStore (0 vectors)
  [OK] NetworkXGraph (0 nodes, 0 edges)
  [OK] Retrieval Pipeline (3/3 layers)
```

Or ask Claude to call `memory_health` from within a session.

### Step 4 (optional): Enable lifecycle hooks

```bash
memwright init ~/.memwright --hooks
```

This auto-configures three Claude Code hooks in `~/.claude/settings.json`:

- **SessionStart** — injects relevant memories into context (20K token budget)

- **PostToolUse** — auto-captures file changes and command outputs

- **Stop** — generates a session summary

---

### Quick test from the CLI

```bash
# Add a memory
memwright add ~/.memwright "Project uses Python 3.12 with FastAPI" \
  --tags "python,fastapi" --category project

# Recall it
memwright recall ~/.memwright "what does the project use?"

# Search by category
memwright search ~/.memwright --category project

# Update a memory (replace <memory-id> with the ID of the memory to change)
memwright update ~/.memwright <memory-id> "Project uses Python 3.13 with FastAPI" \
  --tags "python,fastapi"

# Check stats
memwright stats ~/.memwright
```

> **No API keys required.** Memwright uses a local embedding model — no HuggingFace token, no OpenAI key, no cloud account. The `HF_TOKEN` warning you may see in older versions is harmless noise and has been suppressed.

---

## Architecture


### Component Overview

```
agent_memory/
├── core.py                    # AgentMemory — main orchestrator
├── models.py                  # Memory + RetrievalResult dataclasses
├── context.py                 # AgentContext — multi-agent provenance & RBAC
├── client.py                  # MemoryClient — HTTP client for distributed mode
├── cli.py                     # CLI entry point (19 commands)
├── api.py                     # Starlette ASGI REST API (8 routes)
├── store/
│   ├── base.py                # Abstract interfaces: DocumentStore, VectorStore, GraphStore
│   ├── sqlite_store.py        # SQLite storage (WAL, 17 columns, 8 indexes)
│   ├── chroma_store.py        # ChromaDB vector search (local sentence-transformers)
│   ├── schema.sql             # SQLite schema definition
│   ├── postgres_backend.py    # PostgreSQL (pgvector + Apache AGE)
│   ├── arango_backend.py      # ArangoDB (native doc + vector + graph)
│   ├── aws_backend.py         # AWS (DynamoDB + OpenSearch + Neptune)
│   └── azure_backend.py       # Azure (Cosmos DB DiskANN + NetworkX)
├── graph/
│   ├── networkx_graph.py      # NetworkX MultiDiGraph with PageRank + BFS
│   └── extractor.py           # Entity/relation extraction (50+ known tools)
├── retrieval/
│   ├── orchestrator.py        # 3-layer cascade with RRF fusion
│   ├── tag_matcher.py         # Stop-word filtered tag extraction
│   └── scorer.py              # Temporal, entity, PageRank, MMR, confidence decay
├── temporal/
│   └── manager.py             # Contradiction detection + supersession
├── extraction/
│   └── extractor.py           # Rule-based + LLM memory extraction
├── mcp/
│   └── server.py              # MCP server (8 tools, 2 resources, 2 prompts)
├── hooks/
│   ├── session_start.py       # Context injection (20K token budget)
│   ├── post_tool_use.py       # Auto-capture from Write/Edit/Bash
│   └── stop.py                # Session summary generation
├── utils/
│   └── config.py              # MemoryConfig dataclass + load/save
└── infra/                     # Terraform + Docker for cloud deployments
    ├── apprunner/             # AWS App Runner
    ├── cloudrun/              # GCP Cloud Run
    └── containerapp/          # Azure Container Apps
```

### Three Storage Roles

Every backend implements one or more of these roles:

| Role | Purpose | Local Default | Cloud Options |
|------|---------|--------------|---------------|
| **Document** | Core storage, CRUD, filtering | SQLite | PostgreSQL, ArangoDB, DynamoDB, Cosmos DB |
| **Vector** | Semantic similarity search | ChromaDB | pgvector, ArangoDB, OpenSearch, Cosmos DiskANN |
| **Graph** | Entity relationships, BFS traversal | NetworkX | Apache AGE, ArangoDB, Neptune |

Cloud backends fill all 3 roles in a single service. If any optional component fails, the system degrades gracefully to document-only.

---

## How It Works

### Memory lives outside the context window

This is the key difference. Flat-file memory loads everything into context on every message. Memwright stores memories in a separate process (SQLite + ChromaDB + NetworkX on disk). The context window never sees them until the agent explicitly asks.

```
Flat-file memory:                    Memwright:
┌──────────────────────────┐        ┌──────────────────────────┐
│  Context Window          │        │  Context Window          │
│                          │        │                          │
│  System prompt           │        │  System prompt           │
│  MEMORY.md ← ALL of it  │        │  User message            │
│  grows forever           │        │  memory_recall → 2K max  │
│  User message            │        │                          │
└──────────────────────────┘        └──────────────────────────┘
                                    ┌──────────────────────────┐
                                    │  Memwright (on disk)     │
                                    │  10,000+ memories        │
                                    │  ← never in context     │
                                    └──────────────────────────┘
```

### Token cost stays flat as memory grows

```
Flat-file approach:
  Month 1:   2K tokens loaded every message
  Month 6:  15K tokens loaded every message  ← context crowded

Memwright approach:
  Month 1:   2K tokens max when recalled (ranking from 100 memories)
  Month 6:   2K tokens max when recalled (ranking from 5,000 memories)
                                             ← same cost, better results
```

More stored memories make retrieval *better*, because the ranker has more candidates to choose from, while context cost stays constant.

### How a recall works

When an agent calls `memory_recall("deployment setup", budget=2000)`:

```
Store: 5,000 memories

  Tag search finds:     15 memories tagged "deployment"
  Graph search finds:    8 memories linked to "AWS", "Docker" entities
  Vector search finds:  20 semantically similar memories

  After dedup + RRF fusion:  30 unique candidates, scored and ranked

  Budget fitting (2,000 tokens):
    Memory A (score 0.95):  500 tokens → in   (total: 500)
    Memory B (score 0.90):  600 tokens → in   (total: 1,100)
    Memory C (score 0.88):  400 tokens → in   (total: 1,500)
    Memory D (score 0.85):  300 tokens → in   (total: 1,800)
    Memory E (score 0.80):  400 tokens → SKIP (exceeds 2,000)

  Result: 4 memories, 1,800 tokens. 4,996 memories never entered context.
```
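
The budget-fitting step above can be sketched as a greedy loop. This is an illustration rather than Memwright's implementation: `fit_to_budget` is a hypothetical name, and it assumes an over-budget candidate is skipped instead of ending the loop, so a smaller, lower-ranked memory could still be admitted.

```python
from typing import List, Tuple

Candidate = Tuple[str, float, int]  # (memory id, score, token count)


def fit_to_budget(candidates: List[Candidate], budget: int) -> Tuple[List[str], int]:
    """Greedily keep the highest-scoring candidates that still fit the budget."""
    selected: List[str] = []
    used = 0
    for memory_id, _score, tokens in sorted(candidates, key=lambda c: c[1], reverse=True):
        if used + tokens > budget:
            continue  # assumption: skip and keep looking for a smaller memory
        selected.append(memory_id)
        used += tokens
    return selected, used


# The five candidates from the walkthrough above.
candidates = [
    ("A", 0.95, 500),
    ("B", 0.90, 600),
    ("C", 0.88, 400),
    ("D", 0.85, 300),
    ("E", 0.80, 400),
]
selected, used = fit_to_budget(candidates, budget=2000)
print(selected, used)  # ['A', 'B', 'C', 'D'] 1800
```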

---

## MCP Tools Reference

Once the MCP server is running, agents have these tools:

| Tool | Purpose | Key Parameters |
|------|---------|----------------|
| `memory_add` | Store a fact | `content`, `tags[]`, `category`, `entity`, `namespace`, `event_date`, `confidence` |
| `memory_recall` | Smart multi-layer retrieval | `query`, `budget` (default: 16000), `namespace` |
| `memory_search` | Filter with date ranges | `query`, `category`, `entity`, `namespace`, `status`, `after`, `before`, `limit` |
| `memory_get` | Fetch by ID | `memory_id` |
| `memory_forget` | Archive (soft delete) | `memory_id` |
| `memory_timeline` | Chronological entity history | `entity`, `namespace` |
| `memory_stats` | Store size, counts | — |
| `memory_health` | Health check (call first!) | — |

### Categories

`core_belief` · `preference` · `career` · `project` · `technical` · `personal` · `location` · `relationship` · `event` · `session` · `general`

### MCP Resources

- **`memwright://entity/{name}`** — Entity details + related entities from graph

- **`memwright://memory/{id}`** — Full memory object

### MCP Prompts

- **`recall`** — Search memories for relevant context

- **`timeline`** — Chronological history of an entity

---

## Retrieval Pipeline

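The retrieval system uses a 3-layer cascade (tags, entity graph, vector similarity) with multi-signal fusion. The diagram numbers five steps because the graph layer appears three times: expansion (Layer 0), entity-field search (Layer 2), and relation triples (Layer 4):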

```
Query: "deployment setup"
  │
  ├─ Layer 0: Graph Expansion
  │  Extract entities from query → BFS traversal (depth=2)
  │  "deployment" → finds "AWS", "Docker", "Terraform" connections
  │
  ├─ Layer 1: Tag Match (SQLite)
  │  extract_tags(query) → tag_search() → score 1.0
  │
  ├─ Layer 2: Entity-Field Search
  │  Memories about graph-connected entities → score 0.5
  │
  ├─ Layer 3: Vector Search (ChromaDB)
  │  Semantic similarity → score = 1 - cosine_distance
  │
  ├─ Layer 4: Graph Relation Triples
  │  Inject relationship context → score 0.6
  │
  ▼ FUSION
  ├─ Reciprocal Rank Fusion (RRF, k=60)
  │  score = Σ 1/(k + rank_in_source)
  │  OR Graph Blend: 0.7 * norm_vector + 0.3 * norm_pagerank
  │
  ▼ SCORING
  ├─ Temporal Boost: +0.2 * max(0, 1 - age_days/90)
  ├─ Entity Boost:   +0.30 exact match, +0.15 substring
  ├─ PageRank Boost:  +0.3 * entity_pagerank_score
  │
  ▼ DIVERSITY
  ├─ MMR Rerank: λ*relevance - (1-λ)*max_jaccard_similarity (λ=0.7)
  │
  ▼ CONFIDENCE
  ├─ Time Decay:    -0.001 per hour since last access
  ├─ Access Boost:  +0.03 per access_count
  ├─ Clamp:         [0.1, 1.0]
  │
  ▼ BUDGET
  └─ Greedy selection by score until token budget filled
```
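
The FUSION and DIVERSITY formulas in the diagram can be sketched in a few lines of plain Python. This is a hedged illustration of the two formulas, not the code in `retrieval/orchestrator.py` or `retrieval/scorer.py`. The function names are hypothetical, ranks are assumed to start at 1, and the Jaccard similarity is assumed to be computed over sets of words.

```python
from typing import Dict, List, Set


def rrf_fuse(rankings: List[List[str]], k: int = 60) -> Dict[str, float]:
    """Reciprocal Rank Fusion: score(id) = sum over sources of 1 / (k + rank)."""
    scores: Dict[str, float] = {}
    for ranking in rankings:
        for rank, memory_id in enumerate(ranking, start=1):  # assumption: 1-based ranks
            scores[memory_id] = scores.get(memory_id, 0.0) + 1.0 / (k + rank)
    return scores


def jaccard(a: Set[str], b: Set[str]) -> float:
    """Size of the intersection divided by size of the union (0.0 if both empty)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def mmr_rerank(relevance: Dict[str, float], words: Dict[str, Set[str]],
               lam: float = 0.7) -> List[str]:
    """Order ids by lam * relevance - (1 - lam) * max Jaccard similarity to picked ids."""
    remaining = set(relevance)
    picked: List[str] = []
    while remaining:
        def mmr_score(memory_id: str) -> float:
            redundancy = max((jaccard(words[memory_id], words[p]) for p in picked),
                             default=0.0)
            return lam * relevance[memory_id] - (1 - lam) * redundancy

        best = max(sorted(remaining), key=mmr_score)  # sorted() makes ties deterministic
        picked.append(best)
        remaining.remove(best)
    return picked


fused = rrf_fuse([["m1", "m2"], ["m2", "m3"]])
print(max(fused, key=fused.get))  # m2, the only id ranked by both sources
```

In the MMR step a near-duplicate of an already selected memory is pushed down even if its relevance is high, which is the behaviour the `λ=0.7` setting trades against raw relevance.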

Querying "Python" also finds memories about "FastAPI" if they're connected in the entity graph. This gives multi-hop retrieval through relationship traversal.
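
The multi-hop expansion can be pictured as a bounded breadth-first search. The sketch below assumes a plain adjacency dictionary rather than the NetworkX graph the project uses, and the `Uvicorn` and `Docker` edges are made-up sample data. `expand_entities` is a hypothetical name.

```python
from collections import deque
from typing import Dict, List, Set


def expand_entities(graph: Dict[str, List[str]], seeds: List[str],
                    max_depth: int = 2) -> Set[str]:
    """Breadth-first expansion: every entity within max_depth hops of a seed."""
    seen: Set[str] = set(seeds)
    queue = deque((seed, 0) for seed in seeds)
    while queue:
        entity, depth = queue.popleft()
        if depth == max_depth:
            continue  # do not expand beyond the depth limit
        for neighbour in graph.get(entity, []):
            if neighbour not in seen:
                seen.add(neighbour)
                queue.append((neighbour, depth + 1))
    return seen


# Hypothetical entity graph: Python -> FastAPI -> Uvicorn -> Docker
graph = {"Python": ["FastAPI"], "FastAPI": ["Uvicorn"], "Uvicorn": ["Docker"]}
print(sorted(expand_entities(graph, ["Python"])))  # ['FastAPI', 'Python', 'Uvicorn']
```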

---

## Python API

### Basic Usage

```python
from agent_memory import AgentMemory

mem = AgentMemory("./my-agent")  # auto-provisions all backends

# Store
mem.add("User prefers Python over Java",
        tags=["preference", "coding"],
        category="preference",
        entity="Python")

# Recall with token budget
results = mem.recall("what language?", budget=2000)

# Formatted context for prompt injection
context = mem.recall_as_context("user background", budget=4000)

# Search with filters
memories = mem.search(category="project", entity="Python", limit=10)

# Timeline
history = mem.timeline("Python")

# Contradiction handling — automatic
mem.add("User works at Google", tags=["career"], category="career", entity="Google")
mem.add("User works at Meta", tags=["career"], category="career", entity="Meta")
# ^ Google memory auto-superseded

# Namespace isolation
mem.add("Team standup at 9am", namespace="team:alpha")
results = mem.recall("standup time", namespace="team:alpha")

# Maintenance
mem.forget(memory_id)             # Archive
mem.forget_before("2025-01-01")   # Archive old memories
mem.compact()                     # Permanently delete archived
mem.export_json("backup.json")    # Export
mem.import_json("backup.json")    # Import (dedup by content hash)

# Health & stats
mem.health()  # → {sqlite: ok, chroma: ok, networkx: ok, retrieval: ok}
mem.stats()   # → {total: 500, active: 480, ...}

# Context manager
with AgentMemory("./store") as mem:
    mem.add("auto-closed on exit")
```

### Memory Object

```python
from dataclasses import dataclass
from typing import Any, Dict, List


@dataclass
class Memory:
    id: str                    # UUID
    content: str               # The actual fact/observation
    tags: List[str]            # Searchable tags
    category: str              # Classification (preference, career, project, ...)
    entity: str                # Primary entity (company, tool, person)
    namespace: str             # Isolation key (default: "default")
    created_at: str            # ISO timestamp
    event_date: str            # When the fact occurred
    valid_from: str            # Temporal validity start
    valid_until: str           # Set when superseded
    superseded_by: str         # ID of replacement memory
    confidence: float          # 0.0-1.0
    status: str                # active | superseded | archived
    access_count: int          # Times recalled
    last_accessed: str         # Last recall timestamp
    content_hash: str          # SHA-256 for dedup
    metadata: Dict[str, Any]   # Arbitrary JSON
```
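
The three supersession fields (`status`, `valid_until`, `superseded_by`) work together when one memory replaces another. The following is a rough sketch of that bookkeeping only. `MemoryRecord` and `supersede` are hypothetical stand-ins, and how Memwright decides that two memories contradict each other is not covered here.

```python
from dataclasses import dataclass, replace
from datetime import datetime, timezone
from typing import Optional


@dataclass(frozen=True)
class MemoryRecord:
    """Trimmed-down stand-in for Memory with only the fields supersession touches."""
    id: str
    content: str
    status: str = "active"
    valid_until: Optional[str] = None
    superseded_by: Optional[str] = None


def supersede(old: MemoryRecord, new: MemoryRecord) -> MemoryRecord:
    """Return a copy of `old` marked as replaced by `new`."""
    now = datetime.now(timezone.utc).isoformat()
    return replace(old, status="superseded", valid_until=now, superseded_by=new.id)


google = MemoryRecord(id="m1", content="User works at Google")
meta = MemoryRecord(id="m2", content="User works at Meta")
retired = supersede(google, meta)
print(retired.status, retired.superseded_by)  # superseded m2
```

Keeping the old record with a closed validity window, rather than deleting it, is what lets a timeline query still show the earlier fact.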

---

## Multi-Agent Support


For multi-agent pipelines with provenance tracking, RBAC, and governance:

```python
from agent_memory.context import AgentContext, AgentRole, Visibility

# Create a root context
ctx = AgentContext.from_env(
    agent_id="orchestrator",
    namespace="project:acme",
    role=AgentRole.ORCHESTRATOR,
    token_budget=20000,
)

# Spawn child contexts for sub-agents (immutable — returns new instance)
planner = ctx.as_agent("planner", role=AgentRole.PLANNER, token_budget=5000)
researcher = ctx.as_agent("researcher", role=AgentRole.RESEARCHER, read_only=True)

# Provenance tracking — metadata auto-enriched
planner.add_memory("Architecture decision: use event sourcing",
                   category="technical", visibility=Visibility.TEAM)
# metadata includes: _agent_id, _session_id, _namespace, _visibility, _role

# Recall is scoped to namespace + cached within session
results = researcher.recall("architecture decisions")

# Token budget tracked
print(researcher.token_budget - researcher.token_budget_used)

# Governance
researcher.flag_for_review("Need human approval for deployment plan")
researcher.add_compliance_tag("SOC2")

# Session introspection
summary = ctx.session_summary()
# → {agent_trail, memories_written, memories_recalled, token_usage, review_flags}
```
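
To make the "immutable, returns new instance" behaviour concrete, here is a minimal sketch built on a frozen dataclass. `SketchContext` is a hypothetical stand-in, not `AgentContext`. Its field names, the exception types, and the way writes are counted are all assumptions made for illustration, with the write limit set to 100 to match the documented `max_writes_per_agent` default.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class SketchContext:
    """Hypothetical stand-in for AgentContext; field names are illustrative."""
    agent_id: str
    namespace: str
    read_only: bool = False
    max_writes: int = 100  # mirrors the documented max_writes_per_agent default
    writes: int = 0

    def as_agent(self, agent_id: str, read_only: bool = False) -> "SketchContext":
        # Frozen dataclass: replace() builds a new instance, the parent is untouched.
        return replace(self, agent_id=agent_id, read_only=read_only, writes=0)

    def record_write(self) -> "SketchContext":
        if self.read_only:
            raise PermissionError(f"{self.agent_id} is read-only")
        if self.writes >= self.max_writes:
            raise RuntimeError(f"{self.agent_id} exceeded its write quota")
        return replace(self, writes=self.writes + 1)


root = SketchContext(agent_id="orchestrator", namespace="project:acme")
researcher = root.as_agent("researcher", read_only=True)
print(root.agent_id, researcher.agent_id, researcher.namespace)
# orchestrator researcher project:acme
```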

### AgentContext Features

| Feature | Description |
|---------|-------------|
| **Namespace isolation** | Each agent/project gets isolated memory partition |
| **RBAC roles** | ORCHESTRATOR, PLANNER, EXECUTOR, RESEARCHER, REVIEWER, MONITOR |
| **Read-only mode** | Agents can recall but not write |
| **Write quotas** | `max_writes_per_agent` (default: 100) |
| **Token budgets** | Per-agent budget tracking |
| **Recall cache** | Dedup redundant queries within a session |
| **Scratchpad** | Inter-agent data passing |
| **Provenance** | Agent trail, parent tracking, visibility levels |
| **Compliance** | Review flags, compliance tags for audit |
| **Distributed mode** | Set `memory_url` to use HTTP client instead of local |

---

## Cloud Backends

Each cloud backend fills all three roles (document, vector, graph) in a single service:

### PostgreSQL (Neon, Cloud SQL, self-hosted)

Uses pgvector for vectors, Apache AGE for graph. AGE is optional — without it, graph gracefully degrades.

```python
mem = AgentMemory("./store", config={
    "backends": ["postgres"],
    "postgres": {"url": "postgresql://user:pass@host:5432/memwright"}
})
```

### ArangoDB (ArangoGraph Cloud, Docker)

Native document, vector, and graph support in one database.

```python
mem = AgentMemory("./store", config={
    "backends": ["arangodb"],
    "arangodb": {"url": "https://instance.arangodb.cloud:8529", "database": "memwright"}
})
```

### Azure (Cosmos DB)

Cosmos DB with DiskANN vector indexing. Graph via NetworkX persisted to Cosmos containers.

```python
mem = AgentMemory("./store", config={
    "backends": ["azure"],
    "azure": {"cosmos_endpoint": "https://account.documents.azure.com:443/"}
})
```

### GCP (AlloyDB)

Extends PostgreSQL backend with AlloyDB Connector (IAM auth) and Vertex AI embeddings (768D).

```python
mem = AgentMemory("./store", config={
    "backends": ["gcp"],
    "gcp": {"project_id": "my-project", "cluster": "memwright", "instance": "primary"}
})
```

### Installing cloud extras

```bash
poetry add "memwright[postgres]"    # PostgreSQL
poetry add "memwright[arangodb]"    # ArangoDB
poetry add "memwright[aws]"         # AWS (DynamoDB + OpenSearch + Neptune)
poetry add "memwright[azure]"       # Azure Cosmos DB
poetry add "memwright[gcp]"         # GCP AlloyDB + Vertex AI
poetry add "memwright[all]"         # Everything
```

---

## Cloud Deployment

Deploy Memwright as an HTTP API on any cloud with a single command:

```bash
./scripts/deploy.sh aws        # App Runner (2 CPU / 4GB, auto-scale)
./scripts/deploy.sh gcp        # Cloud Run (auto-scale 0–3, 2 CPU / 4GB)
./scripts/deploy.sh azure      # Container Apps (scale-to-zero, 2 CPU / 4GB)
./scripts/deploy.sh aws --teardown   # Destroy everything
```

**Prerequisites**: Docker, Terraform, cloud CLI (`aws`/`gcloud`/`az`), backend credentials in `.env`.

| Cloud | Infrastructure | Terraform |
|-------|---------------|-----------|
| AWS | ECR + App Runner (2 CPU, 4GB) | `agent_memory/infra/apprunner/main.tf` |
| GCP | Artifact Registry + Cloud Run (2 CPU, 4GB) | `agent_memory/infra/cloudrun/main.tf` |
| Azure | ACR + Log Analytics + Container Apps (2 CPU, 4GB) | `agent_memory/infra/containerapp/main.tf` |

### REST API Endpoints

All deployments expose the same Starlette ASGI API:

| Method | Endpoint | Description |
|--------|----------|-------------|
| `GET` | `/health` | Component health check |
| `GET` | `/stats` | Store statistics |
| `POST` | `/add` | Add a memory |
| `POST` | `/recall` | Smart retrieval with budget |
| `POST` | `/search` | Filtered search |
| `POST` | `/timeline` | Entity chronological history |
| `POST` | `/forget` | Archive a memory |
| `GET` | `/memory/{id}` | Get memory by ID |

Response envelope: `{"ok": true, "data": {...}}` or `{"ok": false, "error": "message"}`
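
A client can treat the envelope uniformly across all eight routes. The sketch below only parses a response body that has already been received. `unwrap` and `MemwrightAPIError` are hypothetical names, and the project's own `MemoryClient` may handle errors differently.

```python
import json
from typing import Any


class MemwrightAPIError(RuntimeError):
    """Raised when the envelope reports ok=false (hypothetical name)."""


def unwrap(body: str) -> Any:
    """Return the `data` payload of a response envelope, or raise on failure."""
    envelope = json.loads(body)
    if envelope.get("ok"):
        return envelope.get("data")
    raise MemwrightAPIError(envelope.get("error", "unknown error"))


print(unwrap('{"ok": true, "data": {"total": 3}}'))  # {'total': 3}
```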

---

## Embedding Providers

Memwright auto-detects the best available embedding provider:

| Priority | Provider | Model | Dimensions | Trigger |
|----------|----------|-------|------------|---------|
| 1 | Cloud-native | Bedrock Titan / Azure OpenAI / Vertex AI | 768-1536 | Cloud backend configured |
| 2 | OpenAI / OpenRouter | text-embedding-3-small | 1536 | `OPENAI_API_KEY` or `OPENROUTER_API_KEY` set |
| 3 | Local (default) | all-MiniLM-L6-v2 | 384 | Always available, no API key |

The local fallback downloads ~90MB on first use. All providers implement the same interface — switching is transparent.
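
The priority order in the table amounts to a short chain of checks. This sketch mirrors the three triggers only. `pick_embedding_provider` and its return labels are hypothetical, and the real detection logic may inspect more than is shown here.

```python
from typing import Mapping


def pick_embedding_provider(env: Mapping[str, str],
                            cloud_backend_configured: bool = False) -> str:
    """Return a label for the highest-priority provider whose trigger is met."""
    if cloud_backend_configured:
        return "cloud-native"  # priority 1: a cloud backend is configured
    if env.get("OPENAI_API_KEY") or env.get("OPENROUTER_API_KEY"):
        return "openai"        # priority 2: an API key is present
    return "local"             # priority 3: all-MiniLM-L6-v2, always available


print(pick_embedding_provider({}))                           # local
print(pick_embedding_provider({"OPENROUTER_API_KEY": "k"}))  # openai
```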

---

## CLI Reference

Both `memwright` and `agent-memory` work as entry points:

### MCP Server

```bash
memwright mcp                          # Start MCP server (uses ~/.memwright)
memwright mcp --path /custom/path      # Custom store location
```

### Memory Operations

```bash
agent-memory add ./store "User prefers Python" --tags "pref,coding" --category preference --namespace default
agent-memory recall ./store "what language?" --budget 4000 --namespace default
agent-memory search ./store --category project --entity Python --namespace default --limit 20
agent-memory list ./store --status active --category technical --namespace default
agent-memory timeline ./store --entity Python --namespace default
agent-memory update ./store <memory-id> "Updated content" --tags "new,tags" --category technical
agent-memory forget ./store <memory-id>
```

### Maintenance

```bash
agent-memory doctor ~/.memwright       # Health check (SQLite, ChromaDB, NetworkX, Retrieval)
agent-memory stats ./store             # Memory counts, DB size, breakdowns
agent-memory export ./store -o backup.json
agent-memory import ./store backup.json
agent-memory compact ./store           # Permanently delete archived memories
agent-memory inspect ./store           # Raw DB inspection
```

### Lifecycle Hooks (Claude Code)

```bash
memwright hook session-start           # Inject context at session start
memwright hook post-tool-use           # Auto-capture tool observations
memwright hook stop                    # Generate session summary
```

### Benchmarks

```bash
agent-memory locomo --max-conversations 5 --verbose
agent-memory mab --categories AR,CR --max-examples 10
```

---

## Configuration

### Store location

Default: `~/.memwright/`. Configurable with `--path` on any CLI command.

```
~/.memwright/
├── memory.db        # SQLite database (core storage)
├── config.json      # Retrieval tuning parameters
├── graph.json       # NetworkX entity graph
└── chroma/          # ChromaDB vector store + embeddings
```

### config.json

All fields optional. Defaults apply if the file doesn't exist:

```json
{
  "default_token_budget": 16000,
  "min_results": 3,
  "backends": ["sqlite", "chroma", "networkx"],
  "enable_mmr": true,
  "mmr_lambda": 0.7,
  "fusion_mode": "rrf",
  "confidence_gate": 0.0,
  "confidence_decay_rate": 0.001,
  "confidence_boost_rate": 0.03
}
```
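
Three of the fields above (`confidence_decay_rate`, `confidence_boost_rate`, `confidence_gate`) interact, so a small worked sketch may help when tuning them. It assumes the decay and boost are applied additively to a memory's confidence and then clamped to [0.1, 1.0], as the retrieval pipeline diagram suggests. The function names are hypothetical and the actual scorer may combine these terms differently.

```python
def adjusted_confidence(confidence: float, hours_since_access: float, access_count: int,
                        decay_rate: float = 0.001, boost_rate: float = 0.03) -> float:
    """Apply time decay and access boost, then clamp to the range [0.1, 1.0]."""
    raw = confidence - decay_rate * hours_since_access + boost_rate * access_count
    return min(1.0, max(0.1, raw))


def passes_gate(confidence: float, gate: float = 0.0) -> bool:
    """A memory is kept only if its confidence reaches the configured gate."""
    return confidence >= gate


# 0.8 starting confidence, untouched for 100 hours, recalled twice:
# 0.8 - 0.001 * 100 + 0.03 * 2 = 0.76
value = adjusted_confidence(0.8, hours_since_access=100, access_count=2)
print(round(value, 2), passes_gate(value, gate=0.5))  # 0.76 True
```

With the default gate of 0.0 and a clamp floor of 0.1, no memory is filtered out by confidence alone; under these assumptions, raising the gate above 0.1 is what would make decay start excluding stale memories.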

| Parameter | Default | Description |
|-----------|---------|-------------|
| `default_token_budget` | 16000 | Max tokens returned per recall (start high, lower to tune) |
| `min_results` | 3 | Minimum results to return |
| `enable_mmr` | true | Maximal Marginal Relevance diversity reranking |
| `mmr_lambda` | 0.7 | Relevance vs diversity balance (0=diverse, 1=relevant) |
| `fusion_mode` | "rrf" | "rrf" (parameter-free) or "graph_blend" (weighted) |
| `confidence_decay_rate` | 0.001 | Score penalty per hour since last access |
| `confidence_boost_rate` | 0.03 | Score boost per access count |
| `confidence_gate` | 0.0 | Minimum confidence threshold to include in results |

### Environment Variables

| Variable | Purpose |
|----------|---------|
| `MEMWRIGHT_PATH` | Default store path |
| `MEMWRIGHT_URL` | Remote API URL (distributed mode) |
| `MEMWRIGHT_NAMESPACE` | Default namespace |
| `MEMWRIGHT_TOKEN_BUDGET` | Default token budget |
| `MEMWRIGHT_SESSION_ID` | Session ID for provenance tracking |

---

## Testing

### Running Tests

```bash
# All unit tests — no Docker, no API keys
poetry run pytest tests/ -v

# With coverage
poetry run pytest tests/ -v --cov=agent_memory --cov-report=term-missing

# Live integration tests (need credentials)
NEON_DATABASE_URL='postgresql://...' poetry run pytest tests/test_postgres_live.py -v
AZURE_COSMOS_ENDPOINT='https://...' poetry run pytest tests/test_azure_live.py -v
```

### Test Coverage

- **607 unit tests** covering all backends, retrieval, config, embeddings, and CLI

- **14 live integration tests** per cloud backend (Neon, Azure, ArangoDB)

- **Mock tests** for every cloud backend — no cloud account needed

- All unit tests run without Docker or API keys

---

## Benchmarks

### Latency (P50 recall — the core operation)

| Backend | Stack | P50 | P95 | P99 |
|---|---|---|---|---|
| **PG + pgvector + AGE (Docker)** | PostgreSQL 16 + pgvector + Apache AGE | **1.4ms** | **5.5ms** | **39ms** |
| SQLite + ChromaDB + NetworkX (local) | SQLite 3 + ChromaDB 1.x + NetworkX 3 | 9.1ms | 31ms | 75ms |
| ArangoDB (Docker) | ArangoDB 3.12 (doc + vector + graph) | 40ms | 57ms | 68ms |
| GCP Cloud Run (us-central1) | Starlette + Uvicorn → ArangoDB Oasis | 156ms | 245ms | 271ms |
| Azure Container Apps (eastus) | Starlette + Uvicorn → ArangoDB Oasis | 293ms | 466ms | 480ms |
| AWS App Runner (us-west-2) | Starlette + Uvicorn → ArangoDB Oasis | 621ms | 792ms | 813ms |

### vs. Competitors (recall P50)

| System | Stack | P50 | Notes |
|---|---|---|---|
| **Memwright (PG Docker)** | PG 16 + pgvector + AGE | **1.4ms** | Full 3-layer pipeline, 81.2% LOCOMO |
| Ruflo | In-process HNSW | 2-3ms | Vector lookup only, not full retrieval |
| **Memwright (local)** | SQLite + ChromaDB + NX | **9.1ms** | Zero-config, no Docker, no API keys |
| **Memwright (GCP Cloud Run)** | Starlette → ArangoDB Oasis | **156ms** | Full cloud API, scale-to-zero |
| Mem0 | Cloud + LLM judge | 200ms | LLM in retrieval path |
| Zep | Neo4j + embeddings | <200ms | P95 ~632ms under concurrency |
| Mem0 Graph | Cloud + LLM + graph | 660ms | Graph variant, much slower |

Full results with add/search latency: [docs/LATENCY_BENCHMARKS.md](docs/LATENCY_BENCHMARKS.md)

### LOCOMO (Long Conversation Memory)

| System | Accuracy |
|--------|----------|
| MemMachine | 84.9% |
| **Memwright** | **81.2%** |
| Zep | ~75% |
| Letta | 74.0% |
| Mem0 (Graph) | 66.9% |
| OpenAI Memory | 52.9% |

*Scores are self-reported across vendors. [Methodology is disputed](https://blog.getzep.com/lies-damn-lies-statistics-is-mem0-really-sota-in-agent-memory/).*

Retrieval is fully local — tag matching, graph traversal, vector search with RRF fusion. No LLM re-ranking. Only benchmark answer synthesis uses an LLM.

---

## Compatibility

### MCP Clients

| Client | Config File |
|--------|-------------|
| Claude Code | `.mcp.json` (project) or `~/.claude/.mcp.json` (global) |
| Cursor | `.cursor/mcp.json` |
| Windsurf | MCP config in settings |
| Any MCP client | Standard MCP stdio transport |

Same `memwright mcp` command. Same zero-config setup.

### Python

- Python 3.10, 3.11, 3.12, 3.13, 3.14

---

## Uninstall

### 1. Remove MCP server config

Delete the `memory` entry from `~/.claude/.mcp.json` (global) or `.mcp.json` (per-project).

### 2. Uninstall the package

```bash
# Match your install method:
uv tool uninstall memwright    # if installed with uv
pipx uninstall memwright       # if installed with pipx
pip uninstall memwright        # if installed with pip
poetry remove memwright        # if installed with poetry
```

### 3. Delete stored memories (optional)

```bash
# Export first if you want a backup
agent-memory export ~/.memwright -o memwright-backup.json

# Then delete
rm -rf ~/.memwright
```

---

## License

Apache 2.0

---

_{mcp-name: io.github.bolnet/memwright}