https://github.com/bolnet/agent-memory
Embedded memory for AI agents. SQLite + pgvector + Neo4j. Sub-5ms retrieval.
agents ai claude-code llm mcp memory neo4j pgvector sqlite
- Host: GitHub
- URL: https://github.com/bolnet/agent-memory
- Owner: bolnet
- License: apache-2.0
- Created: 2026-03-07T19:55:30.000Z (3 months ago)
- Default Branch: main
- Last Pushed: 2026-03-25T20:39:03.000Z (about 2 months ago)
- Last Synced: 2026-03-26T03:03:45.852Z (about 2 months ago)
- Topics: agents, ai, claude-code, llm, mcp, memory, neo4j, pgvector, sqlite
- Language: Python
- Homepage: https://bolnet.github.io/agent-memory/
- Size: 2.53 MB
- Stars: 2
- Watchers: 0
- Forks: 1
- Open Issues: 0
Metadata Files:
- Readme: README.md
- License: LICENSE
README
Zero-config memory for AI agents. No Docker. No API keys. Just install and go.
---
## The Problem
AI agents forget everything between sessions. Every new conversation starts from zero — no memory of what you built yesterday, what decisions you made, or what your project even does.
Built-in memory solutions (like Claude Code's `MEMORY.md`) store flat files that are loaded in full into the context window on every message. There is no search, no ranking, and no contradiction handling. As your project grows, those files become a wall of text that burns tokens without helping.
## What Memwright Does
Memwright gives AI agents persistent, searchable memory that stays out of the context window until needed:
- **Ranked retrieval** — 3-layer search (tags + entity graph + vector similarity) returns only the most relevant memories
- **Token budgets** — Set a ceiling (e.g. 2,000 tokens). Memwright fits the best memories within that budget
- **Contradiction handling** — a newer fact such as "User works at Meta" automatically supersedes the older, conflicting "User works at Google"
- **Namespace isolation** — Multi-agent systems get isolated memory partitions per agent, user, or project
- **Zero config** — `poetry add memwright`, add one JSON block, done
---
## Table of Contents
- [Quick Start](#quick-start)
- [Architecture](#architecture)
- [How It Works](#how-it-works)
- [MCP Tools Reference](#mcp-tools-reference)
- [Retrieval Pipeline](#retrieval-pipeline)
- [Python API](#python-api)
- [Multi-Agent Support](#multi-agent-support)
- [Cloud Backends](#cloud-backends)
- [Cloud Deployment](#cloud-deployment)
- [Embedding Providers](#embedding-providers)
- [CLI Reference](#cli-reference)
- [Configuration](#configuration)
- [Testing](#testing)
- [Benchmarks](#benchmarks)
- [Compatibility](#compatibility)
- [Uninstall](#uninstall)
---
## Quick Start
### Step 1: Install
Choose one method. The package name is `memwright` on PyPI.
```bash
# Option A: uv (recommended on macOS)
uv tool install memwright
# Option B: pipx
pipx install memwright
# Option C: pip (in a venv or with --user)
pip install memwright
# Option D: poetry (add to an existing project)
poetry add memwright
```
> **First run downloads ~90MB** for the local embedding model (all-MiniLM-L6-v2). This happens once and is cached.
### Step 2: Connect to Claude Code
```bash
claude mcp add memory -- memwright mcp
```
Restart Claude Code. Approve the MCP server once. Done — Claude now has 8 memory tools.
**Alternative: manual MCP config.** Add to `~/.claude/.mcp.json` (global) or `.mcp.json` (per-project):
```json
{
  "mcpServers": {
    "memory": {
      "command": "memwright",
      "args": ["mcp"]
    }
  }
}
```
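If a `.mcp.json` file already exists, the entry has to be merged rather than pasted over the file. The following is a minimal sketch of that merge using only the standard library; the helper name `add_memory_server` is hypothetical and not part of Memwright, and `claude mcp add` remains the simpler route.

```python
import json
from pathlib import Path


def add_memory_server(config_path: Path) -> dict:
    """Merge the `memory` MCP server entry into a config file, keeping other servers."""
    config = {}
    if config_path.exists():
        # Treat an empty file as an empty config.
        config = json.loads(config_path.read_text(encoding="utf-8") or "{}")

    servers = config.setdefault("mcpServers", {})
    servers["memory"] = {"command": "memwright", "args": ["mcp"]}

    config_path.parent.mkdir(parents=True, exist_ok=True)
    config_path.write_text(json.dumps(config, indent=2) + "\n", encoding="utf-8")
    return config


if __name__ == "__main__":
    # Per-project config in the current directory.
    print(add_memory_server(Path(".mcp.json")))
```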
### Step 3: Verify
```bash
memwright doctor ~/.memwright
```
All 4 components should report healthy:
```
Overall: ALL HEALTHY
[OK] SQLiteStore (0.2ms, 0 memories, 4,096 bytes)
[OK] ChromaStore (0 vectors)
[OK] NetworkXGraph (0 nodes, 0 edges)
[OK] Retrieval Pipeline (3/3 layers)
```
Or ask Claude to call `memory_health` from within a session.
### Step 4 (optional): Enable lifecycle hooks
```bash
memwright init ~/.memwright --hooks
```
This auto-configures three Claude Code hooks in `~/.claude/settings.json`:
- **SessionStart** — injects relevant memories into context (20K token budget)
- **PostToolUse** — auto-captures file changes and command outputs
- **Stop** — generates a session summary
---
### Quick test from the CLI
```bash
# Add a memory
memwright add ~/.memwright "Project uses Python 3.12 with FastAPI" \
--tags "python,fastapi" --category project
# Recall it
memwright recall ~/.memwright "what does the project use?"
# Search by category
memwright search ~/.memwright --category project
# Update a memory
memwright update ~/.memwright "Project uses Python 3.13 with FastAPI" \
--tags "python,fastapi"
# Check stats
memwright stats ~/.memwright
```
> **No API keys required.** Memwright uses a local embedding model — no HuggingFace token, no OpenAI key, no cloud account. The `HF_TOKEN` warning you may see in older versions is harmless noise and has been suppressed.
---
## Architecture
### Component Overview
```
agent_memory/
├── core.py # AgentMemory — main orchestrator
├── models.py # Memory + RetrievalResult dataclasses
├── context.py # AgentContext — multi-agent provenance & RBAC
├── client.py # MemoryClient — HTTP client for distributed mode
├── cli.py # CLI entry point (19 commands)
├── api.py # Starlette ASGI REST API (8 routes)
├── store/
│ ├── base.py # Abstract interfaces: DocumentStore, VectorStore, GraphStore
│ ├── sqlite_store.py # SQLite storage (WAL, 17 columns, 8 indexes)
│ ├── chroma_store.py # ChromaDB vector search (local sentence-transformers)
│ ├── schema.sql # SQLite schema definition
│ ├── postgres_backend.py # PostgreSQL (pgvector + Apache AGE)
│ ├── arango_backend.py # ArangoDB (native doc + vector + graph)
│ ├── aws_backend.py # AWS (DynamoDB + OpenSearch + Neptune)
│ └── azure_backend.py # Azure (Cosmos DB DiskANN + NetworkX)
├── graph/
│ ├── networkx_graph.py # NetworkX MultiDiGraph with PageRank + BFS
│ └── extractor.py # Entity/relation extraction (50+ known tools)
├── retrieval/
│ ├── orchestrator.py # 3-layer cascade with RRF fusion
│ ├── tag_matcher.py # Stop-word filtered tag extraction
│ └── scorer.py # Temporal, entity, PageRank, MMR, confidence decay
├── temporal/
│ └── manager.py # Contradiction detection + supersession
├── extraction/
│ └── extractor.py # Rule-based + LLM memory extraction
├── mcp/
│ └── server.py # MCP server (8 tools, 2 resources, 2 prompts)
├── hooks/
│ ├── session_start.py # Context injection (20K token budget)
│ ├── post_tool_use.py # Auto-capture from Write/Edit/Bash
│ └── stop.py # Session summary generation
├── utils/
│ └── config.py # MemoryConfig dataclass + load/save
└── infra/ # Terraform + Docker for cloud deployments
    ├── apprunner/ # AWS App Runner
    ├── cloudrun/ # GCP Cloud Run
    └── containerapp/ # Azure Container Apps
```
### Three Storage Roles
Every backend implements one or more of these roles:
| Role | Purpose | Local Default | Cloud Options |
|------|---------|--------------|---------------|
| **Document** | Core storage, CRUD, filtering | SQLite | PostgreSQL, ArangoDB, DynamoDB, Cosmos DB |
| **Vector** | Semantic similarity search | ChromaDB | pgvector, ArangoDB, OpenSearch, Cosmos DiskANN |
| **Graph** | Entity relationships, BFS traversal | NetworkX | Apache AGE, ArangoDB, Neptune |
Cloud backends fill all 3 roles in a single service. If any optional component fails, the system degrades gracefully to document-only.
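The degradation behaviour can be pictured roughly as follows. This is an illustrative sketch, not Memwright's code: `recall_with_degradation` and the callables passed to it are hypothetical stand-ins for the document, vector, and graph roles.

```python
from typing import Callable, Dict, Iterable, List, Tuple

# A search function takes a query and yields memory IDs (hypothetical signature).
SearchFn = Callable[[str], Iterable[str]]


def recall_with_degradation(
    query: str,
    document_search: SearchFn,
    optional_layers: Dict[str, SearchFn],
) -> Tuple[List[str], List[str]]:
    """Always query the document store; tolerate failures in optional layers."""
    results = list(document_search(query))
    degraded: List[str] = []
    for name, search in optional_layers.items():
        try:
            for memory_id in search(query):
                if memory_id not in results:
                    results.append(memory_id)
        except Exception:
            # The layer is unavailable: record it and continue document-only.
            degraded.append(name)
    return results, degraded


def broken_vector_search(query: str) -> Iterable[str]:
    raise ConnectionError("vector store unavailable")


ids, degraded = recall_with_degradation(
    "deployment setup",
    document_search=lambda q: ["m1"],
    optional_layers={"vector": broken_vector_search, "graph": lambda q: ["m1", "m2"]},
)
print(ids, degraded)
```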
---
## How It Works
### Memory lives outside the context window
This is the key difference. Flat-file memory loads everything into context on every message. Memwright stores memories in a separate process (SQLite + ChromaDB + NetworkX on disk), so the context window never sees them until the agent explicitly asks.
```
Flat-file memory:                   Memwright:
┌──────────────────────────┐        ┌──────────────────────────┐
│ Context Window           │        │ Context Window           │
│                          │        │                          │
│ System prompt            │        │ System prompt            │
│ MEMORY.md ← ALL of it    │        │ User message             │
│   grows forever          │        │ memory_recall → 2K max   │
│ User message             │        │                          │
└──────────────────────────┘        └──────────────────────────┘
                                    ┌──────────────────────────┐
                                    │ Memwright (on disk)      │
                                    │ 10,000+ memories         │
                                    │ ← never in context       │
                                    └──────────────────────────┘
```
### Token cost stays flat as memory grows
```
Flat-file approach:
Month 1: 2K tokens loaded every message
Month 6: 15K tokens loaded every message ← context crowded
Memwright approach:
Month 1: 2K tokens max when recalled (ranking from 100 memories)
Month 6: 2K tokens max when recalled (ranking from 5,000 memories)
← same cost, better results
```
More stored memories make retrieval *better* (there are more candidates to rank) while context cost stays constant.
### How a recall works
When an agent calls `memory_recall("deployment setup", budget=2000)`:
```
Store: 5,000 memories
Tag search finds: 15 memories tagged "deployment"
Graph search finds: 8 memories linked to "AWS", "Docker" entities
Vector search finds: 20 semantically similar memories
After dedup + RRF fusion: 30 unique candidates, scored and ranked
Budget fitting (2,000 tokens):
Memory A (score 0.95): 500 tokens → in (total: 500)
Memory B (score 0.90): 600 tokens → in (total: 1,100)
Memory C (score 0.88): 400 tokens → in (total: 1,500)
Memory D (score 0.85): 300 tokens → in (total: 1,800)
Memory E (score 0.80): 400 tokens → SKIP (exceeds 2,000)
Result: 4 memories, 1,800 tokens. 4,996 memories never entered context.
```
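The budget-fitting step in the walkthrough above can be sketched as a greedy pass over the ranked candidates. This is a minimal illustration rather than Memwright's implementation: the function name `fit_to_budget` is hypothetical, and skipping an oversized memory and continuing (instead of stopping) is an assumption based on the "SKIP" wording.

```python
from typing import List, Tuple

# (memory name, score, token count), matching the walkthrough above
Candidate = Tuple[str, float, int]


def fit_to_budget(candidates: List[Candidate], budget: int) -> Tuple[List[str], int]:
    """Greedily take the highest-scoring memories that still fit the token budget."""
    selected: List[str] = []
    used = 0
    for name, _score, tokens in sorted(candidates, key=lambda c: c[1], reverse=True):
        if used + tokens > budget:
            continue  # would exceed the budget, so skip this memory
        selected.append(name)
        used += tokens
    return selected, used


candidates = [
    ("A", 0.95, 500),
    ("B", 0.90, 600),
    ("C", 0.88, 400),
    ("D", 0.85, 300),
    ("E", 0.80, 400),
]
selected, used = fit_to_budget(candidates, budget=2000)
print(selected, used)  # ['A', 'B', 'C', 'D'] 1800
```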
---
## MCP Tools Reference
Once the MCP server is running, agents have these tools:
| Tool | Purpose | Key Parameters |
|------|---------|----------------|
| `memory_add` | Store a fact | `content`, `tags[]`, `category`, `entity`, `namespace`, `event_date`, `confidence` |
| `memory_recall` | Smart multi-layer retrieval | `query`, `budget` (default: 16000), `namespace` |
| `memory_search` | Filter with date ranges | `query`, `category`, `entity`, `namespace`, `status`, `after`, `before`, `limit` |
| `memory_get` | Fetch by ID | `memory_id` |
| `memory_forget` | Archive (soft delete) | `memory_id` |
| `memory_timeline` | Chronological entity history | `entity`, `namespace` |
| `memory_stats` | Store size, counts | — |
| `memory_health` | Health check (call first!) | — |
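MCP clients invoke these tools through the protocol's JSON-RPC 2.0 `tools/call` method. As a rough illustration of what a client sends for `memory_recall`, the sketch below builds such a request; the helper `tool_call` is hypothetical, and in practice your MCP client constructs this message for you.

```python
import json
from typing import Any, Dict


def tool_call(request_id: int, tool: str, arguments: Dict[str, Any]) -> Dict[str, Any]:
    """Build a JSON-RPC 2.0 request for the MCP `tools/call` method."""
    return {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": tool, "arguments": arguments},
    }


request = tool_call(1, "memory_recall", {"query": "deployment setup", "budget": 2000})
print(json.dumps(request, indent=2))
```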
### Categories
`core_belief` · `preference` · `career` · `project` · `technical` · `personal` · `location` · `relationship` · `event` · `session` · `general`
### MCP Resources
- **`memwright://entity/{name}`** — Entity details + related entities from graph
- **`memwright://memory/{id}`** — Full memory object
### MCP Prompts
- **`recall`** — Search memories for relevant context
- **`timeline`** — Chronological history of an entity
---
## Retrieval Pipeline
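The retrieval system cascades through three search layers (tags, entity graph, vector similarity), shown below as five numbered stages (Layers 0 to 4), and then applies multi-signal fusion: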
```
Query: "deployment setup"
│
├─ Layer 0: Graph Expansion
│ Extract entities from query → BFS traversal (depth=2)
│ "deployment" → finds "AWS", "Docker", "Terraform" connections
│
├─ Layer 1: Tag Match (SQLite)
│ extract_tags(query) → tag_search() → score 1.0
│
├─ Layer 2: Entity-Field Search
│ Memories about graph-connected entities → score 0.5
│
├─ Layer 3: Vector Search (ChromaDB)
│ Semantic similarity → score = 1 - cosine_distance
│
├─ Layer 4: Graph Relation Triples
│ Inject relationship context → score 0.6
│
▼ FUSION
├─ Reciprocal Rank Fusion (RRF, k=60)
│ score = Σ 1/(k + rank_in_source)
│ OR Graph Blend: 0.7 * norm_vector + 0.3 * norm_pagerank
│
▼ SCORING
├─ Temporal Boost: +0.2 * max(0, 1 - age_days/90)
├─ Entity Boost: +0.30 exact match, +0.15 substring
├─ PageRank Boost: +0.3 * entity_pagerank_score
│
▼ DIVERSITY
├─ MMR Rerank: λ*relevance - (1-λ)*max_jaccard_similarity (λ=0.7)
│
▼ CONFIDENCE
├─ Time Decay: -0.001 per hour since last access
├─ Access Boost: +0.03 per access_count
├─ Clamp: [0.1, 1.0]
│
▼ BUDGET
└─ Greedy selection by score until token budget filled
```
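The fusion step in the diagram above uses the standard Reciprocal Rank Fusion formula. Here is a minimal sketch of it, assuming ranks start at 1 and each source contributes an ordered list of memory IDs; the function name `rrf_fuse` and the sample IDs are illustrative only.

```python
from collections import defaultdict
from typing import Dict, List, Tuple


def rrf_fuse(rankings: Dict[str, List[str]], k: int = 60) -> List[Tuple[str, float]]:
    """Fuse ranked ID lists: score(id) = sum over sources of 1 / (k + rank)."""
    scores: Dict[str, float] = defaultdict(float)
    for ranked_ids in rankings.values():
        for rank, memory_id in enumerate(ranked_ids, start=1):
            scores[memory_id] += 1.0 / (k + rank)
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


fused = rrf_fuse({
    "tag": ["m1", "m2"],
    "graph": ["m2", "m3"],
    "vector": ["m2", "m1", "m4"],
})
print([memory_id for memory_id, _ in fused])  # ['m2', 'm1', 'm3', 'm4']
```

A memory that appears in several sources (here `m2`) accumulates score from each, which is why it outranks memories that top only one list.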
Because Layer 0 expands the query through the entity graph, querying "Python" also finds memories about "FastAPI" when the two entities are connected. This gives multi-hop retrieval through relationship traversal.
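The depth-limited graph expansion can be sketched with a plain breadth-first search. This is an illustration on a dictionary-based adjacency list, not the NetworkX-backed implementation; `expand_entities` and the sample graph are hypothetical.

```python
from collections import deque
from typing import Dict, Iterable, List


def expand_entities(
    graph: Dict[str, List[str]], seeds: Iterable[str], max_depth: int = 2
) -> Dict[str, int]:
    """Breadth-first expansion: map each reachable entity to its hop distance."""
    seeds = list(seeds)
    hops = {seed: 0 for seed in seeds}
    queue = deque(seeds)
    while queue:
        entity = queue.popleft()
        depth = hops[entity]
        if depth == max_depth:
            continue  # do not expand beyond the depth limit
        for neighbor in graph.get(entity, []):
            if neighbor not in hops:
                hops[neighbor] = depth + 1
                queue.append(neighbor)
    return hops


graph = {
    "Python": ["FastAPI"],
    "FastAPI": ["Uvicorn"],
    "Uvicorn": ["Starlette"],  # three hops from "Python", so not reached
}
print(expand_entities(graph, ["Python"]))  # {'Python': 0, 'FastAPI': 1, 'Uvicorn': 2}
```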
---
## Python API
### Basic Usage
```python
from agent_memory import AgentMemory
mem = AgentMemory("./my-agent") # auto-provisions all backends
# Store
mem.add("User prefers Python over Java",
        tags=["preference", "coding"],
        category="preference",
        entity="Python")
# Recall with token budget
results = mem.recall("what language?", budget=2000)
# Formatted context for prompt injection
context = mem.recall_as_context("user background", budget=4000)
# Search with filters
memories = mem.search(category="project", entity="Python", limit=10)
# Timeline
history = mem.timeline("Python")
# Contradiction handling — automatic
mem.add("User works at Google", tags=["career"], category="career", entity="Google")
mem.add("User works at Meta", tags=["career"], category="career", entity="Meta")
# ^ Google memory auto-superseded
# Namespace isolation
mem.add("Team standup at 9am", namespace="team:alpha")
results = mem.recall("standup time", namespace="team:alpha")
# Maintenance
mem.forget(memory_id) # Archive
mem.forget_before("2025-01-01") # Archive old memories
mem.compact() # Permanently delete archived
mem.export_json("backup.json") # Export
mem.import_json("backup.json") # Import (dedup by content hash)
# Health & stats
mem.health() # → {sqlite: ok, chroma: ok, networkx: ok, retrieval: ok}
mem.stats() # → {total: 500, active: 480, ...}
# Context manager
with AgentMemory("./store") as mem:
    mem.add("auto-closed on exit")
```
### Memory Object
```python
from dataclasses import dataclass
from typing import Any, Dict, List

@dataclass
class Memory:
    id: str                   # UUID
    content: str              # The actual fact/observation
    tags: List[str]           # Searchable tags
    category: str             # Classification (preference, career, project, ...)
    entity: str               # Primary entity (company, tool, person)
    namespace: str            # Isolation key (default: "default")
    created_at: str           # ISO timestamp
    event_date: str           # When the fact occurred
    valid_from: str           # Temporal validity start
    valid_until: str          # Set when superseded
    superseded_by: str        # ID of replacement memory
    confidence: float         # 0.0-1.0
    status: str               # active | superseded | archived
    access_count: int         # Times recalled
    last_accessed: str        # Last recall timestamp
    content_hash: str         # SHA-256 for dedup
    metadata: Dict[str, Any]  # Arbitrary JSON
```
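The `content_hash` field supports deduplication on import. A minimal sketch of hash-based dedup with `hashlib` follows; the whitespace and case normalization is an assumption for illustration, and Memwright may hash the raw content instead.

```python
import hashlib
from typing import Iterable, List


def content_hash(content: str) -> str:
    """SHA-256 hex digest of the content (normalization here is an assumption)."""
    normalized = " ".join(content.split()).lower()
    return hashlib.sha256(normalized.encode("utf-8")).hexdigest()


def dedupe(contents: Iterable[str]) -> List[str]:
    """Keep the first occurrence of each distinct content hash."""
    seen = set()
    unique: List[str] = []
    for text in contents:
        digest = content_hash(text)
        if digest not in seen:
            seen.add(digest)
            unique.append(text)
    return unique


print(dedupe(["User prefers Python", "user  prefers python", "User prefers Java"]))
# ['User prefers Python', 'User prefers Java']
```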
---
## Multi-Agent Support
For multi-agent pipelines with provenance tracking, RBAC, and governance:
```python
from agent_memory.context import AgentContext, AgentRole, Visibility
# Create a root context
ctx = AgentContext.from_env(
    agent_id="orchestrator",
    namespace="project:acme",
    role=AgentRole.ORCHESTRATOR,
    token_budget=20000,
)
# Spawn child contexts for sub-agents (immutable — returns new instance)
planner = ctx.as_agent("planner", role=AgentRole.PLANNER, token_budget=5000)
researcher = ctx.as_agent("researcher", role=AgentRole.RESEARCHER, read_only=True)
# Provenance tracking — metadata auto-enriched
planner.add_memory("Architecture decision: use event sourcing",
                   category="technical", visibility=Visibility.TEAM)
# metadata includes: _agent_id, _session_id, _namespace, _visibility, _role
# Recall is scoped to namespace + cached within session
results = researcher.recall("architecture decisions")
# Token budget tracked
print(researcher.token_budget - researcher.token_budget_used)
# Governance
researcher.flag_for_review("Need human approval for deployment plan")
researcher.add_compliance_tag("SOC2")
# Session introspection
summary = ctx.session_summary()
# → {agent_trail, memories_written, memories_recalled, token_usage, review_flags}
```
### AgentContext Features
| Feature | Description |
|---------|-------------|
| **Namespace isolation** | Each agent/project gets isolated memory partition |
| **RBAC roles** | ORCHESTRATOR, PLANNER, EXECUTOR, RESEARCHER, REVIEWER, MONITOR |
| **Read-only mode** | Agents can recall but not write |
| **Write quotas** | `max_writes_per_agent` (default: 100) |
| **Token budgets** | Per-agent budget tracking |
| **Recall cache** | Dedup redundant queries within a session |
| **Scratchpad** | Inter-agent data passing |
| **Provenance** | Agent trail, parent tracking, visibility levels |
| **Compliance** | Review flags, compliance tags for audit |
| **Distributed mode** | Set `memory_url` to use HTTP client instead of local |
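The "immutable, returns new instance" behaviour of child contexts can be illustrated with a frozen dataclass. This is a conceptual sketch only: `SketchContext` and its fields are hypothetical and do not mirror the real `AgentContext` class.

```python
from dataclasses import dataclass, replace
from typing import Optional


@dataclass(frozen=True)
class SketchContext:
    """Hypothetical stand-in for an immutable agent context."""
    agent_id: str
    namespace: str
    token_budget: int = 20000
    read_only: bool = False
    parent_id: Optional[str] = None

    def as_agent(self, agent_id: str, **overrides) -> "SketchContext":
        # Return a new context that inherits the namespace and records its parent.
        return replace(self, agent_id=agent_id, parent_id=self.agent_id, **overrides)


root = SketchContext(agent_id="orchestrator", namespace="project:acme")
researcher = root.as_agent("researcher", read_only=True, token_budget=5000)
print(researcher)
print(root.read_only)  # False: the parent context is unchanged
```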
---
## Cloud Backends
Each cloud backend fills all three roles (document, vector, graph) in a single service:
### PostgreSQL (Neon, Cloud SQL, self-hosted)
Uses pgvector for vectors and Apache AGE for the graph. AGE is optional: without it, the graph layer degrades gracefully.
```python
mem = AgentMemory("./store", config={
    "backends": ["postgres"],
    "postgres": {"url": "postgresql://user:pass@host:5432/memwright"}
})
```
### ArangoDB (ArangoGraph Cloud, Docker)
Native document, vector, and graph support in one database.
```python
mem = AgentMemory("./store", config={
    "backends": ["arangodb"],
    "arangodb": {"url": "https://instance.arangodb.cloud:8529", "database": "memwright"}
})
```
### Azure (Cosmos DB)
Cosmos DB with DiskANN vector indexing. Graph via NetworkX persisted to Cosmos containers.
```python
mem = AgentMemory("./store", config={
    "backends": ["azure"],
    "azure": {"cosmos_endpoint": "https://account.documents.azure.com:443/"}
})
```
### GCP (AlloyDB)
Extends the PostgreSQL backend with the AlloyDB Connector (IAM auth) and Vertex AI embeddings (768 dimensions).
```python
mem = AgentMemory("./store", config={
    "backends": ["gcp"],
    "gcp": {"project_id": "my-project", "cluster": "memwright", "instance": "primary"}
})
```
### Installing cloud extras
```bash
poetry add "memwright[postgres]" # PostgreSQL
poetry add "memwright[arangodb]" # ArangoDB
poetry add "memwright[aws]" # AWS (DynamoDB + OpenSearch + Neptune)
poetry add "memwright[azure]" # Azure Cosmos DB
poetry add "memwright[gcp]" # GCP AlloyDB + Vertex AI
poetry add "memwright[all]" # Everything
```
---
## Cloud Deployment
Deploy Memwright as an HTTP API on any cloud with a single command:
```bash
./scripts/deploy.sh aws # App Runner (2 CPU / 4GB, auto-scale)
./scripts/deploy.sh gcp # Cloud Run (auto-scale 0–3, 2 CPU / 4GB)
./scripts/deploy.sh azure # Container Apps (scale-to-zero, 2 CPU / 4GB)
./scripts/deploy.sh aws --teardown # Destroy everything
```
**Prerequisites**: Docker, Terraform, cloud CLI (`aws`/`gcloud`/`az`), backend credentials in `.env`.
| Cloud | Infrastructure | Terraform |
|-------|---------------|-----------|
| AWS | ECR + App Runner (2 CPU, 4GB) | `agent_memory/infra/apprunner/main.tf` |
| GCP | Artifact Registry + Cloud Run (2 CPU, 4GB) | `agent_memory/infra/cloudrun/main.tf` |
| Azure | ACR + Log Analytics + Container Apps (2 CPU, 4GB) | `agent_memory/infra/containerapp/main.tf` |
### REST API Endpoints
All deployments expose the same Starlette ASGI API:
| Method | Endpoint | Description |
|--------|----------|-------------|
| `GET` | `/health` | Component health check |
| `GET` | `/stats` | Store statistics |
| `POST` | `/add` | Add a memory |
| `POST` | `/recall` | Smart retrieval with budget |
| `POST` | `/search` | Filtered search |
| `POST` | `/timeline` | Entity chronological history |
| `POST` | `/forget` | Archive a memory |
| `GET` | `/memory/{id}` | Get memory by ID |
Response envelope: `{"ok": true, "data": {...}}` or `{"ok": false, "error": "message"}`
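A client can treat the envelope uniformly across endpoints. The sketch below builds a `/recall` request with the standard library and unwraps the envelope; the request body fields (`query`, `budget`) are assumed to mirror the `memory_recall` tool parameters, and the helper names and base URL are hypothetical.

```python
import json
import urllib.request
from typing import Any, Dict


def build_recall_request(base_url: str, query: str, budget: int) -> urllib.request.Request:
    """Build (but do not send) a POST /recall request. Body fields are assumed."""
    body = json.dumps({"query": query, "budget": budget}).encode("utf-8")
    return urllib.request.Request(
        base_url.rstrip("/") + "/recall",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def unwrap(envelope: Dict[str, Any]) -> Any:
    """Return `data` from a successful envelope, or raise with the error message."""
    if envelope.get("ok"):
        return envelope["data"]
    raise RuntimeError(envelope.get("error", "unknown error"))


request = build_recall_request("http://localhost:8080/", "deployment setup", 2000)
print(request.get_method(), request.full_url)

# Sending it would look like this (requires a running deployment):
# with urllib.request.urlopen(request) as response:
#     data = unwrap(json.load(response))
```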
---
## Embedding Providers
Memwright auto-detects the best available embedding provider:
| Priority | Provider | Model | Dimensions | Trigger |
|----------|----------|-------|------------|---------|
| 1 | Cloud-native | Bedrock Titan / Azure OpenAI / Vertex AI | 768-1536 | Cloud backend configured |
| 2 | OpenAI / OpenRouter | text-embedding-3-small | 1536 | `OPENAI_API_KEY` or `OPENROUTER_API_KEY` set |
| 3 | Local (default) | all-MiniLM-L6-v2 | 384 | Always available, no API key |
The local fallback downloads ~90MB on first use. All providers implement the same interface — switching is transparent.
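The priority order in the table can be read as a simple fallback chain. The following sketch restates it in code; `pick_embedding_provider` is a hypothetical helper, and the pairing of each cloud backend with a specific cloud-native provider is an assumption drawn from the table rather than confirmed behaviour.

```python
from typing import Mapping, Optional

# Assumed pairing of cloud backend to cloud-native embedding provider.
CLOUD_NATIVE = {"aws": "Bedrock Titan", "azure": "Azure OpenAI", "gcp": "Vertex AI"}


def pick_embedding_provider(env: Mapping[str, str], cloud_backend: Optional[str] = None) -> str:
    """Return the provider or model implied by the priority table."""
    if cloud_backend in CLOUD_NATIVE:
        return CLOUD_NATIVE[cloud_backend]  # priority 1: cloud backend configured
    if env.get("OPENAI_API_KEY") or env.get("OPENROUTER_API_KEY"):
        return "text-embedding-3-small"  # priority 2: API key present
    return "all-MiniLM-L6-v2"  # priority 3: local default, no API key


print(pick_embedding_provider({}))                        # all-MiniLM-L6-v2
print(pick_embedding_provider({"OPENAI_API_KEY": "sk"}))  # text-embedding-3-small
print(pick_embedding_provider({}, cloud_backend="gcp"))   # Vertex AI
```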
---
## CLI Reference
Both `memwright` and `agent-memory` work as entry points:
### MCP Server
```bash
memwright mcp # Start MCP server (uses ~/.memwright)
memwright mcp --path /custom/path # Custom store location
```
### Memory Operations
```bash
agent-memory add ./store "User prefers Python" --tags "pref,coding" --category preference --namespace default
agent-memory recall ./store "what language?" --budget 4000 --namespace default
agent-memory search ./store --category project --entity Python --namespace default --limit 20
agent-memory list ./store --status active --category technical --namespace default
agent-memory timeline ./store --entity Python --namespace default
agent-memory update ./store "Updated content" --tags "new,tags" --category technical
agent-memory forget ./store <memory-id>
```
### Maintenance
```bash
agent-memory doctor ~/.memwright # Health check (SQLite, ChromaDB, NetworkX, Retrieval)
agent-memory stats ./store # Memory counts, DB size, breakdowns
agent-memory export ./store -o backup.json
agent-memory import ./store backup.json
agent-memory compact ./store # Permanently delete archived memories
agent-memory inspect ./store # Raw DB inspection
```
### Lifecycle Hooks (Claude Code)
```bash
memwright hook session-start # Inject context at session start
memwright hook post-tool-use # Auto-capture tool observations
memwright hook stop # Generate session summary
```
### Benchmarks
```bash
agent-memory locomo --max-conversations 5 --verbose
agent-memory mab --categories AR,CR --max-examples 10
```
---
## Configuration
### Store location
Default: `~/.memwright/`. Configurable with `--path` on any CLI command.
```
~/.memwright/
├── memory.db # SQLite database (core storage)
├── config.json # Retrieval tuning parameters
├── graph.json # NetworkX entity graph
└── chroma/ # ChromaDB vector store + embeddings
```
### config.json
All fields optional. Defaults apply if the file doesn't exist:
```json
{
  "default_token_budget": 16000,
  "min_results": 3,
  "backends": ["sqlite", "chroma", "networkx"],
  "enable_mmr": true,
  "mmr_lambda": 0.7,
  "fusion_mode": "rrf",
  "confidence_gate": 0.0,
  "confidence_decay_rate": 0.001,
  "confidence_boost_rate": 0.03
}
```
| Parameter | Default | Description |
|-----------|---------|-------------|
| `default_token_budget` | 16000 | Max tokens returned per recall (start high, lower to tune) |
| `min_results` | 3 | Minimum results to return |
| `enable_mmr` | true | Maximal Marginal Relevance diversity reranking |
| `mmr_lambda` | 0.7 | Relevance vs diversity balance (0=diverse, 1=relevant) |
| `fusion_mode` | "rrf" | "rrf" (parameter-free) or "graph_blend" (weighted) |
| `confidence_decay_rate` | 0.001 | Score penalty per hour since last access |
| `confidence_boost_rate` | 0.03 | Score boost per access count |
| `confidence_gate` | 0.0 | Minimum confidence threshold to include in results |
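To see how the decay and boost rates interact, here is a small worked sketch. It assumes the two adjustments are simply added to the stored confidence and the result is clamped to [0.1, 1.0], as the retrieval pipeline diagram suggests; the function name `adjusted_confidence` is hypothetical.

```python
def adjusted_confidence(
    confidence: float,
    hours_since_access: float,
    access_count: int,
    decay_rate: float = 0.001,
    boost_rate: float = 0.03,
) -> float:
    """Apply time decay and access boost, then clamp to [0.1, 1.0]."""
    raw = confidence - decay_rate * hours_since_access + boost_rate * access_count
    return min(1.0, max(0.1, raw))


# 0.8 - 0.001 * 100 + 0.03 * 2 = 0.76
print(round(adjusted_confidence(0.8, hours_since_access=100, access_count=2), 2))
```

With the defaults, roughly 30 hours without access cancels out one recorded access (0.03 / 0.001).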
### Environment Variables
| Variable | Purpose |
|----------|---------|
| `MEMWRIGHT_PATH` | Default store path |
| `MEMWRIGHT_URL` | Remote API URL (distributed mode) |
| `MEMWRIGHT_NAMESPACE` | Default namespace |
| `MEMWRIGHT_TOKEN_BUDGET` | Default token budget |
| `MEMWRIGHT_SESSION_ID` | Session ID for provenance tracking |
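A script that wraps Memwright might resolve these variables as sketched below. The fallback values are assumptions taken from defaults stated elsewhere in this README (`~/.memwright`, namespace `default`, budget 16000), and `settings_from_env` is a hypothetical helper rather than part of the package.

```python
import os
from typing import Any, Dict, Mapping, Optional


def settings_from_env(env: Optional[Mapping[str, str]] = None) -> Dict[str, Any]:
    """Read the MEMWRIGHT_* variables, falling back to documented defaults."""
    env = os.environ if env is None else env
    return {
        "path": os.path.expanduser(env.get("MEMWRIGHT_PATH", "~/.memwright")),
        "url": env.get("MEMWRIGHT_URL"),  # None means local (non-distributed) mode
        "namespace": env.get("MEMWRIGHT_NAMESPACE", "default"),
        "token_budget": int(env.get("MEMWRIGHT_TOKEN_BUDGET", "16000")),
        "session_id": env.get("MEMWRIGHT_SESSION_ID"),
    }


print(settings_from_env({"MEMWRIGHT_NAMESPACE": "team:alpha", "MEMWRIGHT_TOKEN_BUDGET": "2000"}))
```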
---
## Testing
### Running Tests
```bash
# All unit tests — no Docker, no API keys
poetry run pytest tests/ -v
# With coverage
poetry run pytest tests/ -v --cov=agent_memory --cov-report=term-missing
# Live integration tests (need credentials)
NEON_DATABASE_URL='postgresql://...' poetry run pytest tests/test_postgres_live.py -v
AZURE_COSMOS_ENDPOINT='https://...' poetry run pytest tests/test_azure_live.py -v
```
### Test Coverage
- **607 unit tests** covering all backends, retrieval, config, embeddings, and CLI
- **14 live integration tests** per cloud backend (Neon, Azure, ArangoDB)
- **Mock tests** for every cloud backend — no cloud account needed
- All unit tests run without Docker or API keys
---
## Benchmarks
### Latency (P50 recall — the core operation)
| Backend | Stack | P50 | P95 | P99 |
|---|---|---|---|---|
| **PG + pgvector + AGE (Docker)** | PostgreSQL 16 + pgvector + Apache AGE | **1.4ms** | **5.5ms** | **39ms** |
| SQLite + ChromaDB + NetworkX (local) | SQLite 3 + ChromaDB 1.x + NetworkX 3 | 9.1ms | 31ms | 75ms |
| ArangoDB (Docker) | ArangoDB 3.12 (doc + vector + graph) | 40ms | 57ms | 68ms |
| GCP Cloud Run (us-central1) | Starlette + Uvicorn → ArangoDB Oasis | 156ms | 245ms | 271ms |
| Azure Container Apps (eastus) | Starlette + Uvicorn → ArangoDB Oasis | 293ms | 466ms | 480ms |
| AWS App Runner (us-west-2) | Starlette + Uvicorn → ArangoDB Oasis | 621ms | 792ms | 813ms |
### vs. Competitors (recall P50)
| System | Stack | P50 | Notes |
|---|---|---|---|
| **Memwright (PG Docker)** | PG 16 + pgvector + AGE | **1.4ms** | Full 3-layer pipeline, 81.2% LOCOMO |
| Ruflo | In-process HNSW | 2-3ms | Vector lookup only, not full retrieval |
| **Memwright (local)** | SQLite + ChromaDB + NX | **9.1ms** | Zero-config, no Docker, no API keys |
| **Memwright (GCP Cloud Run)** | Starlette → ArangoDB Oasis | **156ms** | Full cloud API, scale-to-zero |
| Mem0 | Cloud + LLM judge | 200ms | LLM in retrieval path |
| Zep | Neo4j + embeddings | <200ms | P95 ~632ms under concurrency |
| Mem0 Graph | Cloud + LLM + graph | 660ms | Graph variant, much slower |
Full results with add/search latency: [docs/LATENCY_BENCHMARKS.md](docs/LATENCY_BENCHMARKS.md)
### LOCOMO (Long Conversation Memory)
| System | Accuracy |
|--------|----------|
| MemMachine | 84.9% |
| **Memwright** | **81.2%** |
| Zep | ~75% |
| Letta | 74.0% |
| Mem0 (Graph) | 66.9% |
| OpenAI Memory | 52.9% |
*Scores are self-reported across vendors. [Methodology is disputed](https://blog.getzep.com/lies-damn-lies-statistics-is-mem0-really-sota-in-agent-memory/).*
Retrieval is fully local — tag matching, graph traversal, vector search with RRF fusion. No LLM re-ranking. Only benchmark answer synthesis uses an LLM.
---
## Compatibility
### MCP Clients
| Client | Config File |
|--------|-------------|
| Claude Code | `.mcp.json` (project) or `~/.claude/.mcp.json` (global) |
| Cursor | `.cursor/mcp.json` |
| Windsurf | MCP config in settings |
| Any MCP client | Standard MCP stdio transport |
Same `memwright mcp` command. Same zero-config setup.
### Python
- Python 3.10, 3.11, 3.12, 3.13, 3.14
---
## Uninstall
### 1. Remove MCP server config
Delete the `memory` entry from `~/.claude/.mcp.json` (global) or `.mcp.json` (per-project).
### 2. Uninstall the package
```bash
# Match your install method:
uv tool uninstall memwright # if installed with uv
pipx uninstall memwright # if installed with pipx
pip uninstall memwright # if installed with pip
poetry remove memwright # if installed with poetry
```
### 3. Delete stored memories (optional)
```bash
# Export first if you want a backup
agent-memory export ~/.memwright -o memwright-backup.json
# Then delete
rm -rf ~/.memwright
```
---
## License
Apache 2.0
---
mcp-name: io.github.bolnet/memwright