{"id":47613706,"url":"https://github.com/bolnet/agent-memory","last_synced_at":"2026-04-01T20:53:51.795Z","repository":{"id":342909699,"uuid":"1175503666","full_name":"bolnet/agent-memory","owner":"bolnet","description":"Embedded memory for AI agents. SQLite + pgvector + Neo4j. Sub-5ms retrieval.","archived":false,"fork":false,"pushed_at":"2026-03-25T20:39:03.000Z","size":2652,"stargazers_count":2,"open_issues_count":0,"forks_count":1,"subscribers_count":0,"default_branch":"main","last_synced_at":"2026-03-26T03:03:45.852Z","etag":null,"topics":["agents","ai","claude-code","llm","mcp","memory","neo4j","pgvector","sqlite"],"latest_commit_sha":null,"homepage":"https://bolnet.github.io/agent-memory/","language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"apache-2.0","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/bolnet.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null,"zenodo":null,"notice":null,"maintainers":null,"copyright":null,"agents":null,"dco":null,"cla":null}},"created_at":"2026-03-07T19:55:30.000Z","updated_at":"2026-03-25T20:38:54.000Z","dependencies_parsed_at":null,"dependency_job_id":null,"html_url":"https://github.com/bolnet/agent-memory","commit_stats":null,"previous_names":["bolnet/agent-memory"],"tags_count":3,"template":false,"template_full_name":null,"purl":"pkg:github/bolnet/agent-memory","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/bolnet%2Fagent-memory","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/bolnet%2Fagent-memory/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/bolnet%2Fagent-
memory/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/bolnet%2Fagent-memory/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/bolnet","download_url":"https://codeload.github.com/bolnet/agent-memory/tar.gz/refs/heads/main","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/bolnet%2Fagent-memory/sbom","scorecard":null,"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":286080680,"owners_count":31291873,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2026-04-01T13:12:26.723Z","status":"ssl_error","status_checked_at":"2026-04-01T13:12:25.102Z","response_time":53,"last_error":"SSL_read: unexpected eof while reading","robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":false,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["agents","ai","claude-code","llm","mcp","memory","neo4j","pgvector","sqlite"],"created_at":"2026-04-01T20:53:48.880Z","updated_at":"2026-04-01T20:53:51.775Z","avatar_url":"https://github.com/bolnet.png","language":"Python","funding_links":[],"categories":[],"sub_categories":[],"readme":"\u003cp align=\"center\"\u003e\n  \u003cpicture\u003e\n    \u003csource media=\"(prefers-color-scheme: dark)\" srcset=\"docs/logo.svg\"\u003e\n    \u003csource media=\"(prefers-color-scheme: light)\" srcset=\"docs/logo-dark.svg\"\u003e\n    \u003cimg alt=\"Memwright\" src=\"docs/logo.svg\" width=\"400\"\u003e\n  \u003c/picture\u003e\n\u003c/p\u003e\n\n\u003cp align=\"center\"\u003e\n  \u003cem\u003eZero-config memory for AI 
agents. No Docker. No API keys. Just install and go.\u003c/em\u003e\n\u003c/p\u003e\n\n\u003cp align=\"center\"\u003e\n  \u003ca href=\"https://pypi.org/project/memwright/\"\u003e\u003cimg src=\"https://img.shields.io/pypi/v/memwright?color=C15F3C\u0026style=flat-square\" alt=\"PyPI\"\u003e\u003c/a\u003e\n  \u003ca href=\"https://pypi.org/project/memwright/\"\u003e\u003cimg src=\"https://img.shields.io/pypi/pyversions/memwright?style=flat-square\" alt=\"Python\"\u003e\u003c/a\u003e\n  \u003ca href=\"https://github.com/bolnet/agent-memory/blob/main/LICENSE\"\u003e\u003cimg src=\"https://img.shields.io/github/license/bolnet/agent-memory?style=flat-square\" alt=\"License\"\u003e\u003c/a\u003e\n  \u003ca href=\"https://registry.modelcontextprotocol.io/servers/io.github.bolnet/memwright\"\u003e\u003cimg src=\"https://img.shields.io/badge/MCP-Registry-C15F3C?style=flat-square\" alt=\"MCP Registry\"\u003e\u003c/a\u003e\n\u003c/p\u003e\n\n---\n\n## The Problem\n\nAI agents forget everything between sessions. Every new conversation starts from zero — no memory of what you built yesterday, what decisions you made, or what your project even does.\n\nBuilt-in memory solutions (like Claude Code's `MEMORY.md`) store flat files that load entirely into the context window every message. No search, no ranking, no contradiction handling. As your project grows, those files become a wall of text that burns tokens without helping.\n\n## What Memwright Does\n\nMemwright gives AI agents persistent, searchable memory that stays out of the context window until needed:\n\n- **Ranked retrieval** — 3-layer search (tags + entity graph + vector similarity) returns only the most relevant memories\n- **Token budgets** — Set a ceiling (e.g. 2,000 tokens). 
Memwright fits the best memories within that budget\n- **Contradiction handling** — \"User works at Google\" automatically supersedes \"User works at Meta\"\n- **Namespace isolation** — Multi-agent systems get isolated memory partitions per agent, user, or project\n- **Zero config** — `poetry add memwright`, add one JSON block, done\n\n---\n\n## Table of Contents\n\n- [Quick Start](#quick-start)\n- [Architecture](#architecture)\n- [How It Works](#how-it-works)\n- [MCP Tools Reference](#mcp-tools-reference)\n- [Retrieval Pipeline](#retrieval-pipeline)\n- [Python API](#python-api)\n- [Multi-Agent Support](#multi-agent-support)\n- [Cloud Backends](#cloud-backends)\n- [Cloud Deployment](#cloud-deployment)\n- [Embedding Providers](#embedding-providers)\n- [CLI Reference](#cli-reference)\n- [Configuration](#configuration)\n- [Testing](#testing)\n- [Benchmarks](#benchmarks)\n- [Compatibility](#compatibility)\n- [Uninstall](#uninstall)\n\n---\n\n## Quick Start\n\n### Step 1: Install\n\nChoose one method. The package name is `memwright` on PyPI.\n\n```bash\n# Option A: uv (recommended on macOS)\nuv tool install memwright\n\n# Option B: pipx\npipx install memwright\n\n# Option C: pip (in a venv or with --user)\npip install memwright\n\n# Option D: poetry (add to an existing project)\npoetry add memwright\n```\n\n\u003e **First run downloads ~90MB** for the local embedding model (all-MiniLM-L6-v2). This happens once and is cached.\n\n### Step 2: Connect to Claude Code\n\n```bash\nclaude mcp add memory -- memwright mcp\n```\n\nRestart Claude Code. Approve the MCP server once. 
Done — Claude now has 8 memory tools.\n\n**Alternative: manual MCP config.** Add to `~/.claude/.mcp.json` (global) or `.mcp.json` (per-project):\n\n```json\n{\n  \"mcpServers\": {\n    \"memory\": {\n      \"command\": \"memwright\",\n      \"args\": [\"mcp\"]\n    }\n  }\n}\n```\n\n### Step 3: Verify\n\n```bash\nmemwright doctor ~/.memwright\n```\n\nAll 4 components should report healthy:\n\n```\nOverall: ALL HEALTHY\n\n  [OK] SQLiteStore (0.2ms, 0 memories, 4,096 bytes)\n  [OK] ChromaStore (0 vectors)\n  [OK] NetworkXGraph (0 nodes, 0 edges)\n  [OK] Retrieval Pipeline (3/3 layers)\n```\n\nOr ask Claude to call `memory_health` from within a session.\n\n### Step 4 (optional): Enable lifecycle hooks\n\n```bash\nmemwright init ~/.memwright --hooks\n```\n\nThis auto-configures three Claude Code hooks in `~/.claude/settings.json`:\n- **SessionStart** — injects relevant memories into context (20K token budget)\n- **PostToolUse** — auto-captures file changes and command outputs\n- **Stop** — generates a session summary\n\n---\n\n### Quick test from the CLI\n\n```bash\n# Add a memory\nmemwright add ~/.memwright \"Project uses Python 3.12 with FastAPI\" \\\n  --tags \"python,fastapi\" --category project\n\n# Recall it\nmemwright recall ~/.memwright \"what does the project use?\"\n\n# Search by category\nmemwright search ~/.memwright --category project\n\n# Update a memory\nmemwright update ~/.memwright \u003cmemory-id\u003e \"Project uses Python 3.13 with FastAPI\" \\\n  --tags \"python,fastapi\"\n\n# Check stats\nmemwright stats ~/.memwright\n```\n\n\u003e **No API keys required.** Memwright uses a local embedding model — no HuggingFace token, no OpenAI key, no cloud account. 
The `HF_TOKEN` warning you may see in older versions is harmless noise and has been suppressed.\n\n---\n\n## Architecture\n\n\u003cp align=\"center\"\u003e\n  \u003cimg src=\"docs/architecture.svg\" alt=\"Memwright Architecture\" width=\"100%\"\u003e\n\u003c/p\u003e\n\n### Component Overview\n\n```\nagent_memory/\n├── core.py                    # AgentMemory — main orchestrator\n├── models.py                  # Memory + RetrievalResult dataclasses\n├── context.py                 # AgentContext — multi-agent provenance \u0026 RBAC\n├── client.py                  # MemoryClient — HTTP client for distributed mode\n├── cli.py                     # CLI entry point (19 commands)\n├── api.py                     # Starlette ASGI REST API (8 routes)\n├── store/\n│   ├── base.py                # Abstract interfaces: DocumentStore, VectorStore, GraphStore\n│   ├── sqlite_store.py        # SQLite storage (WAL, 17 columns, 8 indexes)\n│   ├── chroma_store.py        # ChromaDB vector search (local sentence-transformers)\n│   ├── schema.sql             # SQLite schema definition\n│   ├── postgres_backend.py    # PostgreSQL (pgvector + Apache AGE)\n│   ├── arango_backend.py      # ArangoDB (native doc + vector + graph)\n│   ├── aws_backend.py         # AWS (DynamoDB + OpenSearch + Neptune)\n│   └── azure_backend.py       # Azure (Cosmos DB DiskANN + NetworkX)\n├── graph/\n│   ├── networkx_graph.py      # NetworkX MultiDiGraph with PageRank + BFS\n│   └── extractor.py           # Entity/relation extraction (50+ known tools)\n├── retrieval/\n│   ├── orchestrator.py        # 3-layer cascade with RRF fusion\n│   ├── tag_matcher.py         # Stop-word filtered tag extraction\n│   └── scorer.py              # Temporal, entity, PageRank, MMR, confidence decay\n├── temporal/\n│   └── manager.py             # Contradiction detection + supersession\n├── extraction/\n│   └── extractor.py           # Rule-based + LLM memory extraction\n├── mcp/\n│   └── server.py              # MCP server (8 
tools, 2 resources, 2 prompts)\n├── hooks/\n│   ├── session_start.py       # Context injection (20K token budget)\n│   ├── post_tool_use.py       # Auto-capture from Write/Edit/Bash\n│   └── stop.py                # Session summary generation\n├── utils/\n│   └── config.py              # MemoryConfig dataclass + load/save\n└── infra/                     # Terraform + Docker for cloud deployments\n    ├── apprunner/             # AWS App Runner\n    ├── cloudrun/              # GCP Cloud Run\n    └── containerapp/          # Azure Container Apps\n```\n\n### Three Storage Roles\n\nEvery backend implements one or more of these roles:\n\n| Role | Purpose | Local Default | Cloud Options |\n|------|---------|--------------|---------------|\n| **Document** | Core storage, CRUD, filtering | SQLite | PostgreSQL, ArangoDB, DynamoDB, Cosmos DB |\n| **Vector** | Semantic similarity search | ChromaDB | pgvector, ArangoDB, OpenSearch, Cosmos DiskANN |\n| **Graph** | Entity relationships, BFS traversal | NetworkX | Apache AGE, ArangoDB, Neptune |\n\nCloud backends fill all 3 roles in a single service. If any optional component fails, the system degrades gracefully to document-only.\n\n---\n\n## How It Works\n\n### Memory lives outside the context window\n\nThis is the key difference. Flat-file memory loads everything into context every message. Memwright stores memories in a separate process (SQLite + ChromaDB + NetworkX on disk). 
The context window never sees them until the agent explicitly asks.\n\n```\nFlat-file memory:                    Memwright:\n\n┌──────────────────────────┐        ┌──────────────────────────┐\n│  Context Window          │        │  Context Window          │\n│                          │        │                          │\n│  System prompt           │        │  System prompt           │\n│  MEMORY.md ← ALL of it  │        │  User message            │\n│  grows forever           │        │  memory_recall → 2K max  │\n│  User message            │        │                          │\n└──────────────────────────┘        └──────────────────────────┘\n\n                                    ┌──────────────────────────┐\n                                    │  Memwright (on disk)     │\n                                    │  10,000+ memories        │\n                                    │  ← never in context     │\n                                    └──────────────────────────┘\n```\n\n### Token cost stays flat as memory grows\n\n```\nFlat-file approach:\n  Month 1:   2K tokens loaded every message\n  Month 6:  15K tokens loaded every message  ← context crowded\n\nMemwright approach:\n  Month 1:   2K tokens max when recalled (ranking from 100 memories)\n  Month 6:   2K tokens max when recalled (ranking from 5,000 memories)\n                                             ← same cost, better results\n```\n\nStoring more memories makes retrieval *better*, since there are more candidates to rank, while context cost stays constant.\n\n### How a recall works\n\nWhen an agent calls `memory_recall(\"deployment setup\", budget=2000)`:\n\n```\nStore: 5,000 memories\n\n  Tag search finds:     15 memories tagged \"deployment\"\n  Graph search finds:    8 memories linked to \"AWS\", \"Docker\" entities\n  Vector search finds:  20 semantically similar memories\n\n  After dedup + RRF fusion:  30 unique candidates, scored and ranked\n\n  Budget fitting (2,000 tokens):\n    Memory A (score 0.95):  500 
tokens → in   (total: 500)\n    Memory B (score 0.90):  600 tokens → in   (total: 1,100)\n    Memory C (score 0.88):  400 tokens → in   (total: 1,500)\n    Memory D (score 0.85):  300 tokens → in   (total: 1,800)\n    Memory E (score 0.80):  400 tokens → SKIP (exceeds 2,000)\n\n  Result: 4 memories, 1,800 tokens. 4,996 memories never entered context.\n```\n\n---\n\n## MCP Tools Reference\n\nOnce the MCP server is running, agents have these tools:\n\n| Tool | Purpose | Key Parameters |\n|------|---------|----------------|\n| `memory_add` | Store a fact | `content`, `tags[]`, `category`, `entity`, `namespace`, `event_date`, `confidence` |\n| `memory_recall` | Smart multi-layer retrieval | `query`, `budget` (default: 16000), `namespace` |\n| `memory_search` | Filter with date ranges | `query`, `category`, `entity`, `namespace`, `status`, `after`, `before`, `limit` |\n| `memory_get` | Fetch by ID | `memory_id` |\n| `memory_forget` | Archive (soft delete) | `memory_id` |\n| `memory_timeline` | Chronological entity history | `entity`, `namespace` |\n| `memory_stats` | Store size, counts | — |\n| `memory_health` | Health check (call first!) 
| — |\n\n### Categories\n\n`core_belief` · `preference` · `career` · `project` · `technical` · `personal` · `location` · `relationship` · `event` · `session` · `general`\n\n### MCP Resources\n\n- **`memwright://entity/{name}`** — Entity details + related entities from graph\n- **`memwright://memory/{id}`** — Full memory object\n\n### MCP Prompts\n\n- **`recall`** — Search memories for relevant context\n- **`timeline`** — Chronological history of an entity\n\n---\n\n## Retrieval Pipeline\n\nThe retrieval system uses a 3-layer cascade with multi-signal fusion:\n\n```\nQuery: \"deployment setup\"\n  │\n  ├─ Layer 0: Graph Expansion\n  │  Extract entities from query → BFS traversal (depth=2)\n  │  \"deployment\" → finds \"AWS\", \"Docker\", \"Terraform\" connections\n  │\n  ├─ Layer 1: Tag Match (SQLite)\n  │  extract_tags(query) → tag_search() → score 1.0\n  │\n  ├─ Layer 2: Entity-Field Search\n  │  Memories about graph-connected entities → score 0.5\n  │\n  ├─ Layer 3: Vector Search (ChromaDB)\n  │  Semantic similarity → score = 1 - cosine_distance\n  │\n  ├─ Layer 4: Graph Relation Triples\n  │  Inject relationship context → score 0.6\n  │\n  ▼ FUSION\n  ├─ Reciprocal Rank Fusion (RRF, k=60)\n  │  score = Σ 1/(k + rank_in_source)\n  │  OR Graph Blend: 0.7 * norm_vector + 0.3 * norm_pagerank\n  │\n  ▼ SCORING\n  ├─ Temporal Boost: +0.2 * max(0, 1 - age_days/90)\n  ├─ Entity Boost:   +0.30 exact match, +0.15 substring\n  ├─ PageRank Boost:  +0.3 * entity_pagerank_score\n  │\n  ▼ DIVERSITY\n  ├─ MMR Rerank: λ*relevance - (1-λ)*max_jaccard_similarity (λ=0.7)\n  │\n  ▼ CONFIDENCE\n  ├─ Time Decay:    -0.001 per hour since last access\n  ├─ Access Boost:  +0.03 per access_count\n  ├─ Clamp:         [0.1, 1.0]\n  │\n  ▼ BUDGET\n  └─ Greedy selection by score until token budget filled\n```\n\nQuerying \"Python\" also finds memories about \"FastAPI\" if they're connected in the entity graph. 
Multi-hop reasoning through relationship traversal.\n\n---\n\n## Python API\n\n### Basic Usage\n\n```python\nfrom agent_memory import AgentMemory\n\nmem = AgentMemory(\"./my-agent\")  # auto-provisions all backends\n\n# Store\nmem.add(\"User prefers Python over Java\",\n        tags=[\"preference\", \"coding\"],\n        category=\"preference\",\n        entity=\"Python\")\n\n# Recall with token budget\nresults = mem.recall(\"what language?\", budget=2000)\n\n# Formatted context for prompt injection\ncontext = mem.recall_as_context(\"user background\", budget=4000)\n\n# Search with filters\nmemories = mem.search(category=\"project\", entity=\"Python\", limit=10)\n\n# Timeline\nhistory = mem.timeline(\"Python\")\n\n# Contradiction handling — automatic\nmem.add(\"User works at Google\", tags=[\"career\"], category=\"career\", entity=\"Google\")\nmem.add(\"User works at Meta\", tags=[\"career\"], category=\"career\", entity=\"Meta\")\n# ^ Google memory auto-superseded\n\n# Namespace isolation\nmem.add(\"Team standup at 9am\", namespace=\"team:alpha\")\nresults = mem.recall(\"standup time\", namespace=\"team:alpha\")\n\n# Maintenance\nmem.forget(memory_id)             # Archive\nmem.forget_before(\"2025-01-01\")   # Archive old memories\nmem.compact()                     # Permanently delete archived\nmem.export_json(\"backup.json\")    # Export\nmem.import_json(\"backup.json\")    # Import (dedup by content hash)\n\n# Health \u0026 stats\nmem.health()  # → {sqlite: ok, chroma: ok, networkx: ok, retrieval: ok}\nmem.stats()   # → {total: 500, active: 480, ...}\n\n# Context manager\nwith AgentMemory(\"./store\") as mem:\n    mem.add(\"auto-closed on exit\")\n```\n\n### Memory Object\n\n```python\n@dataclass\nclass Memory:\n    id: str                    # UUID\n    content: str               # The actual fact/observation\n    tags: List[str]            # Searchable tags\n    category: str              # Classification (preference, career, project, ...)\n    entity: str   
             # Primary entity (company, tool, person)\n    namespace: str             # Isolation key (default: \"default\")\n    created_at: str            # ISO timestamp\n    event_date: str            # When the fact occurred\n    valid_from: str            # Temporal validity start\n    valid_until: str           # Set when superseded\n    superseded_by: str         # ID of replacement memory\n    confidence: float          # 0.0-1.0\n    status: str                # active | superseded | archived\n    access_count: int          # Times recalled\n    last_accessed: str         # Last recall timestamp\n    content_hash: str          # SHA-256 for dedup\n    metadata: Dict[str, Any]   # Arbitrary JSON\n```\n\n---\n\n## Multi-Agent Support\n\n\u003cp align=\"center\"\u003e\n  \u003cimg src=\"docs/multi-agent-architecture.svg\" alt=\"Multi-Agent Memory Architecture\" width=\"100%\"\u003e\n\u003c/p\u003e\n\nFor multi-agent pipelines with provenance tracking, RBAC, and governance:\n\n```python\nfrom agent_memory.context import AgentContext, AgentRole, Visibility\n\n# Create a root context\nctx = AgentContext.from_env(\n    agent_id=\"orchestrator\",\n    namespace=\"project:acme\",\n    role=AgentRole.ORCHESTRATOR,\n    token_budget=20000,\n)\n\n# Spawn child contexts for sub-agents (immutable — returns new instance)\nplanner = ctx.as_agent(\"planner\", role=AgentRole.PLANNER, token_budget=5000)\nresearcher = ctx.as_agent(\"researcher\", role=AgentRole.RESEARCHER, read_only=True)\n\n# Provenance tracking — metadata auto-enriched\nplanner.add_memory(\"Architecture decision: use event sourcing\",\n                   category=\"technical\", visibility=Visibility.TEAM)\n# metadata includes: _agent_id, _session_id, _namespace, _visibility, _role\n\n# Recall is scoped to namespace + cached within session\nresults = researcher.recall(\"architecture decisions\")\n\n# Token budget tracked\nprint(researcher.token_budget - researcher.token_budget_used)\n\n# 
Governance\nresearcher.flag_for_review(\"Need human approval for deployment plan\")\nresearcher.add_compliance_tag(\"SOC2\")\n\n# Session introspection\nsummary = ctx.session_summary()\n# → {agent_trail, memories_written, memories_recalled, token_usage, review_flags}\n```\n\n### AgentContext Features\n\n| Feature | Description |\n|---------|-------------|\n| **Namespace isolation** | Each agent/project gets isolated memory partition |\n| **RBAC roles** | ORCHESTRATOR, PLANNER, EXECUTOR, RESEARCHER, REVIEWER, MONITOR |\n| **Read-only mode** | Agents can recall but not write |\n| **Write quotas** | `max_writes_per_agent` (default: 100) |\n| **Token budgets** | Per-agent budget tracking |\n| **Recall cache** | Dedup redundant queries within a session |\n| **Scratchpad** | Inter-agent data passing |\n| **Provenance** | Agent trail, parent tracking, visibility levels |\n| **Compliance** | Review flags, compliance tags for audit |\n| **Distributed mode** | Set `memory_url` to use HTTP client instead of local |\n\n---\n\n## Cloud Backends\n\nEach cloud backend fills all three roles (document, vector, graph) in a single service:\n\n### PostgreSQL (Neon, Cloud SQL, self-hosted)\n\nUses pgvector for vectors, Apache AGE for graph. AGE is optional — without it, graph gracefully degrades.\n\n```python\nmem = AgentMemory(\"./store\", config={\n    \"backends\": [\"postgres\"],\n    \"postgres\": {\"url\": \"postgresql://user:pass@host:5432/memwright\"}\n})\n```\n\n### ArangoDB (ArangoGraph Cloud, Docker)\n\nNative document, vector, and graph support in one database.\n\n```python\nmem = AgentMemory(\"./store\", config={\n    \"backends\": [\"arangodb\"],\n    \"arangodb\": {\"url\": \"https://instance.arangodb.cloud:8529\", \"database\": \"memwright\"}\n})\n```\n\n### Azure (Cosmos DB)\n\nCosmos DB with DiskANN vector indexing. 
Graph via NetworkX persisted to Cosmos containers.\n\n```python\nmem = AgentMemory(\"./store\", config={\n    \"backends\": [\"azure\"],\n    \"azure\": {\"cosmos_endpoint\": \"https://account.documents.azure.com:443/\"}\n})\n```\n\n### GCP (AlloyDB)\n\nExtends PostgreSQL backend with AlloyDB Connector (IAM auth) and Vertex AI embeddings (768D).\n\n```python\nmem = AgentMemory(\"./store\", config={\n    \"backends\": [\"gcp\"],\n    \"gcp\": {\"project_id\": \"my-project\", \"cluster\": \"memwright\", \"instance\": \"primary\"}\n})\n```\n\n### Installing cloud extras\n\n```bash\npoetry add \"memwright[postgres]\"    # PostgreSQL\npoetry add \"memwright[arangodb]\"    # ArangoDB\npoetry add \"memwright[aws]\"         # AWS (DynamoDB + OpenSearch + Neptune)\npoetry add \"memwright[azure]\"       # Azure Cosmos DB\npoetry add \"memwright[gcp]\"         # GCP AlloyDB + Vertex AI\npoetry add \"memwright[all]\"         # Everything\n```\n\n---\n\n## Cloud Deployment\n\nDeploy Memwright as an HTTP API on any cloud with a single command:\n\n```bash\n./scripts/deploy.sh aws        # App Runner (2 CPU / 4GB, auto-scale)\n./scripts/deploy.sh gcp        # Cloud Run (auto-scale 0–3, 2 CPU / 4GB)\n./scripts/deploy.sh azure      # Container Apps (scale-to-zero, 2 CPU / 4GB)\n\n./scripts/deploy.sh aws --teardown   # Destroy everything\n```\n\n**Prerequisites**: Docker, Terraform, cloud CLI (`aws`/`gcloud`/`az`), backend credentials in `.env`.\n\n| Cloud | Infrastructure | Terraform |\n|-------|---------------|-----------|\n| AWS | ECR + App Runner (2 CPU, 4GB) | `agent_memory/infra/apprunner/main.tf` |\n| GCP | Artifact Registry + Cloud Run (2 CPU, 4GB) | `agent_memory/infra/cloudrun/main.tf` |\n| Azure | ACR + Log Analytics + Container Apps (2 CPU, 4GB) | `agent_memory/infra/containerapp/main.tf` |\n\n### REST API Endpoints\n\nAll deployments expose the same Starlette ASGI API:\n\n| Method | Endpoint | Description |\n|--------|----------|-------------|\n| `GET` | `/health` | 
Component health check |\n| `GET` | `/stats` | Store statistics |\n| `POST` | `/add` | Add a memory |\n| `POST` | `/recall` | Smart retrieval with budget |\n| `POST` | `/search` | Filtered search |\n| `POST` | `/timeline` | Entity chronological history |\n| `POST` | `/forget` | Archive a memory |\n| `GET` | `/memory/{id}` | Get memory by ID |\n\nResponse envelope: `{\"ok\": true, \"data\": {...}}` or `{\"ok\": false, \"error\": \"message\"}`\n\n---\n\n## Embedding Providers\n\nMemwright auto-detects the best available embedding provider:\n\n| Priority | Provider | Model | Dimensions | Trigger |\n|----------|----------|-------|------------|---------|\n| 1 | Cloud-native | Bedrock Titan / Azure OpenAI / Vertex AI | 768-1536 | Cloud backend configured |\n| 2 | OpenAI / OpenRouter | text-embedding-3-small | 1536 | `OPENAI_API_KEY` or `OPENROUTER_API_KEY` set |\n| 3 | Local (default) | all-MiniLM-L6-v2 | 384 | Always available, no API key |\n\nThe local fallback downloads ~90MB on first use. 
All providers implement the same interface — switching is transparent.\n\n---\n\n## CLI Reference\n\nBoth `memwright` and `agent-memory` work as entry points:\n\n### MCP Server\n\n```bash\nmemwright mcp                          # Start MCP server (uses ~/.memwright)\nmemwright mcp --path /custom/path      # Custom store location\n```\n\n### Memory Operations\n\n```bash\nagent-memory add ./store \"User prefers Python\" --tags \"pref,coding\" --category preference --namespace default\nagent-memory recall ./store \"what language?\" --budget 4000 --namespace default\nagent-memory search ./store --category project --entity Python --namespace default --limit 20\nagent-memory list ./store --status active --category technical --namespace default\nagent-memory timeline ./store --entity Python --namespace default\nagent-memory update ./store \u003cmemory-id\u003e \"Updated content\" --tags \"new,tags\" --category technical\nagent-memory forget ./store \u003cmemory-id\u003e\n```\n\n### Maintenance\n\n```bash\nagent-memory doctor ~/.memwright       # Health check (SQLite, ChromaDB, NetworkX, Retrieval)\nagent-memory stats ./store             # Memory counts, DB size, breakdowns\nagent-memory export ./store -o backup.json\nagent-memory import ./store backup.json\nagent-memory compact ./store           # Permanently delete archived memories\nagent-memory inspect ./store           # Raw DB inspection\n```\n\n### Lifecycle Hooks (Claude Code)\n\n```bash\nmemwright hook session-start           # Inject context at session start\nmemwright hook post-tool-use           # Auto-capture tool observations\nmemwright hook stop                    # Generate session summary\n```\n\n### Benchmarks\n\n```bash\nagent-memory locomo --max-conversations 5 --verbose\nagent-memory mab --categories AR,CR --max-examples 10\n```\n\n---\n\n## Configuration\n\n### Store location\n\nDefault: `~/.memwright/`. 
Configurable with `--path` on any CLI command.\n\n```\n~/.memwright/\n├── memory.db        # SQLite database (core storage)\n├── config.json      # Retrieval tuning parameters\n├── graph.json       # NetworkX entity graph\n└── chroma/          # ChromaDB vector store + embeddings\n```\n\n### config.json\n\nAll fields optional. Defaults apply if the file doesn't exist:\n\n```json\n{\n  \"default_token_budget\": 16000,\n  \"min_results\": 3,\n  \"backends\": [\"sqlite\", \"chroma\", \"networkx\"],\n  \"enable_mmr\": true,\n  \"mmr_lambda\": 0.7,\n  \"fusion_mode\": \"rrf\",\n  \"confidence_gate\": 0.0,\n  \"confidence_decay_rate\": 0.001,\n  \"confidence_boost_rate\": 0.03\n}\n```\n\n| Parameter | Default | Description |\n|-----------|---------|-------------|\n| `default_token_budget` | 16000 | Max tokens returned per recall (start high, lower to tune) |\n| `min_results` | 3 | Minimum results to return |\n| `enable_mmr` | true | Maximal Marginal Relevance diversity reranking |\n| `mmr_lambda` | 0.7 | Relevance vs diversity balance (0=diverse, 1=relevant) |\n| `fusion_mode` | \"rrf\" | \"rrf\" (parameter-free) or \"graph_blend\" (weighted) |\n| `confidence_decay_rate` | 0.001 | Score penalty per hour since last access |\n| `confidence_boost_rate` | 0.03 | Score boost per access count |\n| `confidence_gate` | 0.0 | Minimum confidence threshold to include in results |\n\n### Environment Variables\n\n| Variable | Purpose |\n|----------|---------|\n| `MEMWRIGHT_PATH` | Default store path |\n| `MEMWRIGHT_URL` | Remote API URL (distributed mode) |\n| `MEMWRIGHT_NAMESPACE` | Default namespace |\n| `MEMWRIGHT_TOKEN_BUDGET` | Default token budget |\n| `MEMWRIGHT_SESSION_ID` | Session ID for provenance tracking |\n\n---\n\n## Testing\n\n### Running Tests\n\n```bash\n# All unit tests — no Docker, no API keys\npoetry run pytest tests/ -v\n\n# With coverage\npoetry run pytest tests/ -v --cov=agent_memory --cov-report=term-missing\n\n# Live integration tests (need 
credentials)\nNEON_DATABASE_URL='postgresql://...' poetry run pytest tests/test_postgres_live.py -v\nAZURE_COSMOS_ENDPOINT='https://...' poetry run pytest tests/test_azure_live.py -v\n```\n\n### Test Coverage\n\n- **607 unit tests** covering all backends, retrieval, config, embeddings, and CLI\n- **14 live integration tests** per cloud backend (Neon, Azure, ArangoDB)\n- **Mock tests** for every cloud backend — no cloud account needed\n- All unit tests run without Docker or API keys\n\n---\n\n## Benchmarks\n\n### Latency (P50 recall — the core operation)\n\n| Backend | Stack | P50 | P95 | P99 |\n|---|---|---|---|---|\n| **PG + pgvector + AGE (Docker)** | PostgreSQL 16 + pgvector + Apache AGE | **1.4ms** | **5.5ms** | **39ms** |\n| SQLite + ChromaDB + NetworkX (local) | SQLite 3 + ChromaDB 1.x + NetworkX 3 | 9.1ms | 31ms | 75ms |\n| ArangoDB (Docker) | ArangoDB 3.12 (doc + vector + graph) | 40ms | 57ms | 68ms |\n| GCP Cloud Run (us-central1) | Starlette + Uvicorn → ArangoDB Oasis | 156ms | 245ms | 271ms |\n| Azure Container Apps (eastus) | Starlette + Uvicorn → ArangoDB Oasis | 293ms | 466ms | 480ms |\n| AWS App Runner (us-west-2) | Starlette + Uvicorn → ArangoDB Oasis | 621ms | 792ms | 813ms |\n\n### vs. 
Competitors (recall P50)\n\n| System | Stack | P50 | Notes |\n|---|---|---|---|\n| **Memwright (PG Docker)** | PG 16 + pgvector + AGE | **1.4ms** | Full 3-layer pipeline, 81.2% LOCOMO |\n| Ruflo | In-process HNSW | 2-3ms | Vector lookup only, not full retrieval |\n| **Memwright (local)** | SQLite + ChromaDB + NX | **9.1ms** | Zero-config, no Docker, no API keys |\n| **Memwright (GCP Cloud Run)** | Starlette → ArangoDB Oasis | **156ms** | Full cloud API, scale-to-zero |\n| Mem0 | Cloud + LLM judge | 200ms | LLM in retrieval path |\n| Zep | Neo4j + embeddings | \u003c200ms | P95 ~632ms under concurrency |\n| Mem0 Graph | Cloud + LLM + graph | 660ms | Graph variant, much slower |\n\nFull results with add/search latency: [docs/LATENCY_BENCHMARKS.md](docs/LATENCY_BENCHMARKS.md)\n\n### LOCOMO (Long Conversation Memory)\n\n| System | Accuracy |\n|--------|----------|\n| MemMachine | 84.9% |\n| **Memwright** | **81.2%** |\n| Zep | ~75% |\n| Letta | 74.0% |\n| Mem0 (Graph) | 66.9% |\n| OpenAI Memory | 52.9% |\n\n*Scores are self-reported across vendors. [Methodology is disputed](https://blog.getzep.com/lies-damn-lies-statistics-is-mem0-really-sota-in-agent-memory/).*\n\nRetrieval is fully local — tag matching, graph traversal, vector search with RRF fusion. No LLM re-ranking. Only benchmark answer synthesis uses an LLM.\n\n---\n\n## Compatibility\n\n### MCP Clients\n\n| Client | Config File |\n|--------|-------------|\n| Claude Code | `.mcp.json` (project) or `~/.claude/.mcp.json` (global) |\n| Cursor | `.cursor/mcp.json` |\n| Windsurf | MCP config in settings |\n| Any MCP client | Standard MCP stdio transport |\n\nSame `memwright mcp` command. Same zero-config setup.\n\n### Python\n\n- Python 3.10, 3.11, 3.12, 3.13, 3.14\n\n---\n\n## Uninstall\n\n### 1. Remove MCP server config\n\nDelete the `memory` entry from `~/.claude/.mcp.json` (global) or `.mcp.json` (per-project).\n\n### 2. 
Uninstall the package\n\n```bash\n# Match your install method:\nuv tool uninstall memwright    # if installed with uv\npipx uninstall memwright       # if installed with pipx\npip uninstall memwright        # if installed with pip\npoetry remove memwright        # if installed with poetry\n```\n\n### 3. Delete stored memories (optional)\n\n```bash\n# Export first if you want a backup\nagent-memory export ~/.memwright -o memwright-backup.json\n\n# Then delete\nrm -rf ~/.memwright\n```\n\n---\n\n## License\n\nApache 2.0\n\n---\n\n\u003csub\u003emcp-name: io.github.bolnet/memwright\u003c/sub\u003e\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fbolnet%2Fagent-memory","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fbolnet%2Fagent-memory","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fbolnet%2Fagent-memory/lists"}