{"id":49535330,"url":"https://github.com/pitimon/ai-memory-patterns","last_synced_at":"2026-05-02T10:03:15.921Z","repository":{"id":346025690,"uuid":"1188298448","full_name":"pitimon/ai-memory-patterns","owner":"pitimon","description":"Production patterns for building AI agent memory systems — from the team behind MemForge","archived":false,"fork":false,"pushed_at":"2026-03-21T22:48:17.000Z","size":37,"stargazers_count":0,"open_issues_count":0,"forks_count":0,"subscribers_count":0,"default_branch":"main","last_synced_at":"2026-03-22T10:56:23.033Z","etag":null,"topics":[],"latest_commit_sha":null,"homepage":null,"language":null,"has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":null,"status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/pitimon.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":null,"code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null,"zenodo":null,"notice":null,"maintainers":null,"copyright":null,"agents":null,"dco":null,"cla":null}},"created_at":"2026-03-21T22:09:32.000Z","updated_at":"2026-03-21T22:48:21.000Z","dependencies_parsed_at":null,"dependency_job_id":null,"html_url":"https://github.com/pitimon/ai-memory-patterns","commit_stats":null,"previous_names":["pitimon/ai-memory-patterns"],"tags_count":null,"template":false,"template_full_name":null,"purl":"pkg:github/pitimon/ai-memory-patterns","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/pitimon%2Fai-memory-patterns","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/pitimon%2Fai-memory-patterns/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/pitimon%2Fai-memory-patterns/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/pitimon%2Fai-memory-patterns/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/pitimon","download_url":"https://codeload.github.com/pitimon/ai-memory-patterns/tar.gz/refs/heads/main","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/pitimon%2Fai-memory-patterns/sbom","scorecard":null,"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":286080680,"owners_count":32530178,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2026-05-02T01:12:54.858Z","status":"online","status_checked_at":"2026-05-02T02:00:05.923Z","response_time":132,"last_error":null,"robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":true,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":[],"created_at":"2026-05-02T10:03:05.957Z","updated_at":"2026-05-02T10:03:15.915Z","avatar_url":"https://github.com/pitimon.png","language":null,"funding_links":[],"categories":[],"sub_categories":[],"readme":"# AI Memory Patterns\n\n**Production patterns for building AI agent memory systems.**\n\nThis guide shares architectural patterns, design decisions, and lessons learned from building a production AI memory system (154K+ observations, multi-tenant, 3-node cluster) that competes with commercial solutions like Mem0 and Zep.\n\n**Not** a tutorial for using existing frameworks. This is a **builder's guide** — for engineers who want to understand _why_ certain patterns work and _what breaks_ when you skip them.\n\n## Who This Is For\n\n- Engineers building custom AI memory / RAG systems\n- Teams evaluating build-vs-buy for persistent agent memory\n- Architects designing multi-tenant AI platforms\n- Anyone frustrated by \"just use vector search\" advice\n\n## Architecture Overview\n\n```\n┌─────────────────────────────────────────────────────────────────────┐\n│                        QUERY LAYER                                  │\n│                                                                     │\n│  User Query ──► Intent Detection ──► Complexity Analysis            │\n│                      │                      │                       │\n│              ┌───────┴───────┐      ┌───────┴───────┐              │\n│              │ Weight Map    │      │ Retrieval     │              │\n│              │ factual: 0.7F │      │ Params        │              │\n│              │ relat:   0.7V │      │ limit/rerank  │              │\n│              └───────┬───────┘      └───────┬───────┘              │\n│                      └──────────┬───────────┘                       │\n│                                 ▼                                   │\n│  ┌──── Signal 1 ────┐  ┌── Signal 2 ──┐  ┌──── Signal 3 ────┐     │\n│  │ FTS (keywords)   │  │ Vector (768d) │  │ Graph (concepts) │     │\n│  │ LIKE %query%     │  │ HNSW cosine   │  │ 1-2 hop traverse │     │\n│  └────────┬─────────┘  └──────┬────────┘  └────────┬─────────┘     │\n│           └──────────────┬────┴─────────────────────┘               │\n│                          ▼                                          │\n│                   RRF Fusion (k=60)                                 │\n│              combined = Wv·vec + Wf·fts + 0.15·graph               │\n│                          │                                          │\n│                          ▼                                          │\n│                 Importance-weighted ranking                         │\n│              score = similarity·0.7 + importance·0.3               │\n│                          │                                          │\n│                          ▼                                          │\n│                   Ranked Results + Hints                            │\n└─────────────────────────────────────────────────────────────────────┘\n                           │\n          ┌────────────────┼────────────────┐\n          ▼                ▼                ▼\n┌──── STORAGE ─────┐ ┌── GRAPH ──┐ ┌──── CACHE ──────┐\n│ PostgreSQL       │ │ Memgraph  │ │ Redis           │\n│ + pgvector HNSW  │ │ Cypher    │ │ TTL + Pub/Sub   │\n│ halfvec(768)     │ │ In-memory │ │ Bloom filter    │\n│ Schema-per-user  │ │           │ │ LRU embedding   │\n└──────────────────┘ └───────────┘ └─────────────────┘\n                           │\n          ┌────────────────┼────────────────┐\n          ▼                ▼                ▼\n┌── AI WORKERS ────┐ ┌── SCORING ──┐ ┌── CURATION ───┐\n│ Embedding (5s)   │ │ 5-Factor    │ │ Pin / Unpin   │\n│ Observer (bg)    │ │ Importance  │ │ Contradict    │\n│ Compression (1m) │ │ type: 30%   │ │ Drift check   │\n│ Graph sync (5m)  │ │ recency:25% │ │ Set importance│\n│ LLM: 4-provider  │ │ access: 20% │ │ Event date    │\n│  fallback chain  │ │ ref:    15% │ │               │\n│  + circuit break │ │ content:10% │ │               │\n└──────────────────┘ └─────────────┘ └───────────────┘\n```\n\n## Patterns\n\n### 1. Search Architecture\n\n| Pattern                                                         | File | Key Insight                                                                      |\n| --------------------------------------------------------------- | ---- | -------------------------------------------------------------------------------- |\n| [3-Signal Hybrid Search](patterns/01-hybrid-search.md)          | `01` | Vector-only misses exact keywords. FTS-only misses semantics. Combine with RRF.  |\n| [Intent-Based Weight Adaptation](patterns/02-intent-weights.md) | `02` | Factual queries need more FTS. Relational queries need more vector. Auto-detect. |\n| [Query Complexity Analyzer](patterns/03-query-complexity.md)    | `03` | Simple queries need 10 results. Complex queries need 50 + reranking.             |\n\n### 2. Data Architecture\n\n| Pattern                                                         | File | Key Insight                                                                   |\n| --------------------------------------------------------------- | ---- | ----------------------------------------------------------------------------- |\n| [Schema-Per-User Multi-Tenancy](patterns/04-schema-per-user.md) | `04` | Row-level filtering leaks data when you forget WHERE. Schema isolation can't. |\n| [Importance Scoring](patterns/05-importance-scoring.md)         | `05` | Not all memories matter equally. 5-factor model beats recency-only.           |\n| [Knowledge Graph Integration](patterns/06-knowledge-graph.md)   | `06` | Graph adds value for multi-hop questions. Not worth it for simple recall.     |\n\n### 3. AI Infrastructure\n\n| Pattern                                                    | File | Key Insight                                                                             |\n| ---------------------------------------------------------- | ---- | --------------------------------------------------------------------------------------- |\n| [LLM Provider Fallback Chain](patterns/07-llm-fallback.md) | `07` | Free providers first, paid fallback. Circuit breaker per provider.                      |\n| [Embedding Strategy](patterns/08-embedding-strategy.md)    | `08` | 768d is enough. Float16 quantization saves 50% storage. Don't over-engineer dimensions. |\n| [Background Worker Architecture](patterns/09-workers.md)   | `09` | 15 workers, self-healing, batch processing. Don't embed synchronously.                  |\n\n### 4. Quality \u0026 Operations\n\n| Pattern                                              | File | Key Insight                                                                 |\n| ---------------------------------------------------- | ---- | --------------------------------------------------------------------------- |\n| [Memory Curation Tools](patterns/10-curation.md)     | `10` | Pin, contradict, drift-check. Memory rots without active curation.          |\n| [Benchmarking (LoCoMo)](patterns/11-benchmarking.md) | `11` | How to evaluate memory systems fairly. Methodology matters more than score. |\n| [MCP Plugin Design](patterns/12-mcp-plugin.md)       | `12` | Workflow hints \u003e more tools. Descriptions guide LLM tool selection.         |\n\n## Anti-Patterns\n\n| Anti-Pattern            | Why It Fails                                    | Better Alternative                      |\n| ----------------------- | ----------------------------------------------- | --------------------------------------- |\n| Vector-only search      | Misses exact keywords, no project isolation     | 3-signal RRF (Pattern 1)                |\n| Synchronous embedding   | Blocks write path, slow ingestion               | Background worker queue (Pattern 9)     |\n| No importance scoring   | All memories treated equal, noise drowns signal | 5-factor model (Pattern 5)              |\n| No project filter       | Cross-project contamination in search           | Server-side SQL filter                  |\n| Huge embeddings (3072d) | 4x storage, marginal quality gain               | 768d + Float16 (Pattern 8)              |\n| No curation tools       | Memory quality degrades over time               | Pin/contradict/drift-check (Pattern 10) |\n\n## Benchmark Results\n\nEvaluated against [LoCoMo](https://github.com/snap-research/locomo) benchmark (the standard for AI memory systems):\n\n| System         | Score       | Approach                 |\n| -------------- | ----------- | ------------------------ |\n| Backboard      | 90.0%       | Gemini 2.5 Pro + custom  |\n| Zep            | 75.1%       | Temporal knowledge graph |\n| Mem0           | 62.5%       | Vector + optional graph  |\n| **Our system** | **57.3%\\*** | 3-signal RRF + FTS       |\n\n\\*Pilot on 1 conversation. Optimization roadmap targets 66-72%.\n\n## Decision Matrices\n\n### Build vs Buy\n\n| Factor           | Build (like this guide)                            | Buy (Mem0/Zep)                        |\n| ---------------- | -------------------------------------------------- | ------------------------------------- |\n| Data sovereignty | Full control                                       | Vendor-dependent                      |\n| Customization    | Unlimited                                          | API constraints                       |\n| Cost at scale    | Infrastructure only                                | $19-$249/mo + usage                   |\n| Time to market   | Weeks-months                                       | Hours-days                            |\n| Maintenance      | Your team                                          | Vendor handles                        |\n| **Best for**     | **Production platforms, compliance, custom needs** | **MVPs, startups, rapid prototyping** |\n\n### Search Strategy\n\n| Query Type          | Best Approach                         | Why                                  |\n| ------------------- | ------------------------------------- | ------------------------------------ |\n| Exact fact recall   | FTS-heavy (70% FTS, 30% vector)       | Keywords matter for facts            |\n| Semantic similarity | Vector-heavy (70% vector, 30% FTS)    | Meaning matters for concepts         |\n| Time-based          | Temporal query with date parsing      | \"yesterday\", \"last week\"             |\n| Multi-hop reasoning | Graph traversal + vector              | Connect entities across observations |\n| General question    | Balanced hybrid (50/50) + graph (15%) | Best default                         |\n\n## Technology Stack (Reference Implementation)\n\n```\nRuntime:     Bun (TypeScript)\nDatabase:    PostgreSQL 17 + pgvector (HNSW, halfvec)\nGraph:       Memgraph (Cypher, in-memory)\nCache:       Redis 7 (TTL, Pub/Sub, Bloom filter)\nEmbedding:   Gemini embedding-001 (768d) → OpenRouter fallback\nLLM:         vLLM → OpenRouter → Google AI → Ollama\nDeploy:      Docker Swarm (3-node HA) with NFS\nMonitoring:  Prometheus + Grafana + Alertmanager\nClient:      MCP plugin for Claude Code (27 tools)\n```\n\n## Contributing\n\nFound a pattern that worked (or broke) in your memory system? PRs welcome.\n\n## License\n\nMIT\n\n## Acknowledgements\n\nPatterns extracted from [MemForge](https://github.com/pitimon/memforge) — a production AI memory system serving 154K+ observations with multi-tenant architecture.\n\nInspired by the work of [Mem0](https://mem0.ai), [Zep/Graphiti](https://getzep.com), and the [LoCoMo benchmark](https://github.com/snap-research/locomo) research.\n\n---\n\n_Last verified: 2026-03-22 | Version: 1.0_\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fpitimon%2Fai-memory-patterns","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fpitimon%2Fai-memory-patterns","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fpitimon%2Fai-memory-patterns/lists"}