{"id":50943494,"url":"https://github.com/tfatykhov/awesome-agent-memory","last_synced_at":"2026-06-17T17:35:38.337Z","repository":{"id":360191916,"uuid":"1194694074","full_name":"tfatykhov/awesome-agent-memory","owner":"tfatykhov","description":"Curated research on memory systems for LLM agents","archived":false,"fork":false,"pushed_at":"2026-05-25T10:56:30.000Z","size":45,"stargazers_count":5,"open_issues_count":0,"forks_count":0,"subscribers_count":0,"default_branch":"main","last_synced_at":"2026-05-25T12:30:20.391Z","etag":null,"topics":["agent-memory","agentic-ai","awesome-list","cognitive-architecture","llm-agents","memory-systems","neuromorphic"],"latest_commit_sha":null,"homepage":"","language":null,"has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":null,"status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/tfatykhov.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":null,"code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null,"zenodo":null,"notice":null,"maintainers":null,"copyright":null,"agents":"AGENTS.md","dco":null,"cla":null}},"created_at":"2026-03-28T17:35:02.000Z","updated_at":"2026-05-25T11:43:08.000Z","dependencies_parsed_at":null,"dependency_job_id":null,"html_url":"https://github.com/tfatykhov/awesome-agent-memory","commit_stats":null,"previous_names":["tfatykhov/awesome-agent-memory"],"tags_count":null,"template":false,"template_full_name":null,"purl":"pkg:github/tfatykhov/awesome-agent-memory","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/tfatykhov%2Fawesome-agent-memory","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/tfatykhov%2Fawesome-agent-memory/tags","releases_url":"https://repos.ecosyste.ms/api
/v1/hosts/GitHub/repositories/tfatykhov%2Fawesome-agent-memory/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/tfatykhov%2Fawesome-agent-memory/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/tfatykhov","download_url":"https://codeload.github.com/tfatykhov/awesome-agent-memory/tar.gz/refs/heads/main","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/tfatykhov%2Fawesome-agent-memory/sbom","scorecard":null,"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":286080680,"owners_count":34459608,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2026-05-26T15:22:16.424Z","status":"online","status_checked_at":"2026-06-17T02:00:05.408Z","response_time":127,"last_error":null,"robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":true,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["agent-memory","agentic-ai","awesome-list","cognitive-architecture","llm-agents","memory-systems","neuromorphic"],"created_at":"2026-06-17T17:35:37.801Z","updated_at":"2026-06-17T17:35:38.327Z","avatar_url":"https://github.com/tfatykhov.png","language":null,"funding_links":[],"categories":[],"sub_categories":[],"readme":"# 🧠 Awesome Agent Memory\n\n[![Awesome](https://awesome.re/badge.svg)](https://awesome.re)\n\nA curated collection of research papers on **memory systems for LLM-based agents** — covering architecture, retrieval, forgetting, consolidation, evaluation, the cognitive science that inspires it all, and the neuromorphic hardware that may one day 
implement it.\n\n\u003e **Why this list?** Every agent framework bolts on a vector store and calls it \"memory.\" These papers show what memory *actually* requires: admission control, consolidation loops, forgetting mechanisms, typed multi-store architectures, and retrieval strategies that go far beyond cosine similarity.\n\n**Maintained by** [@tfatykhov](https://github.com/tfatykhov) · Built alongside [Nous](https://github.com/tfatykhov/nous), a cognitive AI agent with typed multi-store memory.\n\n\n### 🚀 Start Here\n\n| If you want to... | Start with |\n|---|---|\n| **Understand the landscape** | [Memory Survey](https://arxiv.org/abs/2512.13564) (taxonomy) → [Anatomy](https://arxiv.org/abs/2602.19320) (critical analysis) |\n| **Build agent memory** | [xMemory](https://arxiv.org/abs/2602.02007) (why RAG isn't enough) → [A-MAC](https://arxiv.org/abs/2603.04549) (admission control) → [Mem0](https://arxiv.org/abs/2504.19413) (production system) |\n| **Add forgetting/consolidation** | [SleepGate](https://arxiv.org/abs/2603.14517) → [CraniMem](https://arxiv.org/abs/2603.15642) |\n| **Evaluate your memory system** | [StructMemEval](https://arxiv.org/abs/2602.11243) → [Anatomy](https://arxiv.org/abs/2602.19320) (why benchmarks are broken) |\n| **Train memory with RL** | [AgeMem](https://arxiv.org/abs/2601.01885) (tool-based ops) → [EMPO²](https://arxiv.org/abs/2602.23008) (exploration) → [MemFactory](https://arxiv.org/abs/2603.29493) (framework) |\n| **Understand memory privacy risk** | [ADAM](https://arxiv.org/abs/2604.09747) (extraction attack) → [FSFM](https://arxiv.org/abs/2604.20300) (forgetting as defense) |\n| **Unify memory ↔ skills ↔ rules** | [Experience Compression Spectrum](https://arxiv.org/abs/2604.15877) → [Externalization](https://arxiv.org/abs/2604.08224) |\n| **Explore the neuroscience** | [Neuromorphic section](#neuromorphic--bio-inspired-memory) |\n\n---\n\n## Contents\n\n- [Surveys \\\u0026 Taxonomies](#surveys--taxonomies)\n- [Memory 
Architectures](#memory-architectures)\n- [Memory Admission \\\u0026 Gating](#memory-admission--gating)\n- [Retrieval \\\u0026 Recall](#retrieval--recall)\n- [Forgetting \\\u0026 Consolidation](#forgetting--consolidation)\n- [RL-Based Memory Policy](#rl-based-memory-policy)\n- [Context Management](#context-management)\n- [Evaluation \\\u0026 Benchmarks](#evaluation--benchmarks)\n- [Cognitive \\\u0026 Neuroscience-Inspired](#cognitive--neuroscience-inspired)\n- [Skill \\\u0026 Procedural Memory](#skill--procedural-memory)\n- [Multi-Agent Memory](#multi-agent-memory)\n- [Memory Security \\\u0026 Privacy](#memory-security--privacy)\n- [Memory Economics](#memory-economics)\n- [Neuromorphic \\\u0026 Bio-Inspired Memory](#neuromorphic--bio-inspired-memory)\n- [Adjacent Research](#adjacent-research)\n- [Key Themes \\\u0026 Cross-Cutting Insights](#key-themes--cross-cutting-insights)\n- [Contributing](#contributing)\n\n### Verification Legend\n\n| Symbol | Meaning |\n|--------|---------|\n| ✅ | **Venue confirmed** — verified via official proceedings, OpenReview, or arXiv metadata |\n| 📄 | **Self-reported** — metrics from authors' own evaluation; exercise caution |\n| ⚠️ | **Unverified** — plausible claim but not independently confirmed |\n| 🔬 | **Editor's synthesis** — cross-paper connection identified by the maintainer, not claimed by original authors |\n\n---\n\n## Surveys \u0026 Taxonomies\n\n| Paper | Authors | Date | Key Contribution |\n|-------|---------|------|------------------|\n| [Memory in the Age of AI Agents: A Survey](https://arxiv.org/abs/2512.13564) | 47 authors | Dec 2025 | Unified taxonomy across 3 lenses: **Forms** (token/parametric/latent), **Functions** (factual/experiential/working), **Dynamics** (formation/evolution/retrieval). Distinguishes agent memory from RAG and context engineering. Hugging Face Daily Paper #1 ⚠️ (unverifiable — HF archive page returns 404). 
|\n| [Anatomy of Agentic Memory](https://arxiv.org/abs/2602.19320) | Jiang, Li, Wei et al. | Feb 2026 | Structured taxonomy of Memory-Augmented Generation (MAG) systems. Critically analyzes empirical fragility: benchmark saturation, metric misalignment, backbone-dependent variance, overlooked latency costs. Evaluates 5 systems (LOCOMO, A-Mem, MemoryOS, Nemori, MAGMA). |\n| [The Landscape of Agentic RL for LLMs](https://arxiv.org/abs/2509.02547) | Guibin Zhang et al. (Oxford, Shanghai AI Lab, NUS, UIUC, UCL) | Sep 2025 | Synthesizes 500+ works. Reframes LLMs as autonomous agents using POMDPs. Covers planning, tool use, memory, reasoning, self-improvement. |\n| [From Static Templates to Dynamic Runtime Graphs](https://arxiv.org/abs/2603.22386) | Yue et al. (IBM Research) | Mar 2026 | Agentic Computation Graphs (ACGs) framework — distinguishes workflow templates, realized graphs, and execution traces. Organizes ~40 papers by when structure is determined. |\n| [Memory for Autonomous LLM Agents: Mechanisms, Evaluation, and Emerging Frontiers](https://arxiv.org/abs/2603.07670) | Pengfei Du | Mar 2026 | Write–manage–read loop formalization. 3D taxonomy: temporal scope × representational substrate × control policy. Five mechanism families: context-resident compression, retrieval-augmented stores, reflective self-improvement, hierarchical virtual context, policy-learned management. Proposes vision of a \"foundation model for memory control.\" |\n| [Externalization in LLM Agents: A Unified Review of Memory, Skills, Protocols and Harness Engineering](https://arxiv.org/abs/2604.08224) | Zhou, Chai, Chen, Guo, Shan, Song, Xu et al. | Apr 2026 | Unified review across **four externalized components**: memory stores, reusable skills, interaction protocols, runtime harness. Argues modern agent capability comes from reorganizing the runtime around the model, not from changing weights. 
Connects memory literature with skill-discovery and protocol-design literatures that rarely cite each other. |\n| [Experience Compression Spectrum: Unifying Memory, Skills, and Rules in LLM Agents](https://arxiv.org/abs/2604.15877) | Zhang, Wang, Cui, Qiu, Li, Zhu, He | Apr 2026 | Positions memory/skills/rules on a single compression axis (5–20× episodic, 50–500× procedural, 1000×+ declarative). Citation analysis of 1,136 references finds **\u003c1% cross-community citation** between memory and skills literatures. Identifies the **\"missing diagonal\"** — no current system supports adaptive cross-level compression. |\n| [From Storage to Experience: A Survey on the Evolution of LLM Agent Memory Mechanisms](https://arxiv.org/abs/2605.06716) | Luo, Tian, Cao, Luo et al. | May 2026 | **ACL 2026 Findings** ✅ — Maps the field's evolution from passive storage toward experience-centric, self-evolving memory. Companion paper-list: [Evolving-LLM-Agent-Memory-Survey](https://github.com/FeishuLuo/Evolving-LLM-Agent-Memory-Survey). |\n| [Graph-based Agent Memory: Taxonomy, Techniques, and Applications](https://arxiv.org/abs/2602.05665) | Yang, Zhou, Xiao, Dong et al. | Feb 2026 | First focused taxonomy of **graph-based** agent memory: node/edge designs, construction strategies (entity-centric vs event-centric vs hybrid), retrieval (one-hop vs multi-hop vs subgraph), and update/forget operators. Useful companion to A-MEM, HeLa-Mem, GAM. |\n\n## Memory Architectures\n\n| Paper | Authors | Date | Key Contribution |\n|-------|---------|------|------------------|\n| [A-MEM: Agentic Memory for LLM Agents](https://arxiv.org/abs/2502.12110) | Xu, Liang, Mei, Gao, Tan, Zhang | Feb 2025 | Dynamic memory organization inspired by Zettelkasten. Creates interconnected \"memory notes\" with content, metadata, and explicit links. Agent autonomously decides when to create, update, or link memories. NeurIPS 2025 ✅. 
|\n| [SYNAPSE: Episodic-Semantic Memory via Spreading Activation](https://arxiv.org/abs/2601.02744) | Jiang, Chen, Pan et al. | Jan 2026 | Memory as a dynamic graph where relevance emerges from **spreading activation** rather than pre-computed vector links. Lateral inhibition and temporal decay highlight relevant sub-graphs. Triple Hybrid retrieval. |\n| [MAGMA: Multi-Graph Agentic Memory Architecture](https://arxiv.org/abs/2601.03236) | Jiang, Li, Li, Li | Jan 2026 | Multi-graph based architecture separating different memory concerns into distinct graph structures for AI agents. |\n| [Mem0: Production-Ready AI Agents with Scalable Long-Term Memory](https://arxiv.org/abs/2504.19413) | Chhikara, Khant, Aryan, Singh, Yadav | Apr 2025 | Scalable memory-centric architecture with dynamic extraction, consolidation, and retrieval. Graph-based variant for complex relational structures. 26% improvement over OpenAI memory, 91% lower latency, 90%+ token savings 📄 (self-reported). Evaluated on LOCOMO. |\n| [CraniMem: Neurocognitively Motivated Gated \u0026 Bounded Memory](https://arxiv.org/abs/2603.15642) | Mody, Panchal, Kar, Bhowmick, Karani | Mar 2026 | Cranial-inspired dual-store: bounded FIFO episodic buffer + knowledge graph (long-term). Goal-conditioned gating. Utility tagging (Importance + Surprise + Emotion). Beats Mem0 by +57.6% on noisy HotpotQA 📄 (self-reported; \"noisy\" = authors' own distractor injection, not standard benchmark). ICLR 2026 MemAgents Workshop ✅. [Code](https://github.com/PearlMody05/Cranimem) |\n| [AdaMem: Adaptive User-Centric Memory](https://arxiv.org/abs/2603.16496) | — | Mar 2026 | Working, episodic, persona, and graph memories. Key innovation: **question-conditioned retrieval planning** — resolves target participant first, builds retrieval route combining semantic + relation-aware graph expansion. SOTA on LoCoMo and PERSONAMEM. 
|\n| [PlugMem: A Task-Agnostic Plugin Memory Module](https://arxiv.org/abs/2603.03296) | Yang, Galley, Wang, Gao, Han, Zhai (Microsoft Research, UIUC) | Mar 2026 | Converts raw interactions into **propositional** (facts) + **prescriptive** (skills) knowledge, organized in a knowledge-centric memory graph. Knowledge — not entities or chunks — is the unit of memory access. Single plug-and-play module outperforms task-specific designs across 3 diverse benchmarks. Highest **information density**: more useful info per context token consumed 📄. |\n| [Memora: Harmonic Memory Representation](https://arxiv.org/abs/2602.03315) | Xia, Zhang, Dixit, Harimurugan, Wang, Ruhle, Sim, Bansal, Rajmohan (Microsoft Research) | Feb 2026 | Proves that **RAG and KG memory are special cases** of this unified framework. Primary abstractions index concrete memory values; \"cue anchors\" expand retrieval beyond semantic similarity. New SOTA on **both** LoCoMo AND LongMemEval 📄. |\n| [MemMA: Coordinating the Memory Cycle through Multi-Agent Reasoning](https://arxiv.org/abs/2603.18718) | Lin, Zhang, Lu, Liu, Tang, He, Zhang, Wang (Microsoft Research) | Mar 2026 | Multi-agent framework: Meta-Thinker → Memory Manager → Query Reasoner. **Backward path innovation**: synthesizes probe QA pairs, verifies memory, converts failures into repairs BEFORE finalizing. Plug-and-play — improves 3 different storage backends on LoCoMo. [Code](https://github.com/MinhuaLin/MemMA) |\n| [H-Mem: Hybrid Multi-Dimensional Memory](https://aclanthology.org/2026.eacl-long.363/) | Ye, Huang, Chen, Zhang (Rutgers) | Mar 2026 | Organizes memory across **time AND topic** dimensions simultaneously. Mimics associative + hierarchical properties of human memory. EACL 2026 ✅. |\n| [H-MEM: Hierarchical Memory for High-Efficiency Long-Term Reasoning](https://arxiv.org/abs/2507.22925) | Sun et al. | Mar 2026 | Multi-level memory storage with positional index encoding of sub-memory at each layer. 
Confidence-weighted retrieval — attaches memory weights to provide LLMs with uncertainty reference. Key finding: retrieval is ineffective without structured hierarchical storage. EACL 2026 ✅. |\n| [GAM: Hierarchical Graph-based Agentic Memory for LLM Agents](https://arxiv.org/abs/2604.12285) | Wu, Zhang, Lin, Xu, Xu, Chen, Zou et al. | Apr 2026 | Explicitly **decouples encoding from consolidation**: an event-progression graph captures stream updates online; integration into the topic-associative network is deferred until a semantic shift is detected. Addresses the tension between stream-based fluidity and structured retention. |\n| [HeLa-Mem: Hebbian Learning and Associative Memory for LLM Agents](https://arxiv.org/abs/2604.16839) | Zhu, Li, Zhang, Liu, Yang | Apr 2026 | **ACL 2026** ✅ — Bio-inspired dual-graph: (1) episodic graph evolves via Hebbian co-activation; (2) semantic store populated via **Hebbian Distillation** — a Reflective Agent identifies densely-connected hubs and distills them into reusable semantic knowledge. Beats prior SOTA on LoCoMo across 4 categories with fewer context tokens 📄. [Code](https://github.com/ReinerBRO/HeLa-Mem) |\n| [LightMem: Lightweight LLM Agent Memory with Small Language Models](https://arxiv.org/abs/2604.07798) | Zhang, Zhang, Chen, Huang, Zheng et al. | Apr 2026 | **ACL 2026** ✅ — SLM-driven memory with strict online/offline separation. STM/MTM/LTM tiers with two-stage retrieval (vector coarse → semantic re-rank). **~2.5 F1 over A-MEM on LoCoMo, 83ms retrieval, 581ms end-to-end** 📄. Shows that careful SLM use can replace repeated large-model memory calls. |\n| [MemMachine: Ground-Truth-Preserving Memory for Personalized AI Agents](https://arxiv.org/abs/2604.04853) | Wang, Yu, Love, Zhang, Wong, Scargall, Fan et al. | Apr 2026 | Open-source system integrating short-term, long-term episodic, and profile memory in a ground-truth-preserving pipeline. Targets multi-session degradation in standard RAG. 
|\n| [Omni-SimpleMem: Autoresearch-Guided Lifelong Multimodal Memory](https://arxiv.org/abs/2604.01007) | Liu, Ling, Qiu, Liu, Han, Xia, Tu et al. | Apr 2026 | Uses **autonomous research-agent search** over the design space (architecture × retrieval × prompts × data pipeline) to discover effective lifelong multimodal memory configurations. The first paper to treat memory architecture itself as a search target. |\n| [Human-Inspired Memory Architecture for LLM Agents](https://arxiv.org/abs/2605.08538) | Kerestecioglu, Robsky, Vasters, Sharma, Kesselman (Microsoft) | May 2026 | Biologically-grounded architecture with **six cognitive mechanisms**: sleep-phase consolidation, interference-based forgetting, **engram maturation**, **reconsolidation on retrieval**, entity knowledge graphs, hybrid multi-cue retrieval. Introduces a synthetic calibration methodology that derives all thresholds *without* benchmark exposure — eliminates a common eval-leakage source. First streaming M-tier LongMemEval eval (475 sessions). Dedup-based consolidation: 97.2% retention precision, 58% store reduction (+21.8 pp) on a 13K-issue VSCode dataset 📄. |\n| [MRAgent: Memory is Reconstructed, Not Retrieved — Graph Memory for LLM Agents](https://arxiv.org/abs/2606.06036) | Ji, Li, Hooi (NUS) | Jun 2026 | **ICML 2026** ✅ — Represents memory as a **Cue-Tag-Content graph** where associative tags bridge fine-grained cues to contents. An **active reconstruction** mechanism folds LLM reasoning *into* memory access — iteratively exploring and pruning retrieval paths on accumulated evidence, adapting retrieval to the reasoning context while avoiding combinatorial blowup. Up to **23%** over strong baselines on LoCoMo + LongMemEval with substantially lower token/runtime cost 📄. The cleanest articulation of the **\"reconstruction ≠ retrieval\"** paradigm. 
|\n| [MemPro: Agentic Memory Systems as Evolvable Programs](https://arxiv.org/abs/2606.00619) | Liu, Wang, Wu, Huang, Tao, Song, Zhou, He (ECNU) | May 2026 | Treats the **entire** memory construction-retrieval (MCR) pipeline as an **evolvable program**, not just the memory bank. Maintains a version tree of runnable memory-system implementations; an Evolving Agent selects promising versions, diagnoses recurring failures, and synthesizes improved children via failure-mode-guided edit-debug refinement. Beats static and prompt-level evolving baselines on LongMemEval / LoCoMo / HotpotQA / NarrativeQA within a few iterations, with favorable performance-cost trade-off 📄. Code available. ⚠️ Preprint; venue unconfirmed. |\n| [MemDreamer: Hierarchical Graph Memory with Agentic Retrieval for Long Video](https://arxiv.org/abs/2606.07512) | — | Jun 2026 | Extends agent memory to the **multimodal long-video** setting: a hierarchical graph memory plus agentic retrieval over hours-long video, exploiting the tree+graph structure for question answering. ⚠️ Preprint; venue unconfirmed. |\n| [H-Mem (Yu \u0026 Fang): Evolving \u0026 Retrieving Agent Memory via a Hybrid Structure](https://arxiv.org/abs/2605.15701) | Yu, Fang, Liu, Ma (CUHK-Shenzhen / Huawei) | May 2026 | **Distinct from the EACL H-Mem (Rutgers) and the Sun et al. H-MEM listed above** — a *third* system sharing the name. Hybrid **tree + graph**: a temporal-semantic tree lets short-term memory evolve into a summarizing long-term store, while a parallel knowledge graph captures entity relationships; retrieval exploits both. SOTA on QA across three agent-memory benchmarks 📄. ⚠️ Preprint; venue unconfirmed. |\n\n## Memory Admission \u0026 Gating\n\n| Paper | Authors | Date | Key Contribution |\n|-------|---------|------|------------------|\n| [A-MAC: Adaptive Memory Admission Control](https://arxiv.org/abs/2603.04549) | Zhang et al. 
(Workday AI) | Sep 2025 | 5-dimension scoring: Utility (LLM call), Confidence (ROUGE-L grounding), Novelty (1 - max cosine similarity), Recency, Type Prior. Learned weights vs fixed heuristics. |\n| [ACC: Agent Cognitive Compressor](https://arxiv.org/abs/2601.11653) | Bousetouane | Jan 2026 | Bio-inspired memory controller replacing transcript replay with bounded internal state updated online. Compresses agent cognitive state without losing decision-relevant context. |\n| [MemReader: From Passive to Active Extraction for Long-Term Agent Memory](https://arxiv.org/abs/2604.07877) | Kang, Li, Chen, Tang, Xiong, Li | Apr 2026 | First **RL-trained active extraction policy** (vs passive transcription). MemReader-4B uses GRPO + ReAct to evaluate value/ambiguity/completeness, then chooses **WRITE / DEFER / RETRIEVE-CONTEXT / DISCARD** before admission. Integrated into MemOS. Addresses memory pollution from noisy dialogue and cross-turn dependencies. |\n| [Personalize-then-Store: Benchmarking \u0026 Learning Personalized Memory for Long-horizon Agents](https://arxiv.org/abs/2605.25535) | In, Kim, Park, Yoon, Park (KAIST) | May 2026 | Argues universal **static** storage policies waste budget on transient sessions while dropping critical long-horizon context. Introduces **PerMemBench** (first benchmark for *personalized* memory policies, with multi-year multi-domain personas) and **session-level storage gating** — a lightweight per-user admission filter that bypasses memory ops for transient sessions. Personalization yields large retention gains under perfect gating; accurate gating remains the open challenge 📄. ⚠️ Preprint. |\n\n## Retrieval \u0026 Recall\n\n| Paper | Authors | Date | Key Contribution |\n|-------|---------|------|------------------|\n| [xMemory: Beyond RAG for Agent Memory](https://arxiv.org/abs/2602.02007) | Hu, Zhu, Yan, He, Gui | Feb 2026 | 4-level memory hierarchy. 
**Key thesis: RAG ≠ agent memory.** Submodular diversity-aware retrieval (MMR) replacing naive top-k. Uncertainty-gated adaptive expansion. Theme clustering. 44.91% retroactive reassignment rate proves flat structures fail. ⚠️ Conference status unverified. |\n| [SuperLocalMemory V3](https://arxiv.org/abs/2603.14588) | — | Mar 2026 | First **information-geometric foundations** for agent memory. Fisher information metric replaces cosine similarity. Riemannian Langevin dynamics for retrieval. Theoretical grounding for memory operations. |\n| [ExpRAG: Retrieval-Augmented LLM Agents Learning to Learn from Experience](https://arxiv.org/abs/2603.18272) | Ferraz, Deffayet, Nikoulina, Déjean, Clinchant | Mar 2026 | Trains agents to USE retrieved trajectories in-context (retrieval-augmented fine-tuning). Standard LoRA collapses on OOD tasks; ExpRAG-LoRA generalizes to held-out hard tasks. Combines experience retrieval with fine-tuning — neither alone is sufficient. |\n| [To Know is to Construct: Schema-Constrained Generation for Agent Memory (SCG-MEM)](https://arxiv.org/abs/2604.20117) | Zheng, Song, Li, Yang | Apr 2026 | Replaces dense retrieval with **schema-constrained decoding**. Piaget-inspired: assimilation (grounding into existing schemas) vs accommodation (expanding schemas with novel concepts). Solves \"structural hallucination\" — LLMs generating references to non-existent memory keys. Constructivist alternative to similarity-based recall. |\n| [SAM: State-Adaptive Memory for Long-Horizon Agents](https://arxiv.org/abs/2605.24468) | — | May 2026 | Targets retrieval of **scattered information** over long horizons — adapting memory access to evolving task state rather than fixed top-k recall. ⚠️ Preprint; venue unconfirmed. 
|\n\n## Forgetting \u0026 Consolidation\n\n| Paper | Authors | Date | Key Contribution |\n|-------|---------|------|------------------|\n| [SleepGate: Sleep-Inspired Forgetting for LLMs](https://arxiv.org/abs/2603.14517) | Xie (Kennesaw State) | Mar 2026 | Learned sleep cycle over KV cache addressing **proactive interference**. Conflict-aware temporal tagger + forgetting gate + consolidation module. Reduces interference horizon from O(n) to O(log n). 99.5% retrieval accuracy at PI depth 5 vs 23% baseline 📄 (self-reported; extraordinary gap warrants independent replication). Supersession detection via binary σ flag. |\n| [SCM: Sleep-Consolidated Memory with Algorithmic Forgetting](https://arxiv.org/abs/2604.20943) | Shinde | Apr 2026 | Five components inspired by human memory: limited-capacity working memory, multi-dimensional importance tagging, **offline sleep-stage consolidation with distinct NREM and REM phases**, intentional value-based forgetting, and a computational self-model for introspection. Reports perfect 10-turn recall, **90.9% noise reduction**, sub-millisecond search 📄. Research preview. |\n| [FSFM: A Biologically-Inspired Framework for Selective Forgetting of Agent Memory](https://arxiv.org/abs/2604.20300) | Gu, Xiong, Wang, Ren, Li, Zhang, Guo et al. | Apr 2026 | Grounds selective forgetting in **hippocampal indexing/consolidation theory and the Ebbinghaus curve**. Argues that in resource-constrained environments a forgetting mechanism is as crucial as retention — and that memory security depends on the agent's ability to actively drop sensitive history. |\n\n## RL-Based Memory Policy\n\n| Paper | Authors | Date | Key Contribution |\n|-------|---------|------|------------------|\n| [AgeMem: Agentic Memory — Learning Unified LTM and STM Management](https://arxiv.org/abs/2601.01885) | Yu, Yao, Xie, Tan, Feng, Li, Wu | Jan 2026 | Exposes store/retrieve/update/summarize/discard as **tool-based actions**. 
Agent learns WHEN and WHAT to do via 3-stage progressive RL + step-wise GRPO. Outperforms all heuristic-based baselines across 5 benchmarks. Key thesis: memory operations should be **learned**, not hard-coded 📄. |\n| [EMPO²: Exploratory Memory-Augmented On- and Off-Policy Optimization](https://arxiv.org/abs/2602.23008) | Liu, Kim, Luo, Li, Yang | Feb 2026 | Uses memory not just for recall but for **exploring novel states**. Hybrid on/off-policy RL — **128.6% improvement** over GRPO on ScienceWorld 📄. Adapts to new tasks with just a few memory-augmented trials, zero parameter updates. ICLR 2026 ✅. |\n| [MemFactory: Unified Inference \u0026 Training Framework for Agent Memory](https://arxiv.org/abs/2603.29493) | Guo, Li, Tang, Xiong, Li | Mar 2026 | \"LLaMA-Factory for memory agents\" — first unified modular framework. Lego-like plug-and-play memory components. Natively integrates GRPO for RL-based memory policy training. Supports Memory-R1, RMM, MemAgent paradigms out of the box. Up to 14.8% improvement over base models 📄. |\n| [Memory-R2: Fair Credit Assignment for Long-Horizon Memory-Augmented LLM Agents](https://arxiv.org/abs/2605.21768) | Yan, Bahloul, Nie, Schwarzmann, Trivisonno, Tresp, Ma | May 2026 | Identifies a fundamental flaw in GRPO for memory-RL: once rollouts write different memories, they no longer share the same effective environment, so trajectory-level group comparisons are *unfair*. Introduces **LoGo-GRPO** (Local + Global) — global keeps end-to-end long-horizon reward; local re-rollouts compare memory-op outcomes from the same intermediate state. Shared-parameter co-learning for fact extractor + memory manager. Progressive curriculum 8→16→32 sessions. 
|\n| [What Training Data Teaches RL Memory Agents: Curriculum Effects in Memory-Augmented QA](https://arxiv.org/abs/2605.23067) | He, Lin, Liu, Wu, Xie, Zhou, Xiao | May 2026 | Controlled study holding architecture/RL/hyperparams fixed and varying *only* curriculum across in-domain (LoCoMo), mixed (LoCoMo+LongMemEval), and OOD (LongMemEval). **Key findings:** curriculum is a fine-grained lever on *specialization*, not a uniform scaling factor; per-type differences dwarf aggregate differences (single-number benchmark comparisons systematically underreport); binary EM reward produces no signal at G=4 group size — continuous rewards needed for single-GPU regime. Practical RL-memory training playbook. [Code](https://github.com/EvaxHe/rl-memory-curriculum) |\n| [SaliMory: Orchestrating Cognitive Memory for Conversational Agents](https://arxiv.org/abs/2606.04120) | Zhang, Zhang, Jiang, …, X. L. Dong (Meta) | Jun 2026 | Trains a **single** LM to manage a cognitively-structured memory (user facts, preferences, working memory). Introduces a **hierarchical stage-wise process reward** + **reward-decomposed contrastive (GRPO-style) refinement**, giving isolated supervision to distinct ops (selective filtering, consolidation, cue-driven recall) end-to-end — sidestepping the credit-assignment bottleneck of standard RL over multi-stage pipelines. Cuts memory-attributed failures by **~⅓**, **+10%** end-to-end accuracy, more than doubles the Good-Personalization rate 📄. ⚠️ Preprint. |\n| [JAMEL: Joint Agent Memory \u0026 Exploration Learning via Novelty Signals](https://arxiv.org/abs/2606.01528) | Tian, Weng, Kong, …, Y. Li (Tsinghua / AIR) | Jun 2026 | Co-trains the **memory module and exploration policy** in a mutually-dependent loop: exploration needs memory to tell exhausted from unseen behaviors, while novelty-seeking interaction supplies annotation-free supervision (e.g., code coverage) for the latent memory. 
Generalizes to unseen environments; rivals a closed-source model while reducing tokens 📄. Code/model open-sourced. ⚠️ Preprint. |\n\n## Context Management\n\n| Paper | Authors | Date | Key Contribution |\n|-------|---------|------|------------------|\n| [The Missing Memory Hierarchy: Demand Paging for LLM Context Windows](https://arxiv.org/abs/2603.09023) | Mason (UBC / Georgia Tech) | Sep 2025 | Maps OS virtual memory concepts to LLM context: physical memory = context window, virtual memory = persistent state, page table = retrieval handles, page fault = re-request evicted content. Analyzed 857 sessions, 54,170 API calls, 4.45B effective input tokens. |\n\n## Evaluation \u0026 Benchmarks\n\n| Paper | Authors | Date | Key Contribution |\n|-------|---------|------|------------------|\n| [StructMemEval: Evaluating Memory Structure in LLM Agents](https://arxiv.org/abs/2602.11243) | Shutova, Olenina, Vinogradov, Sinitsin | Feb 2026 | Tests agents' ability to **organize** memory (not just retrieve). Tasks: transaction ledgers, to-do lists, trees. Key finding: LLMs don't spontaneously recognize when to apply memory structure — they succeed only when explicitly prompted. |\n| [MemoryAgentBench: Evaluating Memory via Incremental Multi-Turn Interactions](https://arxiv.org/abs/2507.05257) | Hu et al. (HUST) | Jul 2025 (revised Mar 2026) | Tests 4 memory competencies in realistic incremental accumulation. Key finding: no current method masters all 4 competencies simultaneously. ICLR 2026 ✅. [Code](https://github.com/HUST-AI-HYZ/MemoryAgentBench) |\n| [From Recall to Forgetting: Benchmarking Long-Term Memory for Personalized Agents (Memora + FAMA)](https://arxiv.org/abs/2604.20006) | Uddin, Shubham, Blanco, Baral, Wang (ASU) | Apr 2026 | **ACL 2026 Findings** ✅ — Introduces **Memora**, a weeks-to-months benchmark over three tasks: remembering, reasoning, recommending. 
Introduces **FAMA (Forgetting-Aware Memory Accuracy)** — a metric that *penalizes* reliance on obsolete or invalidated memory rather than just rewarding recall. Evaluation of 4 LLMs and 6 memory agents finds frequent reuse of invalid memories and failures to reconcile evolving knowledge. ⚠️ Not to be confused with Microsoft's *Memora: Harmonic Memory Representation* (2602.03315). |\n| [STALE: Can LLM Agents Know When Their Memories Are No Longer Valid?](https://arxiv.org/abs/2605.06527) | Chao, Bai, Sheng, Li, Sun | May 2026 | Benchmark for **Implicit Conflict** — a later observation invalidates an earlier memory *without explicit negation*, requiring inference + commonsense to detect. 400 expert-validated scenarios, 1,200 queries, contexts up to 150K tokens. Three-dimensional probing: **State Resolution**, **Premise Resistance**, **Implicit Policy Adaptation**. Best frontier LLM only 55.2% overall. Companion prototype **CUPMem** uses structured state consolidation + propagation-aware search. Complements FAMA. |\n| [MemoryArena: Benchmarking Agent Memory in Interdependent Multi-Session Agentic Tasks](https://arxiv.org/abs/2602.16313) | He, Wang, Zhi, Hu, Chen, Yin, Wu, Ouyang, Wang, Pei, McAuley, Choi, Pentland | Feb 2026 | Memory-Agent-Environment loop benchmark across **web navigation, preference-constrained planning, progressive information search, sequential formal reasoning**. Key result: agents near-saturated on LoCoMo **perform poorly** in MemoryArena, exposing a gap between long-context memorization benchmarks and *interdependent* agentic memory use. [Project](https://memoryarena.github.io) |\n\n## Cognitive \u0026 Neuroscience-Inspired\n\n| Paper | Authors | Date | Key Contribution |\n|-------|---------|------|------------------|\n| [Episodic Memory is the Missing Piece for Long-Term LLM Agents](https://arxiv.org/abs/2502.06975) | Pink et al. 
(UT Austin) | Feb 2025 | Position paper arguing that episodic memory — supporting single-shot learning of instance-specific contexts — is the critical missing capability. Defines 5 properties: temporal, instance-specific, single-shot, inspectable, compositional. |\n| [MAP: Modular Agentic Planner](https://www.nature.com/articles/s41467-025-63804-5) | Correa, Samwick, Gershman et al. (Microsoft Research, Harvard) | Oct 2023 / Nature Comms 2025 | Brain-inspired architecture decomposing planning into PFC-associated modules: conflict monitoring, state prediction, state evaluation, task decomposition, task coordination. [arXiv:2310.00194](https://arxiv.org/abs/2310.00194). |\n| [SCL: Structured Cognitive Loop with Governance Layer](https://arxiv.org/abs/2511.17673) | Kim | Nov 2025 | R-CCAM: Retrieval → Cognition → Control → Action → Memory. **Soft Symbolic Control** = governance layer applying symbolic constraints to probabilistic inference. Zero policy violations, complete decision traceability. |\n| [Procedural Memory Is Not All You Need](https://arxiv.org/abs/2505.03434) | Wheeler, Jeunen | May 2025 | LLMs are constrained by reliance on procedural memory (pattern-driven tasks). For \"wicked\" learning environments with shifting rules and ambiguous feedback, different memory types are required. ACM UMAP '25. |\n| [Evaluating Theory of Mind in Multi-Agent LLM Systems](https://arxiv.org/abs/2603.00142) | Kostka, Chudziak (Warsaw UT) | Mar 2026 | ToM and Internal Belief mechanisms are NOT universally beneficial — stronger models handle extra cognitive load well, but weaker models get confused. Model capability is the dominant factor. 
|\n\n## Skill \u0026 Procedural Memory\n\n| Paper | Authors | Date | Key Contribution |\n|-------|---------|------|------------------|\n| [SkillRouter: Skill-Based Routing for LLM Agents](https://arxiv.org/abs/2603.22455) | Alibaba | Mar 2026 | **Critical finding:** skill BODY is the decisive routing signal (91.7% attention weight), NOT name (7.3%) or description (1.0%). Removing the body causes a 29-44pp degradation. BM25 on metadata alone scores 0%. Two-stage retrieve-and-rerank (1.2B params). |\n| [EvoSkill: Self-Evolving Skill Discovery](https://arxiv.org/abs/2603.02766) | Sentient, Virginia Tech | Mar 2026 | Automates skill discovery via iterative failure analysis without model fine-tuning. Three agents: Executor, Proposer, Skill-Builder. Skill-merge outperforms single runs. [Code](https://github.com/sentient-agi/EvoSkill) |\n| [Trajectory-Informed Memory Generation for Self-Improving Agent Systems](https://arxiv.org/abs/2603.10600) | Fang, Isahagian, Jayaram, Kumar, Muthusamy, Oum, Thomas (IBM Research) | Mar 2026 | Extracts **typed actionable tips** from execution trajectories: **strategy** tips (clean successes), **recovery** tips (failure-then-recovery), **optimization** tips (inefficient-but-successful). Closes the gap between episodic logs and reusable procedural knowledge — useful as a complement to EvoSkill and SkillRouter. |\n| [DeltaMem: Incremental Experience Memory for LLM Agents via Residual Trees](https://arxiv.org/abs/2606.03083) | Tan, Zhang, Cao, Li, Chen (RUC) | Jun 2026 | Stores experience as **residual deltas** across two trees — goal-conditioned task skills and scene-level environment knowledge — so related episodes share a common root instead of duplicating content. Retrieval: failure-penalized similarity scan + root-to-match chain reconstruction; autonomous consolidation distills high-frequency paths into new roots. Cuts redundancy and retrieval conflicts; beats baselines across interactive environments 📄. Code released. 
|\n\n## Multi-Agent Memory\n\n| Paper | Authors | Date | Key Contribution |\n|-------|---------|------|------------------|\n| [CoMAM: Collaborative Memory for Multi-Agent Systems](https://arxiv.org/abs/2603.12631) | — | Mar 2026 | Collaborative RL framework modeling memory agents as a sequential MDP with inter-agent dependencies. Group-level ranking consistency for coordinated memory operations. |\n| [DCM-Agent: Dual-Cluster Memory for Multi-Paradigm Ambiguity](https://arxiv.org/abs/2604.20183) | Zhang, Wan, Zhang, Yang, Zhang, Wei, Liu | Apr 2026 | Tackles structural ambiguity in optimization problems where one problem admits multiple conflicting modeling paradigms. Training-free dual-cluster memory keeps competing paradigm solutions separated so the agent can pick or blend at inference time. Relevant to multi-agent settings where agents disagree on framing. |\n\n## Memory Security \u0026 Privacy\n\n\u003e *April 2026 marks memory security emerging as a first-class research concern. As agent memories grow rich with user data, they become attack surfaces.*\n\n| Paper | Authors | Date | Key Contribution |\n|-------|---------|------|------------------|\n| [ADAM: A Systematic Data Extraction Attack on Agent Memory via Adaptive Querying](https://arxiv.org/abs/2604.09747) | Lyu, He, Wang, Hu, Li, Chen, Li, Chen | Apr 2026 | First systematic privacy attack on agent memory. Combines **data-distribution estimation** of a victim agent's memory with **entropy-guided adaptive querying** to maximize leakage. Achieves **up to 100% Attack Success Rate** on extracting stored memories 📄 — substantially outperforming prior attacks. Establishes the need for privacy-preserving memory designs as a first-class research direction. 
|\n| [SSGM: Stability and Safety Governed Memory for LLM Agents](https://arxiv.org/abs/2603.11768) | Lam, Li, Zhang, Zhao | Mar 2026 (rev May 2026) | Conceptual governance framework that **decouples memory evolution from execution** by enforcing consistency verification, temporal decay modeling, and dynamic access control *before* any memory consolidation. Provides a taxonomy of memory corruption risks: **topology-induced knowledge leakage**, **semantic drift via iterative summarization**, and consolidation hazards. Complementary defensive framing to ADAM. |\n\n## Memory Economics\n\n| Paper | Authors | Date | Key Contribution |\n|-------|---------|------|------------------|\n| [Beyond the Context Window: Cost-Performance Analysis](https://arxiv.org/abs/2603.04814) | Pollertlam, Kornsuwannawit | Mar 2026 | Compares Mem0-style fact-based memory vs long-context LLMs on LongMemEval, LoCoMo, PersonaMemv2. **Break-even: memory system becomes cheaper after ~10 interaction turns at 100K context.** Long-context wins on factual recall but memory is competitive on reasoning. |\n| [Agent Memory: Characterization and System Implications of Stateful Long-Horizon Workloads](https://arxiv.org/abs/2606.06448) | Omri, Gan, Broveak, Geens, He, Pentland, Verhelst, Weissman, Tambe (Stanford / MIT / KU Leuven) | Jun 2026 | The **first systems-level characterization** of agent memory: (1) a system-oriented taxonomy along **four axes**; (2) a **phase-aware profiling harness** attributing cost to construction, retrieval, and generation; (3) characterizes 10 representative systems across two benchmark suites, showing how design choices shift cost across write vs read paths; (4) derives **10 system recommendations** (construction scheduling, capability floors, amortization via query volume, freshness-latency tradeoffs, fleet-scale management). The production \"memory economics\" lens. 📄 ⚠️ Preprint. 
|\n\n## Neuromorphic \u0026 Bio-Inspired Memory\n\n\u003e *Most agent memory research ignores 50 years of neuroscience. This section bridges that gap — presenting the biological mechanisms that may underlie what LLM memory papers have independently re-discovered (forgetting, gating, dual-phase encoding), alongside the spiking neural network implementations that model them directly. Cross-references to LLM papers below are 🔬 editor's synthesis.*\n\n| Paper / Project | Authors | Date | Key Contribution |\n|-------|---------|------|------------------|\n| [A Bio-realistic Synthetic Hippocampus for Robotic Cognition](https://link.springer.com/article/10.1007/s12668-025-02229-2) | Talanov et al. | Oct 2025 | Synthetic hippocampal architecture with **dual-phase operation**: online sensorimotor encoding during wake, offline consolidation via SWR-triggered replay during sleep. SNN on neuromorphic substrates (≤1W). Goal-prioritised plasticity prevents catastrophic forgetting. Models the biological mechanisms that parallel what SleepGate and CraniMem implement in software 🔬. BioNanoScience. Open access. |\n| [The Memristive Implementation of the Hippocampus: A Hypothesis](https://link.springer.com/article/10.1007/s12668-025-02124-w) | Talanov et al. | Aug 2025 | Hardware-level hippocampal memory using stochastic polycrystalline nano-fiber mesh as memristive substrate. Key insight: the **inherent randomness** of material structure mimics probabilistic biological synaptic networks. Demonstrates resistive switching tunability for implementing dynamic memory functions — bidirectional replay, synaptic up/down-scaling, consolidation. BioNanoScience. Open access. |\n| [Simulation of Serotonin Mechanisms in NEUCOGAR Cognitive Architecture](https://www.sciencedirect.com/science/article/abs/pii/S2212683X15000663) | Talanov, Gafarov, Vallverdú et al. | 2018 | Maps neuromodulatory mechanisms to computational models: **dopamine → attention**, **serotonin → inhibition**. 
The \"cube of emotions\" model. Demonstrates that mammalian emotional-state control via monoamine neurotransmitters can be re-implemented computationally. Foundation for neuromodulator-gated memory admission. Procedia Computer Science. |\n| [tinyHippo](https://github.com/max-talanov/tinyHippo) | Talanov | Active | CA1 + CA3 hippocampal microcircuit simulation in NEST with Izhikevich neurons. Implements **bidirectional replay**, theta-modulated encoding/retrieval phase separation, and SWR-triggered consolidation. The biological reference implementation — validates the mechanisms that Membrain engineers and SleepGate abstracts. MIT license. |\n| [Membrain](https://github.com/tfatykhov/membrain) | Fatykhov | Active | Neuromorphic memory bridge using **FlyHash encoding** and BiCameralMemory SNN (Nengo/Voja learning). Hopfield-style attractor dynamics for pattern completion (tested up to 20% noise in PoC). Stochastic consolidation with SleepSignal. gRPC API for agent integration. The engineering abstraction layer between biological models (tinyHippo) and cognitive agents (Nous). |\n| [Vestige](https://github.com/samvallad33/vestige) | Vallad | Active | Local-first cognitive memory MCP server for coding agents. Combines FSRS-6 decay, spreading activation, contradiction inspection, active suppression, Receipt Lock, and an inspectable dashboard for agent memory. |\n\n**The Stack:** These projects form a natural hierarchy — **tinyHippo** (biological model, validates mechanisms) → **Membrain** (engineering abstraction, SNN service) → **Nous** (cognitive agent, consumes memory). The papers provide the theoretical foundation; the repos provide working implementations.\n\n**Why this matters for agent memory 🔬:** The LLM papers in this list independently converge on mechanisms that neuroscience has studied for decades. 
The following correspondences are **editor's synthesis** — the LLM papers don't explicitly cite these neuroscience sources, but the structural parallels are striking:\n- SleepGate's learned forgetting ↔ hippocampal SWR consolidation during sleep (Talanov 2025a)\n- CraniMem's utility gating ↔ neuromodulator-gated admission (Talanov/NEUCOGAR 2018)\n- A-MEM's dual-phase encoding ↔ online/offline hippocampal states (Talanov 2025a, tinyHippo)\n- xMemory's hierarchical consolidation ↔ cortical-hippocampal memory transfer (Talanov 2025b)\n\n## Adjacent Research\n\nThese papers aren't solely about agent memory but contribute relevant architectural ideas:\n\n| Paper | Date | Relevance |\n|-------|------|-----------|\n| [EverMemOS](https://arxiv.org/abs/2601.02163) — Liu, Bai, Chen et al. | Jan 2026 | Self-Organizing Memory Operating System for structured long-horizon reasoning. |\n| [AriGraph](https://arxiv.org/abs/2407.04363) — Anokhin, Radionov et al. | IJCAI 2025 ✅ | Knowledge Graph World Models with episodic + semantic memory. [Proceedings](https://www.ijcai.org/proceedings/2025/) |\n| [Memory Matters More Than You Think](https://arxiv.org/abs/2601.04726) — Zhou, Li et al. (Renmin U.) | Jan 2026 | Event-centric memory with Logic Map for agent searching and reasoning. |\n| [The Consolidation Problem in Agent Memory](https://hindsight.vectorize.io/blog/2026/05/21/agent-memory-consolidation) — Vectorize (Hindsight) | May 2026 | Industry framing of consolidation as a policy layer with four levers — importance, merge, decay, eviction — benchmarked across Mem0, Zep, Letta, LangChain, Hindsight. |\n| [Introducing STATE-Bench](https://opensource.microsoft.com/blog/2026/05/19/introducing-state-bench-a-benchmark-for-ai-agent-memory) — Microsoft ([repo](https://github.com/microsoft/STATE-Bench)) | May 2026 | Open-source, memory-agnostic benchmark measuring whether agents *improve with experience* on stateful enterprise tasks (travel, support, shopping) — not just recall. 
|\n| [State of AI Agent Memory 2026](https://mem0.ai/blog/state-of-ai-agent-memory-2026) — Mem0 | 2026 | Industry survey of benchmarks, architectures, and six open problems (temporal abstraction, cross-session structure/identity, app-level eval, privacy/consent, staleness). |\n\n---\n\n## Key Themes \u0026 Cross-Cutting Insights\n\nThese papers, taken together, converge on several architectural principles:\n\n### 1. 🚪 Gated Admission — Not Everything Deserves to be Remembered\n**Papers:** A-MAC, CraniMem, SleepGate, ACC\n\nThe strongest systems **filter before storing**. CraniMem uses goal-conditioned cosine gating; A-MAC scores across 5 dimensions (utility, confidence, novelty, recency, type). The alternative — storing everything and retrieving selectively — leads to noise accumulation and proactive interference.\n\n### 2. 📦 Typed Multi-Store — One Vector Store Is Not Memory\n**Papers:** xMemory, SYNAPSE, AdaMem, CraniMem, Mem0, Episodic Memory\n\nEvery serious architecture separates memory by type (episodic, semantic, procedural, working). xMemory's 4-level hierarchy shows **44.91% of memories get retroactively reassigned** — proving flat structures fundamentally fail. The human brain doesn't use one store; neither should your agent.\n\n### 3. 🔄 Consolidation Loops — Memory Must Evolve\n**Papers:** SleepGate, CraniMem, SYNAPSE, A-MEM\n\nRaw episodic traces should consolidate into durable semantic knowledge over time. SleepGate's sleep cycles, CraniMem's replay-based promotion, and SYNAPSE's spreading activation all implement variants of this. Without consolidation, memory becomes a write-only log.\n\n### 4. 🗑️ Forgetting as a Feature — Not a Bug\n**Papers:** SleepGate, Procedural Memory, xMemory\n\nSleepGate reduces proactive interference from O(n) to O(log n) through learned forgetting. This is the most under-implemented capability in agent frameworks — most systems only add memories, never remove or decay them. 
Forgetting is what keeps memory **useful** rather than just large.\n\n### 5. 📊 RAG ≠ Agent Memory\n**Papers:** xMemory, Memory Survey, Anatomy of Agentic Memory\n\nThe clearest thesis across this literature: retrieval-augmented generation is a **retrieval strategy**, not a memory system. Agent memory requires admission, evolution, consolidation, forgetting, typed storage, and context-aware retrieval. RAG addresses only the last item.\n\n### 6. 🧪 Evaluation Is Broken\n**Papers:** Anatomy of Agentic Memory, StructMemEval\n\nCurrent benchmarks are saturated, metrics misalign with semantic utility, and results vary wildly by backbone model. StructMemEval shows that agents apply memory structure only when explicitly prompted, not on their own. The field needs better evaluation before it can claim progress.\n\n### 7. 🎮 RL Is Replacing Heuristic Memory Management\n**Papers:** AgeMem, EMPO², MemFactory, Memory-R1\n\nThe dominant shift in 2026: from hand-coded admission/retrieval rules to **reinforcement-learned memory policies**. AgeMem exposes all memory operations as tool-based actions and learns when to use them via GRPO. EMPO² takes this further by using memory for exploration, not just recall. MemFactory provides the infrastructure to train these policies. This is the most consequential paradigm shift since the field recognized that RAG ≠ memory.\n\n### 8. 🔒 Memory Privacy Becomes a First-Class Concern\n**Papers:** ADAM, FSFM\n\nApril 2026 surfaced the first systematic memory-extraction attack (ADAM, up to 100% ASR) — and the first forgetting frameworks (FSFM) that explicitly frame *selective forgetting* as a privacy primitive, not just a retention/efficiency one. Expect memory threat-modeling, audit interfaces, and unlearning APIs to follow.\n\n### 9. 
🪜 Memory ↔ Skills ↔ Rules as a Compression Spectrum\n**Papers:** Experience Compression Spectrum, Externalization in LLM Agents\n\nThe memory and skill-discovery communities have been solving the same problem (extract reusable knowledge from interaction traces) without citing each other — cross-community citation rate is **below 1%**. The Compression Spectrum reframes both as points on a single axis (5–20× → 50–500× → 1000×+ compression), and identifies the **missing diagonal**: no current system supports adaptive cross-level compression.\n\n### 10. 🧱 Active, Constructive Memory Operations\n**Papers:** MemReader (active extraction), SCG-MEM (schema-constrained generation), HeLa-Mem (Hebbian distillation)\n\nTwo paradigm shifts in April 2026: (1) **MemReader** moves extraction from passive transcription to an RL-learned WRITE/DEFER/RETRIEVE-CONTEXT/DISCARD policy. (2) **SCG-MEM** replaces dense retrieval with schema-constrained *decoding* (Piaget-style assimilation vs accommodation). Together they suggest the future of memory is generative-constructive, not retrieve-and-paste.\n\n### 11. 🧩 Reconstruction ≠ Retrieval — Memory Access as Active Reasoning\n**Papers:** MRAgent, MemPro, DeltaMem\n\nWave 6's headline shift: stop treating memory access as a static retrieve-then-reason step. MRAgent folds LLM reasoning *into* graph traversal (Cue-Tag-Content + active reconstruction), iteratively pruning paths on accumulated evidence. MemPro generalizes the idea to the whole pipeline — the memory *system itself* becomes an evolvable program that rewrites its own construction/retrieval logic. DeltaMem reconstructs each experience on demand by composing a root-to-leaf residual chain rather than storing flat copies. Memory is increasingly **constructed at access time**, not fetched.\n\n### 12. 
📉 Memory Economics Gets Measured\n**Papers:** Agent Memory Characterization, Beyond the Context Window, Personalize-then-Store\n\nFor two years memory papers reported accuracy; few reported cost. Wave 6 changes that. The first systems characterization (Omri et al.) builds a phase-aware cost profiler and a 4-axis taxonomy, deriving 10 deployment recommendations across the write and read paths. Personalize-then-Store reframes admission as a **budget** problem — gating transient sessions to spend the memory budget where it pays off. The question is shifting from \"does memory help?\" to **\"what does memory cost, and where does it amortize?\"**\n\n## Open Questions\n\n- **How should admission gates interact with consolidation loops?** (A-MAC + SleepGate integration)\n- **What's the right forgetting curve for different memory types?** (No paper addresses type-specific decay)\n- **Can skill routing be unified with memory retrieval?** (SkillRouter + xMemory convergence)\n- **What's the minimum viable memory architecture?** (Most papers add complexity — none simplify)\n- **How should backward verification loops (MemMA probe-verify-repair) integrate with forward memory policy?** (MemMA + AgeMem convergence)\n- **Can RL-learned memory policies transfer across domains?** (EMPO² shows OOD promise — but limited evaluation)\n- **Can constructivist memory (schema-bound decoding, SCG-MEM) replace similarity-based retrieval?**\n- **Can the \"missing diagonal\" be filled — adaptive cross-level compression between episodic/procedural/declarative?**\n- **How robust is agent memory against systematic extraction attacks (ADAM-class threats)?**\n- **What is the right interface between active extraction (MemReader) and admission gating (A-MAC)?**\n- **Does Hebbian distillation (HeLa-Mem) outperform LLM-driven semantic abstraction at scale?**\n- **Can access-time reconstruction (MRAgent) scale without latency blowup versus precomputed retrieval?**\n- **What is the right cost model for 
agent memory at fleet scale, and when does a memory system amortize its write cost?** (Agent Memory Characterization)\n\n---\n\n## Contributing\n\nThis list grows as the field grows. To contribute:\n\n1. Open an issue or PR with the paper title, arXiv/DOI link, authors, date, and a 1-2 sentence summary of the key contribution\n2. Suggest which section it belongs in (or propose a new one)\n3. Bonus: note how it relates to or challenges existing papers in the list\n\nPapers should be peer-reviewed, accepted at reputable workshops (e.g., ICLR MemAgents), or highly cited preprints. We prioritize papers with **novel architectural ideas** over incremental benchmark improvements.\n\n---\n\n## License\n\n[![CC0](https://licensebuttons.net/p/zero/1.0/88x31.png)](https://creativecommons.org/publicdomain/zero/1.0/)\n\nThis list is released under CC0. The papers themselves retain their original licenses.\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Ftfatykhov%2Fawesome-agent-memory","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Ftfatykhov%2Fawesome-agent-memory","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Ftfatykhov%2Fawesome-agent-memory/lists"}