AI · Mar 8

Memory for Autonomous LLM Agents: Mechanisms, Evaluation, and Emerging Frontiers

arXiv:2603.07670v1 · 25 citations
Predicted impact top 82% in AI · last 90 days
Originality Synthesis-oriented
AI Analysis

This paper provides a comprehensive review and taxonomy of memory systems for LLM agents, which is valuable for researchers and developers building more adaptive and persistent AI systems.

This survey paper examines memory mechanisms for autonomous LLM agents, which are crucial for overcoming context window limitations. It formalizes agent memory as a write-manage-read loop and categorizes existing methods into five families, highlighting a shift from static recall benchmarks to multi-session agentic tests.

Large language model (LLM) agents increasingly operate in settings where a single context window is far too small to capture what has happened, what was learned, and what should not be repeated. Memory -- the ability to persist, organize, and selectively recall information across interactions -- is what turns a stateless text generator into a genuinely adaptive agent. This survey offers a structured account of how memory is designed, implemented, and evaluated in modern LLM-based agents, covering work from 2022 through early 2026. We formalize agent memory as a write-manage-read loop tightly coupled with perception and action, then introduce a three-dimensional taxonomy spanning temporal scope, representational substrate, and control policy. Five mechanism families are examined in depth: context-resident compression, retrieval-augmented stores, reflective self-improvement, hierarchical virtual context, and policy-learned management. On the evaluation side, we trace the shift from static recall benchmarks to multi-session agentic tests that interleave memory with decision-making, analyzing four recent benchmarks that expose stubborn gaps in current systems. We also survey applications where memory is the differentiating factor -- personal assistants, coding agents, open-world games, scientific reasoning, and multi-agent teamwork -- and address the engineering realities of write-path filtering, contradiction handling, latency budgets, and privacy governance. The paper closes with open challenges: continual consolidation, causally grounded retrieval, trustworthy reflection, learned forgetting, and multimodal embodied memory.
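
To make the write-manage-read loop named in the abstract more concrete, here is a minimal Python sketch. It is an illustration under stated assumptions, not the survey's formalization: the `MemoryStore` class, its method names, the capacity-based eviction, and the word-overlap retrieval are all hypothetical stand-ins chosen for brevity.

```python
from dataclasses import dataclass, field


@dataclass
class MemoryStore:
    """Toy write-manage-read memory. Every name here is hypothetical."""

    capacity: int = 3
    records: list[str] = field(default_factory=list)

    def write(self, observation: str) -> None:
        # Write path: filter out empty text and exact duplicates.
        text = observation.strip()
        if text and text not in self.records:
            self.records.append(text)

    def manage(self) -> None:
        # Manage step: evict the oldest records once over capacity.
        # (Assumption: real systems may summarize or merge instead.)
        overflow = len(self.records) - self.capacity
        if overflow > 0:
            del self.records[:overflow]

    def read(self, query: str, k: int = 1) -> list[str]:
        # Read path: rank records by word overlap with the query and
        # return at most k records that share at least one word.
        query_words = set(query.lower().split())

        def score(record: str) -> int:
            return len(query_words & set(record.lower().split()))

        ranked = sorted(self.records, key=score, reverse=True)
        return [r for r in ranked[:k] if score(r) > 0]


def agent_step(memory: MemoryStore, observation: str, query: str) -> list[str]:
    # One pass of the loop: write the new observation, manage the
    # store, then read what is relevant to the current query.
    memory.write(observation)
    memory.manage()
    return memory.read(query)
```

Calling `agent_step` once per interaction keeps the store bounded while still letting later queries retrieve earlier observations. Word overlap is used here only to keep the sketch dependency-free; it stands in for whatever retrieval mechanism a given system actually uses.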

