Toward a Theory of Hierarchical Memory for Language Agents

arXiv:2603.21564
AI Analysis

This paper provides a theoretical framework for researchers and developers working on hierarchical memory in long-context and agentic systems. Its contribution is incremental, since it formalizes existing practices.

The paper tackles the lack of a shared formalism for comparing hierarchical memory designs in language agents, proposing a unifying theory with three operators and applying it to analyze eleven existing systems.

Many recent long-context and agentic systems address context-length limitations by adding hierarchical memory: they extract atomic units from raw data, build multi-level representatives by grouping and compression, and traverse this structure to retrieve content under a token budget. Despite recurring implementations, there is no shared formalism for comparing design choices. We propose a unifying theory in terms of three operators. Extraction ($α$) maps raw data to atomic information units; coarsening ($C = (π, ρ)$) partitions units and assigns a representative to each group; and traversal ($τ$) selects which units to include in context given a query and budget. We identify a self-sufficiency spectrum for the representative function $ρ$ and show how it constrains viable retrieval strategies (a coarsening-traversal coupling). Finally, we instantiate the decomposition on eleven existing systems spanning document hierarchies, conversational memory, and agent execution traces, showcasing its generality.
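
The three operators can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the paper's implementation: every function name, the sentence-level extraction, the fixed-size partition, the "first unit as representative" choice, the whitespace token count, and the word-overlap scoring are hypothetical stand-ins chosen only to make the decomposition concrete.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Group:
    representative: str   # rho: text standing in for the group (assumed: first member)
    members: List[str]    # atomic units placed in this group by pi


def extract(raw: str) -> List[str]:
    """alpha (assumed form): split raw text into sentence-level atomic units."""
    return [s.strip() for s in raw.split(".") if s.strip()]


def coarsen(units: List[str], group_size: int = 2) -> List[Group]:
    """C = (pi, rho): pi chunks consecutive units, rho picks the first unit."""
    groups = []
    for i in range(0, len(units), group_size):
        members = units[i:i + group_size]
        groups.append(Group(representative=members[0], members=members))
    return groups


def n_tokens(text: str) -> int:
    """Crude token count (whitespace split), used only for budgeting."""
    return len(text.split())


def overlap(query: str, text: str) -> int:
    """Number of distinct query words that also appear in the text."""
    return len(set(query.lower().split()) & set(text.lower().split()))


def traverse(groups: List[Group], query: str, budget: int) -> List[str]:
    """tau: rank groups by their representative, then add member units
    that still fit in the token budget."""
    ranked = sorted(groups, key=lambda g: overlap(query, g.representative),
                    reverse=True)
    selected, used = [], 0
    for group in ranked:
        for unit in group.members:
            cost = n_tokens(unit)
            if used + cost <= budget:
                selected.append(unit)
                used += cost
    return selected


raw = ("Alice booked a flight to Paris. She leaves on Monday. "
       "Bob fixed the login bug. He deployed it on Friday.")
units = extract(raw)
groups = coarsen(units, group_size=2)
context = traverse(groups, query="login bug", budget=10)
print(context)
```

In this sketch the representative is only a pointer into its group, so the traversal has to descend to the member units to recover the content. A representative that summarized the whole group could instead be placed in context directly, which is one way to read the coupling between the choice of $ρ$ and the viable retrieval strategies.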
