AI | CL | May 17

Causal Intervention-Based Memory Selection for Long-Horizon LLM Agents

arXiv:2605.17641 | 60.1 | Has Code
AI Analysis

For LLM agents requiring long-term memory, this work addresses the problem of selecting useful memories while suppressing irrelevant or harmful ones, offering a causal approach that improves robustness over relevance-based methods.

The paper proposes Causal Memory Intervention (CMI), a memory-selection technique for long-horizon LLM agents that chooses memories by their estimated causal effect on answer quality. Against vector, graph, reflection, summary, full-history, and no-memory baselines, CMI is reported to strike a stronger balance between answer quality and robustness to misleading memory.
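To make the idea of selecting by causal effect concrete, here is a minimal sketch. It is not the paper's implementation: it assumes a simple add-one intervention in which each candidate memory is scored by how much it changes answer quality relative to a no-memory control. The names `select_memories`, `score_with`, `threshold`, and `toy_score` are hypothetical, and the toy scorer stands in for running the agent and grading its answer.

```python
from typing import Callable, List, Sequence


def select_memories(
    candidates: Sequence[str],
    score_with: Callable[[Sequence[str]], float],
    threshold: float = 0.0,
) -> List[str]:
    """Keep memories whose estimated effect on answer quality exceeds `threshold`.

    `score_with(memories)` is a hypothetical callback: it should run the agent
    with the given memories in context and return an answer-quality score.
    """
    baseline = score_with([])  # control condition: answer with no memory
    selected = []
    for memory in candidates:
        # Intervention: add exactly one memory and measure the change in quality.
        effect = score_with([memory]) - baseline
        if effect > threshold:
            selected.append(memory)
    return selected


def toy_score(memories: Sequence[str]) -> float:
    """Stand-in scorer (assumption): one memory helps, one hurts, the rest do nothing."""
    score = 0.5
    if "user moved to Berlin in 2024" in memories:
        score += 0.4  # useful memory raises answer quality
    if "user lives in Paris" in memories:
        score -= 0.3  # stale memory misleads the model
    return score


bank = [
    "user moved to Berlin in 2024",   # useful
    "user lives in Paris",            # harmful (stale)
    "user likes Berlin-style donuts", # topically related distractor
]
kept = select_memories(bank, toy_score)
```

In this toy setup only the useful memory is kept: the stale memory has a negative effect and the distractor has none, even though all three are topically related to the same request.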

Long-horizon LLM agents rely on persistent memory to support interactions across sessions, yet existing memory systems often retrieve context using semantic similarity or broad history inclusion, treating retrieved memories as uniformly useful. This assumption is fragile because memories may be topically related while remaining irrelevant, stale, or misleading. We propose Causal Memory Intervention (CMI), a causal memory-selection technique that estimates how candidate memories affect the model's answer under controlled interventions, selecting memories that improve task performance while suppressing unstable, irrelevant, or harmful ones. To evaluate this setting, we introduce Causal-LoCoMo, a causally annotated benchmark derived from long conversational data, where each example contains a user request, a structured memory bank, useful memories, irrelevant distractors, and synthetic harmful memories. We compare CMI against vector, graph, reflection, summary, full-history, and no-memory baselines. Results show that CMI achieves a stronger balance between answer quality and robustness to misleading memory, suggesting that reliable long-term memory requires selecting context based on causal usefulness rather than relevance alone. The full framework, benchmark construction code, and experimental pipeline are available at https://github.com/Saksham4796/causal-memory-intervention.
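The benchmark example described above can be pictured as a small record. The sketch below is an assumption about shape only: the field names (`request`, `memory_bank`, `useful_ids`, `distractor_ids`, `harmful_ids`) and the helper `selection_report` are hypothetical and are not taken from the Causal-LoCoMo release.

```python
from dataclasses import dataclass, field
from typing import Dict, Iterable, Set


@dataclass
class MemoryExample:
    """One benchmark item: a request plus a memory bank with annotated roles."""
    request: str
    memory_bank: Dict[str, str]  # memory id -> memory text
    useful_ids: Set[str] = field(default_factory=set)
    distractor_ids: Set[str] = field(default_factory=set)
    harmful_ids: Set[str] = field(default_factory=set)


def selection_report(example: MemoryExample, selected_ids: Iterable[str]) -> Dict[str, int]:
    """Count how many selected memories fall into each annotated role."""
    chosen = set(selected_ids)
    return {
        "useful": len(chosen & example.useful_ids),
        "distractor": len(chosen & example.distractor_ids),
        "harmful": len(chosen & example.harmful_ids),
    }


example = MemoryExample(
    request="Where should my package be shipped?",
    memory_bank={
        "m1": "User moved to Berlin in 2024.",
        "m2": "User enjoys hiking.",
        "m3": "User's current address is in Paris.",  # synthetic harmful memory
    },
    useful_ids={"m1"},
    distractor_ids={"m2"},
    harmful_ids={"m3"},
)
report = selection_report(example, ["m1", "m3"])
```

A report like this separates the two things the abstract cares about: how many useful memories a selector recovers, and how many irrelevant or harmful ones it lets through.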

