CL | AI | Apr 30, 2025

Language Models Do Not Have Human-Like Working Memory

Peking U, Tencent
arXiv:2505.10571v3 | 6 citations | h-index: 23 | Has Code
Originality: Highly original
AI Analysis

This work identifies a key bottleneck for advancing reliable reasoning systems in AI, highlighting a fundamental limitation in LLMs that affects their ability to perform coherent reasoning tasks.

The paper asks whether Large Language Models (LLMs) possess human-like working memory. It introduces three novel tasks that isolate internal representation from external context and finds that seventeen frontier models consistently exhibit irrational or contradictory behaviors, indicating an inability to retain and manipulate latent information.

While Large Language Models (LLMs) exhibit remarkable reasoning abilities, we demonstrate that they lack a fundamental aspect of human cognition: working memory. Human working memory is an active cognitive system that enables not only the temporary storage of information but also its processing and utilization, enabling coherent reasoning and decision-making. Without working memory, individuals may produce unrealistic responses, exhibit self-contradictions, and struggle with tasks that require mental reasoning. Existing evaluations using N-back or context-dependent tasks fall short as they allow LLMs to exploit external context rather than retaining the reasoning process in the latent space. We introduce three novel tasks: (1) Number Guessing, (2) Yes-No Deduction, and (3) Math Magic, designed to isolate internal representation from external context. Across seventeen frontier models spanning four major model families, we consistently observe irrational or contradictory behaviors, indicating LLMs' inability to retain and manipulate latent information. Our work establishes a new benchmark for evaluating working memory in LLMs and highlights this limitation as a key bottleneck for advancing reliable reasoning systems. Code and prompts for the experiments are available at https://github.com/penguinnnnn/LLM-Working-Memory.
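The abstract names the tasks but does not describe their protocols, so the following is only an illustrative sketch of how a number-guessing consistency check could be framed, not the authors' implementation. It assumes a hypothetical `ask(candidate)` callable that opens a fresh session, tells the model to silently pick a number in a range, and returns whether it answered "yes" to "Is it `candidate`?". If a number is truly held in memory, the "yes" rates over all candidates should sum to roughly 1. The two simulated models and every function name here are assumptions made for illustration.

```python
import random
from typing import Callable, Dict


def yes_rates(ask: Callable[[int], bool], low: int, high: int,
              trials: int) -> Dict[int, float]:
    """Estimate how often independent sessions answer 'yes' per candidate."""
    rates = {}
    for candidate in range(low, high + 1):
        yes_count = sum(ask(candidate) for _ in range(trials))
        rates[candidate] = yes_count / trials
    return rates


def total_yes_mass(rates: Dict[int, float]) -> float:
    """Sum of 'yes' rates; close to 1 if exactly one number is held per session."""
    return sum(rates.values())


def make_committed_model(low: int, high: int,
                         rng: random.Random) -> Callable[[int], bool]:
    """Simulated responder (assumption): each session commits to a secret number."""
    def ask(candidate: int) -> bool:
        secret = rng.randint(low, high)  # chosen before the question is answered
        return candidate == secret
    return ask


def make_memoryless_model(p_yes: float,
                          rng: random.Random) -> Callable[[int], bool]:
    """Simulated responder (assumption): answers 'yes' with a fixed probability,
    without holding any number at all."""
    def ask(candidate: int) -> bool:
        return rng.random() < p_yes
    return ask


if __name__ == "__main__":
    rng = random.Random(0)
    committed = yes_rates(make_committed_model(1, 10, rng), 1, 10, trials=2000)
    memoryless = yes_rates(make_memoryless_model(0.0, rng), 1, 10, trials=2000)
    print("committed total:", total_yes_mass(committed))
    print("memoryless total:", total_yes_mass(memoryless))
```

In this framing, a total far from 1 (for example a responder that never says "yes") would be one sign that no number was actually being held. A real evaluation would replace the simulated responders with calls to an actual model.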

Code Implementations: 1 repo