AI | CL | Feb 28, 2025

Re-evaluating Theory of Mind evaluation in large language models

arXiv:2502.21098v1 | 21 citations | h-index: 6 | Philos trans R Soc Lond Ser B Biological sci
Originality: Synthesis-oriented
AI Analysis

This work addresses confusion in AI and cognitive science about evaluating Theory of Mind in models, but it is incremental as it critiques existing methods without new empirical results.

The paper tackles the mixed evidence on whether large language models possess Theory of Mind by re-examining how it is evaluated. It argues that the disagreement stems from unclear expectations about whether models should match human behaviors or the computations underlying those behaviors.

The question of whether large language models (LLMs) possess Theory of Mind (ToM) -- often defined as the ability to reason about others' mental states -- has sparked significant scientific and public interest. However, the evidence as to whether LLMs possess ToM is mixed, and the recent growth in evaluations has not resulted in a convergence. Here, we take inspiration from cognitive science to re-evaluate the state of ToM evaluation in LLMs. We argue that a major reason for the disagreement on whether LLMs have ToM is a lack of clarity on whether models should be expected to match human behaviors, or the computations underlying those behaviors. We also highlight ways in which current evaluations may be deviating from "pure" measurements of ToM abilities, which also contributes to the confusion. We conclude by discussing several directions for future research, including the relationship between ToM and pragmatic communication, which could advance our understanding of artificial systems as well as human cognition.

