CL · AI · Apr 20

PDDL-Mind: Large Language Models are Capable on Belief Reasoning with Reliable State Tracking

arXiv:2604.17819 · 26.9 · h-index: 6
AI Analysis

For AI researchers working on social reasoning, this work addresses the bottleneck of state tracking in theory-of-mind tasks, but the approach is domain-specific and incremental.

LLMs underperform on theory-of-mind benchmarks because their implicit state tracking is unreliable. PDDL-Mind, a neuro-symbolic framework that uses PDDL for explicit state representation, achieves an absolute accuracy gain of over 5% over the prior state of the art on MMToM-QA, MuMA, and FanToM.

Large language models (LLMs) perform substantially below human level on existing theory-of-mind (ToM) benchmarks, even when augmented with chain-of-thought prompting or probabilistic belief updates. We argue that these failures primarily arise from unreliable implicit state tracking rather than limitations in high-level reasoning. We introduce PDDL-Mind, a neuro-symbolic framework that decouples environment state evolution from belief inference. By translating narrative descriptions into explicit states and actions expressed in Planning Domain Definition Language (PDDL), and by verifying action-induced state transitions against a predefined domain, PDDL-Mind provides LLMs with a logically consistent and explicit representation of world states for ToM tasks. Experiments on MMToM-QA, MuMA and FanToM show that PDDL-Mind achieves over 5% absolute accuracy gain over the best existing state-of-the-art method on ToM benchmark questions.
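
The idea of verifying action-induced state transitions against a predefined domain can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes a STRIPS-style reading of PDDL (preconditions, add effects, delete effects), a hypothetical two-action domain (`leave`, `move_object`), and a simple rule that an agent's belief is the last world state it witnessed while in the room. All names below are illustrative.

```python
from dataclasses import dataclass

# A fact is a tuple such as ("inside", "marble", "basket").
# Hypothetical toy domain; not taken from the paper.

@dataclass(frozen=True)
class Action:
    name: str
    preconditions: frozenset
    add_effects: frozenset
    del_effects: frozenset

def apply_action(state, action):
    """Verify preconditions, then apply delete and add effects (STRIPS-style)."""
    missing = action.preconditions - state
    if missing:
        raise ValueError(f"{action.name}: unmet preconditions {sorted(missing)}")
    return (state - action.del_effects) | action.add_effects

def leave(agent, room):
    here = ("in", agent, room)
    return Action(f"leave({agent})", frozenset({here}), frozenset(), frozenset({here}))

def move_object(agent, obj, src, dst, room):
    return Action(
        f"move({agent},{obj},{src},{dst})",
        preconditions=frozenset({("in", agent, room), ("inside", obj, src)}),
        add_effects=frozenset({("inside", obj, dst)}),
        del_effects=frozenset({("inside", obj, src)}),
    )

def track(initial_state, actions, agents, room):
    """Evolve the world state explicitly; update beliefs only for witnesses."""
    state = frozenset(initial_state)
    beliefs = {agent: state for agent in agents}
    for action in actions:
        new_state = apply_action(state, action)
        for agent in agents:
            # Assumption: an agent witnesses an action if it is in the room
            # immediately before or immediately after the action.
            present = ("in", agent, room)
            if present in state or present in new_state:
                beliefs[agent] = new_state
        state = new_state
    return state, beliefs

def where_is(facts, obj):
    """Return the container holding obj according to a set of facts."""
    for fact in facts:
        if fact[0] == "inside" and fact[1] == obj:
            return fact[2]
    return None

# Sally-Anne style narrative, already translated into actions.
initial = {
    ("in", "sally", "room"),
    ("in", "anne", "room"),
    ("inside", "marble", "basket"),
}
story = [
    leave("sally", "room"),
    move_object("anne", "marble", "basket", "box", "room"),
]
world, beliefs = track(initial, story, ["sally", "anne"], "room")
```

In this sketch, `where_is(world, "marble")` yields the true location while `where_is(beliefs["sally"], "marble")` yields the location Sally last observed, so a false-belief question can be answered from explicit states rather than from the model's implicit memory of the narrative. An action that the domain does not permit, such as Sally moving the marble after she has left, raises `ValueError`, which is where a mistranslated narrative step would be caught.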

