Hypothesis-Driven Theory-of-Mind Reasoning for Large Language Models
This work addresses social reasoning in AI, with applications such as human-agent interaction. The contribution is incremental, since it builds on existing Bayesian frameworks and LLM methods.
The paper tackles the challenge of applying LLM reasoning to scenarios without ground-truth answers, such as tracking mental states. It introduces thought-tracing, an inference-time algorithm that generates hypotheses and weights them based on observations, and it reports significant performance improvements on theory-of-mind benchmarks.
Existing LLM reasoning methods have shown impressive capabilities across various tasks, such as solving math and coding problems. However, applying these methods to scenarios without ground-truth answers or rule-based verification methods, such as tracking the mental states of an agent, remains challenging. Inspired by the sequential Monte Carlo algorithm, we introduce thought-tracing, an inference-time reasoning algorithm designed to trace the mental states of specific agents by generating hypotheses and weighting them based on observations, without relying on ground-truth solutions to questions in datasets. Our algorithm is modeled after the Bayesian theory-of-mind framework, using LLMs to approximate probabilistic inference over agents' evolving mental states based on their perceptions and actions. We evaluate thought-tracing on diverse theory-of-mind benchmarks, demonstrating significant performance improvements compared to baseline LLMs. Our experiments also reveal interesting behaviors of recent reasoning models (e.g., o3 and R1) on theory-of-mind tasks, highlighting how social reasoning differs from other domains.
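The generate-and-weight loop described in the abstract can be illustrated with a generic sequential Monte Carlo sketch. This is not the paper's implementation: the names `Hypothesis`, `trace_thoughts`, `propose`, and `likelihood` are hypothetical, the resampling rule (effective sample size below half the hypothesis count) is a common particle-filter convention assumed here, and the two toy functions stand in for what would be LLM calls.

```python
import random
from dataclasses import dataclass
from typing import Callable, List, Optional


@dataclass
class Hypothesis:
    text: str      # natural-language guess about the agent's mental state
    weight: float  # normalized plausibility given the observations so far


def normalize(hyps: List[Hypothesis]) -> List[Hypothesis]:
    total = sum(h.weight for h in hyps)
    if total <= 0:  # degenerate case: fall back to uniform weights
        return [Hypothesis(h.text, 1.0 / len(hyps)) for h in hyps]
    return [Hypothesis(h.text, h.weight / total) for h in hyps]


def effective_sample_size(hyps: List[Hypothesis]) -> float:
    return 1.0 / sum(h.weight ** 2 for h in hyps)


def resample(hyps: List[Hypothesis], rng: random.Random) -> List[Hypothesis]:
    # Draw hypotheses in proportion to their weights, then reset to uniform.
    chosen = rng.choices(hyps, weights=[h.weight for h in hyps], k=len(hyps))
    return [Hypothesis(h.text, 1.0 / len(hyps)) for h in chosen]


def trace_thoughts(
    observations: List[str],
    propose: Callable[[Optional[str], str, int], List[str]],
    likelihood: Callable[[str, str], float],
    n_hyps: int = 4,
    seed: int = 0,
) -> List[Hypothesis]:
    """Assumed interface: propose(prev, obs, k) returns k hypothesis strings;
    likelihood(hyp, obs) returns a non-negative score."""
    rng = random.Random(seed)
    hyps: List[Hypothesis] = []
    for obs in observations:
        if not hyps:
            # Initialize hypotheses from the first observation.
            texts = propose(None, obs, n_hyps)
            hyps = [Hypothesis(t, 1.0 / len(texts)) for t in texts]
        else:
            # Propagate each hypothesis forward in light of the new observation.
            hyps = [Hypothesis(propose(h.text, obs, 1)[0], h.weight) for h in hyps]
        # Reweight by how well each hypothesis explains the observation.
        hyps = normalize(
            [Hypothesis(h.text, h.weight * likelihood(h.text, obs)) for h in hyps]
        )
        # Resample when the weights collapse onto a few hypotheses.
        if effective_sample_size(hyps) < len(hyps) / 2:
            hyps = resample(hyps, rng)
    return hyps


# Toy stand-ins for LLM calls (purely illustrative).
def toy_propose(prev: Optional[str], obs: str, k: int) -> List[str]:
    if prev is None:
        options = ["believes the ball is in the basket",
                   "believes the ball is in the box"]
        return options[:k]
    return [prev]  # keep the hypothesis unchanged


def toy_likelihood(hyp: str, obs: str) -> float:
    return 0.9 if "basket" in hyp else 0.1


if __name__ == "__main__":
    result = trace_thoughts(["Sally looks in the basket"],
                            toy_propose, toy_likelihood, n_hyps=2)
    for h in result:
        print(f"{h.weight:.2f}  {h.text}")
```

In this toy run, the hypothesis that matches the observed action ends up with most of the weight. In an LLM-based setting, `propose` and `likelihood` would presumably be prompts to the model rather than hand-written rules, which is what removes the need for ground-truth answers.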