AI · May 30

Doing What They Say, Not What They Reason: Locating the Faithfulness Gap in LLM Agents

arXiv:2606.00476 · 72.6

Predicted impact: top 46% in AI · last 90 days · Originality: Synthesis-oriented

AI Analysis

For researchers using LLMs in social simulation, this work identifies and measures a specific faithfulness gap, though it is incremental as it focuses on a controlled setting.

The paper studies whether LLM agents act on the reasoning they state (process fidelity) in a Texas Poker simulator with verifiable reference actions. It decomposes the faithfulness gap into reasoning-conclusion and conclusion-action steps, finding they behave oppositely.

Do LLM agents act on the reasoning they state? This question of process fidelity is central to using LLMs in social simulation, yet it is hard to measure where no reference for correct behavior exists. We study it in a controlled setting, a Texas Poker simulator with a verifiable reference action for every decision, by decomposing the faithfulness gap into two steps: reasoning-conclusion and conclusion-action. The two steps behave oppositely.
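
The two-step decomposition can be illustrated with a minimal sketch. This is not the paper's metric, which the abstract does not specify. It assumes each logged decision carries three labels: the action the reasoning supports, the action the agent states as its conclusion, and the action it actually executes. The `Decision` record, its field names, and `gap_rates` are all hypothetical.

```python
from dataclasses import dataclass
from typing import Sequence, Tuple


@dataclass
class Decision:
    # All three fields are hypothetical annotations, not taken from the paper.
    implied_by_reasoning: str  # action the reasoning trace supports
    stated_conclusion: str     # action the agent says it will take
    executed_action: str       # action actually submitted to the simulator


def gap_rates(decisions: Sequence[Decision]) -> Tuple[float, float]:
    """Return (reasoning-conclusion, conclusion-action) mismatch rates."""
    n = len(decisions)
    if n == 0:
        raise ValueError("need at least one decision")
    # Step 1: does the stated conclusion follow from the reasoning?
    reasoning_conclusion = sum(
        d.implied_by_reasoning != d.stated_conclusion for d in decisions
    ) / n
    # Step 2: does the executed action match the stated conclusion?
    conclusion_action = sum(
        d.stated_conclusion != d.executed_action for d in decisions
    ) / n
    return reasoning_conclusion, conclusion_action
```

Reporting the two rates separately, rather than a single reasoning-to-action mismatch rate, is what allows the steps to be compared and to show opposite behavior.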
