When Slower Isn't Truer: Inverse Scaling Law of Truthfulness in Multimodal Reasoning
This work identifies a critical vulnerability that practitioners deploying reasoning models in multimodal applications should know about: slower reasoning can degrade truthfulness.
The paper investigates whether slower reasoning in multimodal models leads to more truthful answers and finds that it does not: slow-thinking models fabricate plausible but false details when visual inputs are incomplete or misleading, while faster chat models show greater caution. The study uses a 5,000-sample hierarchical prompt dataset annotated by 50 human participants and finds that slower models tend to follow depth-first search (DFS) thinking, which is fragile in ambiguous multimodal settings.
Reasoning models have attracted increasing attention for their ability to tackle complex tasks, embodying the System II (slow thinking) paradigm in contrast to System I (fast, intuitive responses). Yet a key question remains: Does slower reasoning necessarily lead to more truthful answers? Our findings suggest otherwise. We conduct the first systematic study of the inverse scaling law in slow-thinking paradigms for multimodal reasoning. We find that when confronted with incomplete or misleading visual inputs, slow-thinking models are more prone to fabricating plausible yet false details to justify untruthful reasoning. To analyze this behavior, we construct a 5,000-sample hierarchical prompt dataset annotated by 50 human participants. The prompts progressively increase in complexity, revealing a consistent pattern: slower reasoning models tend to follow depth-first search (DFS) thinking, persistently exploring flawed premises, while faster chat models favor breadth-first search (BFS) inference, showing greater caution under uncertainty. These findings reveal a critical vulnerability of reasoning models: while effective in structured domains such as math, their DFS-style reasoning becomes fragile when confronted with ambiguous, multimodal inputs.
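The DFS versus BFS contrast can be illustrated with a toy search over reasoning steps. This is only an analogy, not the paper's method or data: the tree, its node labels, and the helper names `dfs_order`, `bfs_order`, and `first_answer` are all hypothetical. The sketch assumes that a reasoner commits to the first conclusion it reaches, and shows that depth-first exploration reaches a conclusion built on an unverified premise before it ever considers the cautious alternative.

```python
from collections import deque

# Hypothetical toy tree of reasoning steps (not taken from the paper).
# The first branch starts from a premise the image does not support.
TREE = {
    "question": ["assume the hidden object is a cup", "the image is insufficient"],
    "assume the hidden object is a cup": ["the cup is red"],
    "the cup is red": ["answer: a red cup"],
    "the image is insufficient": ["answer: cannot tell"],
}


def dfs_order(tree, root):
    """Visit nodes depth-first: follow one branch to its end before trying siblings."""
    order, stack = [], [root]
    while stack:
        node = stack.pop()
        order.append(node)
        # Reverse so that the leftmost child is expanded first.
        stack.extend(reversed(tree.get(node, [])))
    return order


def bfs_order(tree, root):
    """Visit nodes breadth-first: consider every alternative at a level before going deeper."""
    order, queue = [], deque([root])
    while queue:
        node = queue.popleft()
        order.append(node)
        queue.extend(tree.get(node, []))
    return order


def first_answer(order):
    """Return the first conclusion reached in a visiting order."""
    return next(node for node in order if node.startswith("answer:"))


print(first_answer(dfs_order(TREE, "question")))
print(first_answer(bfs_order(TREE, "question")))
```

In this toy setup the depth-first order elaborates the unsupported premise all the way to a fabricated detail, while the breadth-first order weighs the "insufficient image" alternative early and reaches the cautious conclusion first. Real models do not literally traverse such a tree; the sketch only makes the terminology concrete.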