The Scaling Properties of Implicit Deductive Reasoning in Transformers

arXiv:2605.04330 · 53.7 · h-index: 17

Predicted impact: top 69% in AI · last 90 days
Originality: Synthesis-oriented

AI Analysis

For researchers working on reasoning in Transformers, this work offers insight into how implicit reasoning scales. The contribution is incremental, since it largely confirms known limitations of implicit methods.

The paper studies how Transformers perform implicit deductive reasoning over Horn clauses, finding that deep models with bidirectional masks approach chain-of-thought performance across graph topologies and widths, but chain-of-thought is still needed for depth extrapolation.

We investigate the scaling properties of implicit deductive reasoning over Horn clauses in depth-bounded Transformers. By systematically decorrelating provability from spurious features and enforcing algorithmic alignment, we find that in sufficiently deep models with a bidirectional prefix mask, implicit reasoning approaches explicit CoT performance across graph topologies and problem widths, though CoT remains necessary for depth extrapolation.
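To make the task concrete, here is a minimal sketch of deciding provability over propositional Horn clauses by forward chaining. It is an illustration only, not the authors' data generator or model: the `Rule` encoding, the `forward_chain` name, and the use of the derivation round as a notion of reasoning depth are all assumptions introduced here.

```python
from typing import Dict, FrozenSet, Iterable, Tuple

# Hypothetical encoding: a rule is (body atoms, head atom).
# A rule with an empty body is a fact.
Rule = Tuple[FrozenSet[str], str]


def forward_chain(rules: Iterable[Rule]) -> Dict[str, int]:
    """Map each provable atom to the round in which it is first derived.

    Rounds are synchronous: a rule fires in round k only if every atom in
    its body was derived in a round before k. Atoms missing from the result
    are not provable from the given rules.
    """
    rule_list = list(rules)
    depth: Dict[str, int] = {}
    round_no = 0
    while True:
        round_no += 1
        # Collect every head that becomes derivable in this round.
        new_atoms = {
            head
            for body, head in rule_list
            if head not in depth and all(atom in depth for atom in body)
        }
        if not new_atoms:
            return depth
        for atom in new_atoms:
            depth[atom] = round_no


example_rules = [
    (frozenset(), "a"),            # fact: a
    (frozenset({"a"}), "b"),       # a -> b
    (frozenset({"a", "b"}), "c"),  # a, b -> c
    (frozenset({"d"}), "e"),       # d -> e (d is never derived)
]
depths = forward_chain(example_rules)
print(depths)          # atoms a, b, c with rounds 1, 2, 3
print("e" in depths)   # False: e is not provable
```

In this sketch, a query that first becomes derivable in round k needs k sequential derivation steps. Under that reading, depth extrapolation would mean answering queries whose round exceeds anything seen in training, which explicit CoT can spell out step by step.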

