AI · CL · Dec 19, 2025

When Reasoning Meets Its Laws

arXiv:2512.17901v1 · 2 citations · h-index: 4
Originality: Incremental advance
AI Analysis

This work addresses suboptimal reasoning in LRMs, a foundational issue for AI systems. It is rated incremental because it builds on existing models, adding a new evaluation and a finetuning approach.

The paper tackles counterintuitive reasoning behaviors in Large Reasoning Models (LRMs) by proposing the Laws of Reasoning (LoRe) framework and a benchmark, LoRe-Bench, to evaluate them. It finds that most models lack compositionality, and it shows that finetuning to enforce compute-law compositionality improves reasoning performance on multiple benchmarks.

Despite the superior performance of Large Reasoning Models (LRMs), their reasoning behaviors are often counterintuitive, leading to suboptimal reasoning capabilities. To theoretically formalize the desired reasoning behaviors, this paper presents the Laws of Reasoning (LoRe), a unified framework that characterizes intrinsic reasoning patterns in LRMs. We first propose the compute law, which hypothesizes that reasoning compute should scale linearly with question complexity. Beyond compute, we extend LoRe with a supplementary accuracy law. Because question complexity is difficult to quantify in practice, we examine these hypotheses through two properties of the laws: monotonicity and compositionality. We therefore introduce LoRe-Bench, a benchmark that systematically measures these two tractable properties for large reasoning models. Evaluation shows that most reasoning models exhibit reasonable monotonicity but lack compositionality. In response, we develop an effective finetuning approach that enforces compute-law compositionality. Extensive empirical studies demonstrate that better compliance with the compute law yields consistently improved reasoning performance on multiple benchmarks and uncovers synergistic effects across properties and laws. Project page: https://lore-project.github.io/
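
The two properties named in the abstract can be illustrated with a small Python sketch. The abstract does not give the paper's formal definitions, so everything here is an assumption made for illustration: compute is taken to be a count of reasoning tokens, monotonicity is read as "mean compute does not decrease as questions get harder", and compositionality is read as "compute on a question composed of two independent parts is close to the sum of the compute on each part", which is what a linear compute law would suggest if complexity is additive. The function names and the token counts are hypothetical and do not come from LoRe-Bench.

```python
from statistics import mean


def is_monotone(compute_by_level, tol=0.0):
    """Check that mean compute is non-decreasing across complexity levels.

    compute_by_level: list of lists of token counts, ordered from the
    easiest level to the hardest (an assumed ordering).
    """
    means = [mean(tokens) for tokens in compute_by_level]
    return all(later >= earlier - tol for earlier, later in zip(means, means[1:]))


def compositionality_gap(compute_a, compute_b, compute_ab):
    """Relative deviation of composite compute from the sum of its parts.

    A gap of 0.0 means compute on the composite question equals
    compute_a + compute_b exactly (the assumed additive reading).
    """
    expected = compute_a + compute_b
    return abs(compute_ab - expected) / expected


# Hypothetical reasoning-token counts, invented for illustration only.
levels = [[120, 140], [260, 300], [510, 470]]
print(is_monotone(levels))
print(compositionality_gap(300.0, 500.0, 1000.0))
```

Under this reading, a model could pass the monotonicity check while still showing a large compositionality gap, which is the pattern the abstract reports for most reasoning models.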

