CL · Apr 22

TRACES: Tagging Reasoning Steps for Adaptive Cost-Efficient Early-Stopping

Yannis Belkhiter, Seshu Tirupathi, Giulio Zizzo, John D. Kelleher

arXiv:2604.21057 · 91.9 · h-index: 12

Predicted impact: top 17% in CL · last 90 days
Originality: Incremental advance

AI Analysis

For practitioners deploying LRMs, this provides a cost-efficient method to reduce inference overhead without significant accuracy loss.

TRACES tags reasoning steps in real-time to enable adaptive early stopping of language reasoning models, achieving 20-50% token reduction while maintaining comparable accuracy on five benchmarks.

The field of Language Reasoning Models (LRMs) has been very active over the past few years, with advances in training and inference techniques enabling LRMs to reason longer and more accurately. However, a growing body of studies shows that LRMs are still inefficient, over-generating verification and reflection steps. Additionally, the high-level role of each reasoning step, and how different step types contribute to the generation of correct answers, is largely underexplored. To address this challenge, we introduce TRACES (Tagging of the Reasoning steps enabling Adaptive Cost-Efficient early-Stopping), a lightweight framework that tags reasoning steps in real time and enables adaptive, cost-efficient early stopping of large-language-model inference. Building on this framework, we monitor reasoning behaviors during inference and find that LRMs tend to shift their reasoning behavior after reaching a correct answer. We demonstrate that monitoring specific types of steps can produce effective, interpretable early-stopping criteria. We evaluate the TRACES framework on three mathematical reasoning benchmarks (MATH500, GSM8K, and AIME) and two knowledge and reasoning benchmarks (MMLU and GPQA). We achieve a 20 to 50% token reduction while maintaining accuracy comparable to standard generation.
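
The abstract's core idea (tag each step as it streams in, then stop once the behavior shifts after an answer appears) can be illustrated with a minimal sketch. This is not the authors' tagger or stopping criterion: the keyword cues, the three tag names, the `tag_step` and `run_with_early_stop` functions, and the `patience` parameter are all hypothetical, chosen only to make the mechanism concrete.

```python
from typing import Iterable, List, Tuple

# Hypothetical cue phrases; the paper's actual tagger is not described in the abstract.
VERIFY_CUES = ("wait", "let me check", "double-check", "verify", "alternatively")
ANSWER_CUES = ("the answer is", "therefore", "\\boxed")


def tag_step(step: str) -> str:
    """Assign a coarse tag to one reasoning step using keyword cues."""
    text = step.lower()
    if any(cue in text for cue in ANSWER_CUES):
        return "answer"
    if any(cue in text for cue in VERIFY_CUES):
        return "verify"
    return "derive"


def run_with_early_stop(
    steps: Iterable[str], patience: int = 2
) -> Tuple[List[str], bool]:
    """Consume steps as they arrive and stop early (assumed rule) once an
    answer step has been seen and `patience` consecutive verification
    steps follow it. Returns the steps kept and whether we stopped early."""
    kept: List[str] = []
    seen_answer = False
    verify_run = 0
    for step in steps:
        kept.append(step)
        tag = tag_step(step)
        if tag == "answer":
            seen_answer = True
            verify_run = 0
        elif tag == "verify" and seen_answer:
            verify_run += 1
            if verify_run >= patience:
                return kept, True
        else:
            # Any other step breaks the run of post-answer verification.
            verify_run = 0
    return kept, False
```

In a real deployment the `steps` iterable would presumably be fed by a streaming decoder that splits generated text into steps, and the tagger would likely be a learned classifier rather than keyword matching; both are assumptions here.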
