AI | CL | Sep 27, 2025

Your Models Have Thought Enough: Training Large Reasoning Models to Stop Overthinking

arXiv:2509.23392v2 | 5 citations | h-index: 22 | Has Code
AI Analysis

This paper addresses efficiency issues in large reasoning models and represents an incremental improvement over existing methods.

The paper tackles the high computational cost that large reasoning models incur through overthinking. It proposes a method that improves reasoning efficiency without sacrificing accuracy: DeepSeek-Distill-Qwen-1.5B gains 4.6% accuracy while its output length drops by 46.3% on the Olympiad benchmark.

Large Reasoning Models (LRMs) have achieved impressive performance on challenging tasks, yet their deep reasoning often incurs substantial computational costs. Existing reinforcement learning methods that aim for efficient reasoning still struggle to construct short reasoning paths during the rollout stage, which limits effective learning. Inspired by Evidence Accumulation Models, we find that LRMs accumulate sufficient information early in reasoning, making further reasoning steps redundant. Based on this insight, we propose Just-Enough Thinking (JET), which trains models to proactively terminate unnecessary reasoning. JET performs trajectory truncation during rollout to expose the model to short, distributionally consistent reasoning paths. In addition, it uses a quality-controlled length reward to encourage concise reasoning while maintaining correctness. Extensive experiments demonstrate that JET significantly improves reasoning efficiency without sacrificing accuracy. Notably, DeepSeek-Distill-Qwen-1.5B achieves a 4.6% accuracy gain while reducing output length by 46.3% on the Olympiad benchmark. Our code is available on GitHub.
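The two ingredients named in the abstract, trajectory truncation and a quality-controlled length reward, can be illustrated with a minimal Python sketch. This is not the paper's implementation. The function names, the `STOP_CUE` marker, the `bonus` weight, and the linear reward shape are all hypothetical, chosen only to show the general idea: cut a sampled reasoning path short, and reward brevity only when the final answer is still correct.

```python
from typing import List

# Hypothetical marker that tells the model to stop reasoning and answer.
STOP_CUE = "</think>"


def truncate_trajectory(steps: List[str], keep: int) -> List[str]:
    """Keep the first `keep` reasoning steps and append a stop cue.

    Assumed stand-in for truncation during rollout: the prefix comes from
    the model's own sampled path, so it stays close to its distribution.
    """
    if keep < 0:
        raise ValueError("keep must be non-negative")
    return steps[:keep] + [STOP_CUE]


def length_reward(correct: bool, length: int, max_length: int,
                  bonus: float = 0.5) -> float:
    """Assumed reward shape: brevity only counts when the answer is correct.

    Incorrect answers get 0. Correct answers get 1 plus a bonus that grows
    linearly as the output gets shorter relative to `max_length`.
    """
    if max_length <= 0:
        raise ValueError("max_length must be positive")
    if not correct:
        return 0.0
    saved = 1.0 - min(length, max_length) / max_length
    return 1.0 + bonus * saved


if __name__ == "__main__":
    path = ["step 1", "step 2", "step 3", "step 4"]
    print(truncate_trajectory(path, keep=2))
    print(length_reward(correct=True, length=50, max_length=100))
    print(length_reward(correct=False, length=10, max_length=100))
```

Gating the length bonus on correctness is one simple way to keep a model from learning that short but wrong answers pay off; the paper's actual reward may differ.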

