CL · Jun 15, 2025

QFFT, Question-Free Fine-Tuning for Adaptive Reasoning

Wanlong Liu, Junxiao Xu, Fei Yu, Yukang Lin, Ke Ji, Wenyu Chen, Yan Xu, Yasheng Wang, Lifeng Shang, Benyou Wang

arXiv:2506.12860v1 · 19.922 citations · h-index: 9 · Has Code

Originality: Incremental advance

AI Analysis

This work addresses efficiency and robustness issues in reasoning models for tasks such as mathematics, though it is incremental because it builds on existing CoT methods.

The paper tackles overthinking in Long Chain-of-Thought (CoT) reasoning models, which generate redundant steps for simple questions. It proposes Question-Free Fine-Tuning (QFFT), which lets a model adaptively use both Long and Short CoT patterns, reducing average response length by more than 50% while maintaining performance comparable to Supervised Fine-Tuning (SFT).

Recent advancements in Long Chain-of-Thought (CoT) reasoning models have improved performance on complex tasks, but they suffer from overthinking, which generates redundant reasoning steps, especially for simple questions. This paper revisits the reasoning patterns of Long and Short CoT models, observing that the Short CoT patterns offer concise reasoning efficiently, while the Long CoT patterns excel in challenging scenarios where the Short CoT patterns struggle. To enable models to leverage both patterns, we propose Question-Free Fine-Tuning (QFFT), a fine-tuning approach that removes the input question during training and learns exclusively from Long CoT responses. This approach enables the model to adaptively employ both reasoning patterns: it prioritizes the Short CoT patterns and activates the Long CoT patterns only when necessary. Experiments on various mathematical datasets demonstrate that QFFT reduces average response length by more than 50%, while achieving performance comparable to Supervised Fine-Tuning (SFT). Additionally, QFFT exhibits superior performance compared to SFT in noisy, out-of-domain, and low-resource scenarios.
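
The difference between SFT and QFFT data construction described in the abstract can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: `toy_tokenize`, `build_sft_example`, and `build_qfft_example` are hypothetical names, the tokenizer is a stand-in, and masking question tokens out of the SFT loss with an ignore index of -100 is an assumed (though common) convention that the abstract does not specify.

```python
from typing import Dict, List

# Label value that cross-entropy losses are commonly configured to skip
# (assumption: the abstract does not state how SFT labels are masked).
IGNORE_INDEX = -100


def toy_tokenize(text: str) -> List[int]:
    """Hypothetical stand-in tokenizer: one pseudo-id (0..999) per word."""
    return [sum(ord(ch) for ch in word) % 1000 for word in text.split()]


def build_sft_example(question: str, long_cot: str) -> Dict[str, List[int]]:
    """Standard SFT: the question is context; loss is taken on the response only."""
    q_ids = toy_tokenize(question)
    r_ids = toy_tokenize(long_cot)
    return {
        "input_ids": q_ids + r_ids,
        "labels": [IGNORE_INDEX] * len(q_ids) + r_ids,
    }


def build_qfft_example(question: str, long_cot: str) -> Dict[str, List[int]]:
    """QFFT-style: the question is dropped; the model sees only the Long CoT response."""
    del question  # intentionally unused: the input question is removed during training
    r_ids = toy_tokenize(long_cot)
    return {"input_ids": r_ids, "labels": list(r_ids)}


if __name__ == "__main__":
    question = "What is 2 + 2 ?"
    long_cot = "Let me check . 2 + 2 is 4 . The answer is 4"
    print(build_sft_example(question, long_cot))
    print(build_qfft_example(question, long_cot))
```

In this sketch the two builders yield the same supervised response tokens; the only change is that the QFFT example carries no question tokens in its input, which mirrors the abstract's statement that QFFT "removes the input question during training and learns exclusively from Long CoT responses."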
