CL, AI | Dec 14, 2024

Rethinking Chain-of-Thought from the Perspective of Self-Training

Zongqian Wu, Baoduo Xu, Ruochen Cui, Mengmeng Zhan, Xiaofeng Zhu, Lei Feng

arXiv:2412.10827v47.210 citations | h-index: 9 | Has Code | ICML

Originality: Incremental advance

AI Analysis

This work addresses reasoning inefficiencies in large language models, representing an incremental improvement over existing chain-of-thought methods.

The paper aims to improve chain-of-thought reasoning in LLMs. It proposes a framework that combines task-specific prompts with adaptive reasoning iterations to address over-reasoning and high similarity between consecutive iterations, and it reports gains in both performance and computational efficiency.

Chain-of-thought (CoT) reasoning has emerged as an effective approach for activating latent capabilities in LLMs. Interestingly, we observe that CoT reasoning and self-training share the same core objective: iteratively leveraging model-generated information to progressively reduce prediction uncertainty. Building on this insight, we propose a novel CoT framework to improve reasoning performance. Our framework integrates two key components: (i) a task-specific prompt module that optimizes the initial reasoning process, and (ii) an adaptive reasoning iteration module that dynamically refines the reasoning process and addresses the limitations of previous CoT approaches, i.e., over-reasoning and high similarity between consecutive reasoning iterations. Extensive experiments demonstrate that the proposed method achieves significant advantages in both performance and computational efficiency.
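To make the idea of adaptive reasoning iterations concrete, here is a minimal Python sketch of one way such a loop could look. It is not the authors' implementation. The names `adaptive_cot`, `sample`, `entropy_threshold`, and `similarity_threshold` are hypothetical, and the choices of Shannon entropy over sampled answers as the uncertainty measure and token-level Jaccard overlap as the similarity measure are assumptions made for illustration. The sketch stops when uncertainty is low (to avoid over-reasoning) or when a new reasoning trace is nearly identical to the previous one; the paper's actual handling of similar iterations may differ.

```python
import math
from collections import Counter
from typing import Callable, List, Tuple

# Hypothetical sampler type: (question, previous_reasoning) -> [(reasoning, answer), ...]
Sampler = Callable[[str, str], List[Tuple[str, str]]]


def answer_entropy(answers: List[str]) -> float:
    """Shannon entropy (in nats) of the empirical distribution of answers."""
    counts = Counter(answers)
    total = len(answers)
    return -sum((c / total) * math.log(c / total) for c in counts.values())


def jaccard(a: str, b: str) -> float:
    """Token-level Jaccard similarity between two reasoning traces."""
    sa, sb = set(a.lower().split()), set(b.lower().split())
    if not sa and not sb:
        return 1.0
    return len(sa & sb) / len(sa | sb)


def adaptive_cot(
    question: str,
    sample: Sampler,
    max_iters: int = 5,
    entropy_threshold: float = 0.3,      # assumed value, not from the paper
    similarity_threshold: float = 0.9,   # assumed value, not from the paper
) -> Tuple[str, int]:
    """Iterate CoT rounds; return (majority answer, number of rounds used)."""
    previous = ""
    majority = ""
    for step in range(1, max_iters + 1):
        samples = sample(question, previous)
        answers = [ans for _, ans in samples]
        majority = Counter(answers).most_common(1)[0][0]

        # Stop as soon as the sampled answers agree enough (avoids over-reasoning).
        if answer_entropy(answers) <= entropy_threshold:
            return majority, step

        # Pick one reasoning trace that supports the majority answer.
        reasoning = next(r for r, a in samples if a == majority)

        # Stop if this round barely differs from the last one (assumed policy).
        if previous and jaccard(reasoning, previous) >= similarity_threshold:
            return majority, step

        previous = reasoning
    return majority, max_iters
```

In practice `sample` would wrap an LLM call that draws several reasoning traces per round, conditioned on the previous round's reasoning. Both thresholds trade accuracy against compute and would need tuning per task.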
