Self-Anchor: Large Language Model Reasoning via Step-by-step Attention Alignment
Self-Anchor addresses a practical bottleneck for LLM users on reasoning tasks and offers a lightweight solution that requires no retraining, though it appears incremental, since it enhances existing prompting methods rather than replacing them.
The paper tackles the problem of large language models paying insufficient attention to critical intermediate steps in long reasoning chains. It proposes Self-Anchor, which aligns the model's attention with those steps, and reports that the method outperforms state-of-the-art prompting methods across six benchmarks and narrows the performance gap between non-reasoning and specialized reasoning models.
To solve complex reasoning tasks with Large Language Models (LLMs), prompting-based methods offer a lightweight alternative to fine-tuning and reinforcement learning. However, as reasoning chains grow longer, critical intermediate steps and the original prompt become buried in the context, receive insufficient attention, and lead to errors. In this paper, we propose Self-Anchor, a novel pipeline that leverages the inherent structure of reasoning to steer LLM attention. Self-Anchor decomposes reasoning trajectories into structured plans and automatically aligns the model's attention to the most relevant inference steps, allowing the model to maintain focus throughout generation. Our experiments show that Self-Anchor outperforms SOTA prompting methods across six benchmarks. Notably, Self-Anchor significantly reduces the performance gap between "non-reasoning" models and specialized reasoning models, with the potential to enable most LLMs to tackle complex reasoning tasks without retraining.
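The idea of steering attention toward the question and the current plan step can be illustrated with a toy sketch. This is not the authors' implementation. The segment layout, the `boost` value, and the helpers `steer_attention` and `anchors_for` are hypothetical. The additive bias on attention logits is an assumed stand-in for whatever steering mechanism the paper actually uses.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def steer_attention(logits, anchor_positions, boost=2.0):
    """Add a constant bonus to the logits at anchor positions, then renormalize.

    `boost` is a hypothetical steering strength, not a value from the paper.
    """
    biased = [x + boost if i in anchor_positions else x
              for i, x in enumerate(logits)]
    return softmax(biased)


# Hypothetical context layout (token index ranges), assumed for illustration:
# the question, two plan steps, and two trailing tokens of later reasoning.
segments = {
    "question": range(0, 2),
    "step_1": range(2, 4),
    "step_2": range(4, 6),
}


def anchors_for(current_step):
    """Anchor on the original question plus the plan step being executed."""
    return set(segments["question"]) | set(segments[current_step])


logits = [0.0] * 8  # uniform raw attention scores over 8 context tokens
plain = softmax(logits)
steered = steer_attention(logits, anchors_for("step_2"))

anchor_mass_before = sum(plain[i] for i in anchors_for("step_2"))
anchor_mass_after = sum(steered[i] for i in anchors_for("step_2"))
```

In this sketch the attention mass on the question and on `step_2` grows after steering, while tokens outside the anchor set lose mass. A real system would apply such a bias inside the model's attention layers and would need some way to choose which step is currently relevant.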