LG, AI, CL | Jun 4, 2024

Iteration Head: A Mechanistic Study of Chain-of-Thought

arXiv:2406.02128v2 | 37 citations
Originality: Incremental advance
AI Analysis

This paper provides mechanistic insight into how CoT capabilities arise in transformers, an incremental advance for interpretability research.

The paper addresses the limited understanding of how Chain-of-Thought reasoning emerges in transformers. It identifies specialized 'iteration heads' as a mechanism for iterative reasoning, and shows both how these heads emerge and how the resulting CoT skills transfer across tasks.

Chain-of-Thought (CoT) reasoning is known to improve Large Language Models both empirically and in terms of theoretical approximation power. However, our understanding of the inner workings of CoT capabilities, and of the conditions under which they appear, remains limited. This paper helps fill this gap by demonstrating how CoT reasoning emerges in transformers in a controlled and interpretable setting. In particular, we observe the appearance of a specialized attention mechanism dedicated to iterative reasoning, which we call "iteration heads". We track both the emergence and the precise working of these iteration heads down to the attention level, and measure how well the CoT skills they give rise to transfer between tasks.

Code Implementations: 1 repo
