Understanding and Controlling Repetition Neurons and Induction Heads in In-Context Learning
This work addresses how to control repetitive behavior in LLMs, a problem relevant to both researchers and practitioners, but the contribution is incremental because it builds on prior studies of attention mechanisms.
The paper investigates how repetition neurons in large language models affect in-context learning (ICL) performance. It finds that their impact varies with layer depth and proposes strategies to reduce repetitive outputs without harming ICL capabilities.
This paper investigates the relationship between large language models' (LLMs) ability to recognize repetitive input patterns and their performance on in-context learning (ICL). In contrast to prior work that has primarily focused on attention heads, we examine this relationship from the perspective of skill neurons, specifically repetition neurons. Our experiments reveal that the impact of these neurons on ICL performance varies depending on the depth of the layer in which they reside. By comparing the effects of repetition neurons and induction heads, we further identify strategies for reducing repetitive outputs while maintaining strong ICL capabilities.
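To make the idea of a neuron-level intervention concrete, the following is a minimal illustrative sketch, not the paper's procedure. The abstract does not say how repetition neurons are identified or ablated, so the sketch assumes a simple scheme: rank the neurons of one layer by the gap between their mean activation on repetitive inputs and on non-repetitive inputs, then zero out the top-ranked ones. The function names, the toy activation values, and the zero-ablation choice are all hypothetical.

```python
from statistics import mean


def rank_repetition_neurons(rep_acts, base_acts, top_k):
    """Return indices of the top_k neurons with the largest activation gap.

    rep_acts / base_acts: lists of per-example activation vectors (lists of
    floats) from ONE layer, on repetitive and non-repetitive inputs.
    Assumption: "repetition neuron" means a large positive mean gap.
    """
    n_neurons = len(rep_acts[0])
    gaps = [
        mean(ex[i] for ex in rep_acts) - mean(ex[i] for ex in base_acts)
        for i in range(n_neurons)
    ]
    # Sort neuron indices by gap, largest first.
    order = sorted(range(n_neurons), key=lambda i: gaps[i], reverse=True)
    return order[:top_k]


def ablate(activations, neuron_ids):
    """Zero the selected neurons in a single activation vector (assumed ablation)."""
    ids = set(neuron_ids)
    return [0.0 if i in ids else a for i, a in enumerate(activations)]


# Toy activations for a 4-neuron layer (made-up numbers, for illustration only).
rep_acts = [[0.1, 2.0, 0.3, 1.5], [0.2, 1.8, 0.1, 1.7]]
base_acts = [[0.1, 0.2, 0.4, 1.6], [0.3, 0.1, 0.2, 1.4]]

selected = rank_repetition_neurons(rep_acts, base_acts, top_k=2)
ablated = ablate([0.5, 0.9, 0.2, 0.7], selected)
print(selected, ablated)
```

Under these assumptions, a layer-depth comparison of the kind the abstract describes would repeat the ranking and ablation separately for each layer and measure ICL accuracy and output repetition after each intervention.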