CL · Oct 17, 2024

Repetition Neurons: How Do Language Models Produce Repetitions?

arXiv:2410.13497v2 · 19 citations · h-index: 5
Originality: Synthesis-oriented
AI Analysis

This paper addresses the problem of repetitive text generation in language models, a topic of interest to AI researchers. Its contribution is incremental: it analyzes existing models and does not propose a new solution.

The paper identifies 'repetition neurons' in pre-trained language models that become more active as text repetition continues. This suggests that the models treat repetition as a copying task akin to in-context learning. The analysis covers three English models and one Japanese model.

This paper introduces repetition neurons, regarded as skill neurons responsible for the repetition problem in text generation tasks. These neurons are progressively activated more strongly as repetition continues, indicating that they perceive repetition as a task to copy the previous context repeatedly, similar to in-context learning. We identify these repetition neurons by comparing activation values before and after the onset of repetition in texts generated by recent pre-trained language models. We analyze the repetition neurons in three English and one Japanese pre-trained language models and observe similar patterns across them.
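
The identification step described in the abstract (comparing activation values before and after the onset of repetition) can be illustrated with a minimal sketch. This is not the authors' implementation: the data layout `activations[t][n]`, the function names `repetition_scores` and `top_k_neurons`, the use of a simple mean difference as the score, and the toy numbers are all assumptions made for illustration, and the paper's actual windowing and ranking may differ.

```python
from statistics import mean


def repetition_scores(activations, onset):
    """Score each neuron by how much its mean activation rises after repetition starts.

    activations[t][n] is the activation of neuron n at token position t
    (a hypothetical layout chosen for this sketch).
    onset is the index of the first token of the repeated span.
    """
    if not 0 < onset < len(activations):
        raise ValueError("onset must split the sequence into two non-empty parts")
    n_neurons = len(activations[0])
    scores = []
    for n in range(n_neurons):
        before = mean(row[n] for row in activations[:onset])
        after = mean(row[n] for row in activations[onset:])
        scores.append(after - before)
    return scores


def top_k_neurons(scores, k):
    """Return the indices of the k neurons with the largest score."""
    return sorted(range(len(scores)), key=lambda n: scores[n], reverse=True)[:k]


# Toy data (made up): 4 token positions, 3 neurons, repetition starting at position 2.
acts = [
    [0.1, 0.5, 0.0],
    [0.2, 0.4, 0.1],
    [0.9, 0.5, 0.0],
    [1.1, 0.4, 0.1],
]
scores = repetition_scores(acts, onset=2)
print(top_k_neurons(scores, k=1))  # prints [0]
```

In this toy example only neuron 0 is more active after the onset, so it is the single candidate returned. With real models, the activations would be collected from the feed-forward layers over many generated texts and the scores averaged before ranking.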

