LG · Sep 5, 2023

Explaining grokking through circuit efficiency

Berkeley
arXiv:2309.02390v1 · 98 citations · h-index: 20
Originality: Highly original
AI Analysis

For machine learning researchers, this paper addresses a fundamental problem in understanding neural network generalization by offering a mechanistic explanation for grokking.

The paper tackles the puzzle of grokking, in which a neural network transitions from memorization to generalization after extended training. It proposes that grokking arises from differences in circuit efficiency: the generalizing solution is slower to learn but more efficient than the memorizing one. The paper confirms its predictions experimentally, including two novel behaviors, ungrokking and semi-grokking.

One of the most surprising puzzles in neural network generalisation is grokking: a network with perfect training accuracy but poor generalisation will, upon further training, transition to perfect generalisation. We propose that grokking occurs when the task admits a generalising solution and a memorising solution, where the generalising solution is slower to learn but more efficient, producing larger logits with the same parameter norm. We hypothesise that memorising circuits become more inefficient with larger training datasets while generalising circuits do not, suggesting there is a critical dataset size at which memorisation and generalisation are equally efficient. We make and confirm four novel predictions about grokking, providing significant evidence in favour of our explanation. Most strikingly, we demonstrate two novel and surprising behaviours: ungrokking, in which a network regresses from perfect to low test accuracy, and semi-grokking, in which a network shows delayed generalisation to partial rather than perfect test accuracy.
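The critical dataset size described in the abstract can be illustrated with a toy calculation. The sketch below is not the paper's method. It assumes, purely for illustration, that the parameter norm a memorising circuit needs to reach a fixed logit grows linearly with the number of training examples, while the norm a generalising circuit needs stays constant. The function names, the linear form, and the constants are all hypothetical.

```python
# Toy illustration of a "critical dataset size". All functional forms and
# constants here are hypothetical assumptions, not taken from the paper.

def memorising_norm(dataset_size: int, cost_per_example: float = 0.01) -> float:
    """Assumed parameter norm a memorising circuit needs for a fixed logit.

    Assumption: the cost grows linearly with the number of training examples.
    """
    return cost_per_example * dataset_size


def generalising_norm(dataset_size: int, fixed_cost: float = 20.0) -> float:
    """Assumed parameter norm a generalising circuit needs for the same logit.

    Assumption: the cost does not depend on the dataset size.
    """
    return fixed_cost


def more_efficient_circuit(dataset_size: int) -> str:
    """Return which circuit reaches the fixed logit with the smaller norm."""
    if memorising_norm(dataset_size) < generalising_norm(dataset_size):
        return "memorising"
    return "generalising"


def critical_dataset_size(max_size: int = 100_000):
    """Smallest dataset size at which memorising stops being cheaper."""
    for size in range(1, max_size + 1):
        if memorising_norm(size) >= generalising_norm(size):
            return size
    return None  # no crossover found within max_size


if __name__ == "__main__":
    d_crit = critical_dataset_size()
    print("critical dataset size:", d_crit)
    for size in (d_crit // 2, d_crit * 2):
        print(size, "->", more_efficient_circuit(size))
```

Under these assumed costs, the memorising circuit is cheaper below the crossover and the generalising circuit is cheaper above it. Real efficiency curves would have to be measured from trained networks.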

