LGJun 30, 2023

The Clock and the Pizza: Two Stories in Mechanistic Explanation of Neural Networks

MicrosoftMIT
arXiv:2306.17844v2164 citationsh-index: 86
AI Analysis

This work highlights the variability in algorithm discovery for neural networks, which is incremental but important for researchers developing tools to understand neural network behavior.

The study investigates whether neural networks trained on algorithmic tasks rediscover known algorithms, using modular addition as a case study, and finds that small changes in hyperparameters can lead to diverse algorithms like the Clock and Pizza, revealing unexpected solution complexity.

Do neural networks, trained on well-understood algorithmic tasks, reliably rediscover known algorithms for solving those tasks? Several recent studies, on tasks ranging from group arithmetic to in-context linear regression, have suggested that the answer is yes. Using modular addition as a prototypical problem, we show that algorithm discovery in neural networks is sometimes more complex. Small changes to model hyperparameters and initializations can induce the discovery of qualitatively different algorithms from a fixed training set, and even parallel implementations of multiple such algorithms. Some networks trained to perform modular addition implement a familiar Clock algorithm; others implement a previously undescribed, less intuitive, but comprehensible procedure which we term the Pizza algorithm, or a variety of even more complex procedures. Our results show that even simple learning problems can admit a surprising diversity of solutions, motivating the development of new tools for characterizing the behavior of neural networks across their algorithmic phase space.

Code Implementations2 repos
Foundations

The foundational work for this paper's niche, ranked by how specifically the neighbourhood builds on it — not by global fame.

Your Notes