LG | OC | ML | May 24, 2022

Quadratic models for understanding catapult dynamics of neural networks

arXiv:2205.11787v3 | 18 citations | h-index: 55
Originality: Synthesis-oriented
AI Analysis

This work provides a tool for analyzing neural networks, but it is incremental as it builds on existing quadratic models and the known catapult phase phenomenon.

The authors address the problem of understanding the catapult phase in neural networks by showing that Neural Quadratic Models can exhibit this behavior and that their generalization parallels that of neural networks, especially under large learning rates.

While neural networks can be approximated by linear models as their width increases, certain properties of wide neural networks cannot be captured by linear models. In this work we show that recently proposed Neural Quadratic Models can exhibit the "catapult phase" [Lewkowycz et al. 2020] that arises when training such models with large learning rates. We then empirically show that the behaviour of neural quadratic models parallels that of neural networks in generalization, especially in the catapult phase regime. Our analysis further demonstrates that quadratic models can be an effective tool for analysis of neural networks.
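The catapult behaviour described above can be illustrated with a minimal sketch, assuming a toy model rather than the paper's own construction or experiments. The model f(u, v) = u·v / sqrt(m) with a single training target of 0 is exactly quadratic in its parameters, so it serves as a small stand-in for a quadratic model. The function name `train`, the width `m`, the step count, and the learning-rate scales are all hypothetical choices made for this illustration. The learning rate is expressed in units of 1/ntk_0, where ntk = (|u|^2 + |v|^2) / m is the tangent kernel of the toy model; a scale below 2 is the usual "lazy" regime, and a scale between 2 and 4 is the range in which a catapult is typically reported for this kind of model.

```python
import numpy as np


def train(lr_scale, m=512, steps=200, seed=0):
    """Gradient descent on f(u, v) = u.v / sqrt(m) with loss f**2 / 2.

    lr_scale is the learning rate in units of 1 / ntk_0, where
    ntk = (|u|^2 + |v|^2) / m is the tangent kernel of this toy model.
    Returns per-step losses, per-step tangent-kernel values, and the
    learning rate actually used.
    """
    rng = np.random.default_rng(seed)
    u = rng.standard_normal(m)
    v = rng.standard_normal(m)
    ntk0 = (u @ u + v @ v) / m
    lr = lr_scale / ntk0

    losses, ntks = [], []
    for _ in range(steps):
        f = (u @ v) / np.sqrt(m)
        losses.append(0.5 * f**2)
        ntks.append((u @ u + v @ v) / m)
        # Simultaneous update: dL/du = f * v / sqrt(m), dL/dv = f * u / sqrt(m).
        u, v = u - lr * f * v / np.sqrt(m), v - lr * f * u / np.sqrt(m)
    return np.array(losses), np.array(ntks), lr


# Small learning rate: the loss shrinks from the first step onward.
small_losses, small_ntks, small_lr = train(lr_scale=0.5)
# Large learning rate: the loss first grows, then collapses (the catapult).
large_losses, large_ntks, large_lr = train(lr_scale=3.0)

print("small lr: peak loss at step", small_losses.argmax())
print("large lr: peak loss at step", large_losses.argmax())
print("large lr: ntk before/after:", large_ntks[0], large_ntks[-1])
```

In this sketch the large-learning-rate run is expected to show a loss spike in the first few steps, followed by convergence with a tangent kernel that has dropped below 2 / lr, while the small-learning-rate run leaves the kernel almost unchanged. A purely linear model has a constant tangent kernel and so cannot reproduce that drop, which is the motivation for studying quadratic models.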

Code Implementations: 1 repo
Foundations

The foundational work for this paper's niche, ranked by how specifically the neighbourhood builds on it — not by global fame.
