LG · ML · Oct 29, 2024

Flavors of Margin: Implicit Bias of Steepest Descent in Homogeneous Neural Networks

arXiv:2410.22069v4 · 11 citations · h-index: 17 · ICLR
Originality: Incremental advance
AI Analysis

This work advances the theoretical understanding of optimization dynamics in neural networks. The contribution is incremental but relevant to researchers in machine learning optimization.

The paper investigates the implicit bias of steepest descent algorithms in deep homogeneous neural networks, showing that an algorithm-dependent geometric margin increases after perfect training accuracy and that limit points correspond to KKT points of margin-maximization problems, with experimental connections to adaptive methods like Adam and Shampoo.

We study the implicit bias of the general family of steepest descent algorithms with infinitesimal learning rate in deep homogeneous neural networks. We show that: (a) an algorithm-dependent geometric margin starts increasing once the networks reach perfect training accuracy, and (b) any limit point of the training trajectory corresponds to a KKT point of the corresponding margin-maximization problem. We experimentally zoom into the trajectories of neural networks optimized with various steepest descent algorithms, highlighting connections to the implicit bias of popular adaptive methods (Adam and Shampoo).
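To make the two notions in the abstract concrete, here is a minimal sketch of steepest descent under three norms, tracking the matching geometric margin. It is not the paper's code or experimental setup. It assumes a linear classifier (the simplest 1-homogeneous model), the exponential loss, a small fixed step size in place of an infinitesimal one, and normalized update directions. The toy data and the names `steepest_direction`, `geometric_margin`, and `run` are hypothetical and chosen only for illustration.

```python
import numpy as np

# Toy, linearly separable data (hypothetical, chosen only for illustration).
X = np.array([[2.0, 0.5], [0.5, 1.0], [-1.0, -1.5], [-2.0, -0.3]])
y = np.array([1.0, 1.0, -1.0, -1.0])

NORM_ORD = {"l2": 2, "linf": np.inf, "l1": 1}


def steepest_direction(grad, norm):
    """Return a unit-norm u (in the chosen norm) that minimizes <grad, u>."""
    if norm == "l2":      # normalized gradient descent
        return -grad / np.linalg.norm(grad)
    if norm == "linf":    # sign descent
        return -np.sign(grad)
    if norm == "l1":      # coordinate descent on the largest gradient entry
        u = np.zeros_like(grad)
        j = int(np.argmax(np.abs(grad)))
        u[j] = -np.sign(grad[j])
        return u
    raise ValueError(f"unknown norm: {norm}")


def geometric_margin(w, norm):
    """min_i y_i <w, x_i> / ||w||, using the same norm as the algorithm."""
    return np.min(y * (X @ w)) / np.linalg.norm(w, ord=NORM_ORD[norm])


def run(norm, steps=2000, lr=0.01):
    """Normalized steepest descent on the exponential loss; returns (w, margins)."""
    w = np.array([0.1, -0.1])  # starts with some points misclassified
    margins = []
    for _ in range(steps):
        # Gradient of sum_i exp(-y_i <w, x_i>) with respect to w.
        grad = -(y * np.exp(-y * (X @ w))) @ X
        w = w + lr * steepest_direction(grad, norm)
        margins.append(geometric_margin(w, norm))
    return w, margins


for norm in NORM_ORD:
    w, margins = run(norm)
    print(norm, "margin after 500 steps:", margins[499], "final:", margins[-1])
```

Each choice of norm yields a different update rule and measures the margin with a different norm, which is the sense in which the margin is algorithm-dependent. A negative margin means some training point is still misclassified; the margin turns positive once the data are separated. For a deeper L-homogeneous network, the denominator would presumably be the parameter norm raised to the power L rather than the norm itself.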
