LG ML Oct 29, 2024

Flavors of Margin: Implicit Bias of Steepest Descent in Homogeneous Neural Networks

Nikolaos Tsilivis, Eitan Gronich, Julia Kempe, Gal Vardi

arXiv:2410.22069v4 17.0 11 citations h-index: 17 ICLR

Originality: Incremental advance

AI Analysis

This work advances the theoretical understanding of optimization dynamics in neural networks. The contribution is incremental but relevant to researchers working on machine learning optimization.

The paper investigates the implicit bias of steepest descent algorithms in deep homogeneous neural networks, showing that an algorithm-dependent geometric margin increases after perfect training accuracy and that limit points correspond to KKT points of margin-maximization problems, with experimental connections to adaptive methods like Adam and Shampoo.

We study the implicit bias of the general family of steepest descent algorithms with infinitesimal learning rate in deep homogeneous neural networks. We show that: (a) an algorithm-dependent geometric margin starts increasing once the networks reach perfect training accuracy, and (b) any limit point of the training trajectory corresponds to a KKT point of the corresponding margin-maximization problem. We experimentally zoom into the trajectories of neural networks optimized with various steepest descent algorithms, highlighting connections to the implicit bias of popular adaptive methods (Adam and Shampoo).
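To make the two objects in the abstract concrete, here is a minimal sketch, not the authors' code. It assumes the standard textbook form of steepest descent (the update is the negative gradient for the l2 norm, sign descent for the l-infinity norm, and a single-coordinate step for the l1 norm, each scaled by the dual norm of the gradient) and assumes the geometric margin of an L-homogeneous network is the smallest value of y_i f(x_i; theta) divided by the L-th power of the norm of theta, using the same norm that defines the algorithm. The function names and the toy model f(x; theta) = theta_0 * theta_1 * x (homogeneous of degree 2) are hypothetical and chosen only for illustration.

```python
import math


def sign(value):
    """Sign of a float, with sign(0) = 0."""
    if value == 0:
        return 0.0
    return math.copysign(1.0, value)


def steepest_descent_direction(grad, norm):
    """Steepest descent update direction for a given norm.

    Returns the unit-norm vector that most decreases the linearized loss,
    scaled by the dual norm of the gradient (assumed textbook definition).
    """
    if norm == "l2":
        # The dual of l2 is l2: this is plain gradient descent.
        return [-g for g in grad]
    if norm == "linf":
        # The dual of l_inf is l1: sign descent scaled by ||grad||_1.
        scale = sum(abs(g) for g in grad)
        return [-scale * sign(g) for g in grad]
    if norm == "l1":
        # The dual of l1 is l_inf: move only the largest-magnitude coordinate.
        top = max(range(len(grad)), key=lambda k: abs(grad[k]))
        return [-grad[k] if k == top else 0.0 for k in range(len(grad))]
    raise ValueError(f"unsupported norm: {norm}")


def norm_value(theta, norm):
    """Value of the chosen norm at the parameter vector theta."""
    if norm == "l2":
        return math.sqrt(sum(t * t for t in theta))
    if norm == "linf":
        return max(abs(t) for t in theta)
    if norm == "l1":
        return sum(abs(t) for t in theta)
    raise ValueError(f"unsupported norm: {norm}")


def toy_model(theta, x):
    """Hypothetical 2-homogeneous model: f(x; c * theta) = c**2 * f(x; theta)."""
    return theta[0] * theta[1] * x


def geometric_margin(theta, xs, ys, norm, degree=2):
    """Smallest y_i * f(x_i; theta), normalized by ||theta||**degree."""
    worst = min(y * toy_model(theta, x) for x, y in zip(xs, ys))
    return worst / norm_value(theta, norm) ** degree


if __name__ == "__main__":
    xs, ys = [1.0, -2.0], [1.0, -1.0]
    theta = [1.0, 2.0]
    for name in ("l2", "linf", "l1"):
        print(name, geometric_margin(theta, xs, ys, name))
```

Because the toy model is homogeneous, rescaling theta leaves each margin unchanged, which is why the margin is a property of the direction of the parameters rather than their size. Each choice of norm gives a different notion of margin, matching the "algorithm-dependent" wording in the abstract.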
