LG May 8, 2017

Geometry of Optimization and Implicit Regularization in Deep Learning

arXiv:1705.03071v1
142 citations
Originality: Incremental advance
AI Analysis

This work asks why deep networks generalize and argues that the answer lies in implicit regularization introduced by the optimization procedure.

The paper argues that generalization in deep learning is controlled not by network size but by implicit regularization arising from the geometry of the optimization. It further shows that changing the optimization procedure can improve generalization even when optimization quality is unaffected.

We argue that the optimization plays a crucial role in generalization of deep learning models through implicit regularization. We do this by demonstrating that generalization ability is not controlled by network size but rather by some other implicit control. We then demonstrate how changing the empirical optimization procedure can improve generalization, even if actual optimization quality is not affected. We do so by studying the geometry of the parameter space of deep networks, and devising an optimization algorithm attuned to this geometry.
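
The abstract does not say which geometric property of the parameter space it has in mind. As an illustration only, and not the paper's algorithm, the sketch below assumes one well-known property of ReLU networks: scaling a hidden unit's incoming weights by c > 0 and its outgoing weight by 1/c leaves the network function unchanged, while the plain gradient changes. All names (`forward`, `rescale_unit`, `grad_w2`) are hypothetical helpers written for this example.

```python
import random

def relu(z):
    return z if z > 0.0 else 0.0

def preact(W1, j, x):
    """Pre-activation of hidden unit j: inner product of its incoming weights with x."""
    return sum(w * xi for w, xi in zip(W1[j], x))

def forward(W1, w2, x):
    """Two-layer ReLU network: sum_j w2[j] * relu(<W1[j], x>)."""
    return sum(w2[j] * relu(preact(W1, j, x)) for j in range(len(w2)))

def rescale_unit(W1, w2, j, c):
    """Scale unit j's incoming weights by c and its outgoing weight by 1/c (c > 0)."""
    W1_new = [row[:] for row in W1]
    w2_new = w2[:]
    W1_new[j] = [c * w for w in W1[j]]
    w2_new[j] = w2[j] / c
    return W1_new, w2_new

def grad_w2(W1, w2, x):
    """Gradient of the network output with respect to the outgoing weights w2."""
    return [relu(preact(W1, j, x)) for j in range(len(w2))]

random.seed(0)
W1 = [[random.gauss(0, 1) for _ in range(3)] for _ in range(4)]
w2 = [random.gauss(0, 1) for _ in range(4)]
x = W1[0][:]  # choose an input for which hidden unit 0 is certainly active

W1_s, w2_s = rescale_unit(W1, w2, j=0, c=10.0)

# Same function value before and after rescaling (up to floating-point error).
same_output = abs(forward(W1, w2, x) - forward(W1_s, w2_s, x)) < 1e-9

# The gradient with respect to the rescaled unit's outgoing weight is not the same.
g = grad_w2(W1, w2, x)
g_s = grad_w2(W1_s, w2_s, x)
print(same_output, g[0], g_s[0])
```

Under this assumption, two parameter settings that compute the same function can produce different plain gradient steps, which is one way an optimizer's geometry can matter independently of network size.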

Code Implementations: 1 repo
