LG · NE · ML · May 13, 2020

Implicit Regularization in Deep Learning May Not Be Explainable by Norms

arXiv:2005.06398v2 · 170 citations
AI Analysis

This work addresses a fundamental theoretical problem in deep learning. It shows that a widely held assumption, namely that implicit regularization can be characterized as minimization of some norm, does not hold in general for matrix factorization, which could shift how generalization is understood.

The paper resolves, in the negative, the open question of whether norms can explain implicit regularization in matrix factorization. It proves that on some natural matrix factorization problems, implicit regularization drives all norms and quasi-norms towards infinity, and it suggests rank minimization as a more useful interpretation.

Mathematically characterizing the implicit regularization induced by gradient-based optimization is a longstanding pursuit in the theory of deep learning. A widespread hope is that a characterization based on minimization of norms may apply, and a standard test-bed for studying this prospect is matrix factorization (matrix completion via linear neural networks). It is an open question whether norms can explain the implicit regularization in matrix factorization. The current paper resolves this open question in the negative, by proving that there exist natural matrix factorization problems on which the implicit regularization drives all norms (and quasi-norms) towards infinity. Our results suggest that, rather than perceiving the implicit regularization via norms, a potentially more useful interpretation is minimization of rank. We demonstrate empirically that this interpretation extends to a certain class of non-linear neural networks, and hypothesize that it may be key to explaining generalization in deep learning.
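The phenomenon described in the abstract can be seen on a toy problem. The following is a minimal sketch, not the authors' code: it assumes a 2x2 completion task with observed entries W[0,1] = 1, W[1,0] = 1, W[1,1] = 0 and unobserved W[0,0], a depth-2 factorization W = W2 @ W1 initialized at the identity, and plain gradient descent. The function name, step count, and learning rate are all illustrative choices. With this initialization W2 stays equal to the transpose of W1, so W remains positive semidefinite and W[0,0] * W[1,1] >= W[0,1]**2. Fitting the observations (W[0,1] close to 1, W[1,1] close to 0) therefore forces the unobserved entry, and with it every norm of W, to grow, while the ratio of the smallest to the largest singular value shrinks towards that of a rank-1 matrix.

```python
import numpy as np


def run_depth2_completion(steps=20000, lr=0.01):
    """Gradient descent on a depth-2 factorization of a 2x2 completion task.

    Illustrative setup (an assumption of this sketch): entries (0,1), (1,0)
    and (1,1) are observed with values 1, 1 and 0; entry (0,0) is unobserved.
    """
    mask = np.array([[0.0, 1.0], [1.0, 1.0]])    # 1 marks an observed entry
    target = np.array([[0.0, 1.0], [1.0, 0.0]])  # value at (0,0) is ignored

    # Identity initialization (hypothetical choice) gives det(W) > 0 at start.
    W1 = np.eye(2)
    W2 = np.eye(2)

    history = []  # rows: (step, loss, unobserved entry, sigma_min / sigma_max)
    log_every = max(1, steps // 10)
    for step in range(steps + 1):
        W = W2 @ W1
        residual = mask * (W - target)       # gradient of the loss w.r.t. W
        loss = 0.5 * np.sum(residual ** 2)
        if step % log_every == 0:
            sigma = np.linalg.svd(W, compute_uv=False)  # sorted descending
            history.append((step, loss, W[0, 0], sigma[-1] / sigma[0]))
        # Chain rule through the product W = W2 @ W1; both gradients are
        # computed before either factor is updated.
        grad_W1 = W2.T @ residual
        grad_W2 = residual @ W1.T
        W1 = W1 - lr * grad_W1
        W2 = W2 - lr * grad_W2
    return W2 @ W1, history


if __name__ == "__main__":
    _, history = run_depth2_completion()
    for step, loss, w00, ratio in history:
        print(f"step={step:6d}  loss={loss:.5f}  W[0,0]={w00:.3f}  "
              f"sigma_min/sigma_max={ratio:.2e}")
```

Running the script prints the loss, the unobserved entry, and the singular value ratio at ten checkpoints. In this sketch the loss falls while the unobserved entry keeps rising, which is the qualitative behaviour the abstract describes: the solution moves towards low rank rather than towards small norm.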

Code Implementations: 1 repo