LG · AI · ML · May 28, 2019

On Dropout and Nuclear Norm Regularization

arXiv:1905.11887v1 · 32 citations
Originality: Incremental advance
AI Analysis

This paper offers theoretical insight into dropout regularization for machine learning researchers. It is an incremental advance that builds on prior work on dropout and nuclear norms.

The paper formally characterizes the explicit regularizer induced by dropout in deep linear networks with squared loss. It shows that the regularizer includes an ℓ₂-path term, that its convex envelope is the squared nuclear norm of the network map, and it characterizes the global optima of the dropout objective for sufficiently large dropout rates.

We give a formal and complete characterization of the explicit regularizer induced by dropout in deep linear networks with squared loss. We show that (a) the explicit regularizer is composed of an $\ell_2$-path regularizer and other terms that are also re-scaling invariant, (b) the convex envelope of the induced regularizer is the squared nuclear norm of the network map, and (c) for a sufficiently large dropout rate, we characterize the global optima of the dropout objective. We validate our theoretical findings with empirical results.
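
To make the notion of an "explicit regularizer" concrete, here is a minimal numerical sketch for the simplest case only: a linear network with a single hidden layer, which is a special case of the deep networks treated in the paper. It checks the standard identity that the expected dropout loss equals the plain squared loss plus a penalty of the form $\lambda \sum_i \|u_i\|^2 (v_i^\top x)^2$ with $\lambda = (1-\theta)/\theta$, and that this penalty is unchanged when each hidden unit's weights are rescaled. The names `U`, `V`, `keep_prob`, and both helper functions are assumptions made for this illustration and are not taken from the paper or its code.

```python
import itertools
import numpy as np

def expected_dropout_loss(U, V, x, y, keep_prob):
    """Exact expected squared loss under Bernoulli dropout on the hidden layer.

    Enumerates all 2^r masks, so it is only practical for small hidden width r.
    The surviving units are rescaled by 1/keep_prob (inverted dropout).
    """
    r = U.shape[1]
    h = V.T @ x  # hidden activations, shape (r,)
    total = 0.0
    for mask in itertools.product([0.0, 1.0], repeat=r):
        b = np.array(mask)
        # Probability of this particular mask under independent Bernoulli draws.
        prob = np.prod(np.where(b == 1.0, keep_prob, 1.0 - keep_prob))
        pred = U @ (b * h) / keep_prob
        total += prob * np.sum((y - pred) ** 2)
    return total

def explicit_regularizer(U, V, x, keep_prob):
    """Penalty lambda * sum_i ||u_i||^2 (v_i^T x)^2 for one input x."""
    lam = (1.0 - keep_prob) / keep_prob
    h = V.T @ x
    return lam * np.sum(np.sum(U ** 2, axis=0) * h ** 2)

rng = np.random.default_rng(0)
d_in, d_out, r = 4, 3, 5          # illustrative sizes, chosen arbitrarily
U = rng.standard_normal((d_out, r))
V = rng.standard_normal((d_in, r))
x = rng.standard_normal(d_in)
y = rng.standard_normal(d_out)
keep_prob = 0.7

# Expected dropout loss versus plain loss plus explicit regularizer.
lhs = expected_dropout_loss(U, V, x, y, keep_prob)
rhs = np.sum((y - U @ V.T @ x) ** 2) + explicit_regularizer(U, V, x, keep_prob)

# Rescaling invariance: scale column i of U by c_i and column i of V by 1/c_i.
c = rng.uniform(0.5, 2.0, size=r)
reg = explicit_regularizer(U, V, x, keep_prob)
reg_scaled = explicit_regularizer(U * c, V / c, x, keep_prob)

print(np.isclose(lhs, rhs), np.isclose(reg, reg_scaled))
```

When the inputs are isotropic, averaging the penalty over $x$ gives $\lambda \sum_i \|u_i\|^2 \|v_i\|^2$, which is the $\ell_2$-path regularizer for this two-layer case. The additional terms that arise in deeper networks are not covered by this sketch.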

Code Implementations: 1 repo