ML · LG · Feb 27, 2018

The Emergence of Spectral Universality in Deep Networks

arXiv:1802.09979v1 · 184 citations
Originality: Incremental advance
AI Analysis

The paper provides theoretical guidance for designing deep networks that learn faster, though the advance is incremental because it builds on prior work on spectral concentration.

The paper analyzes how hyperparameters affect the singular value spectrum of a deep network's input-output Jacobian at initialization. It shows that, for many nonlinearities, the spectrum stays concentrated around one even as the depth goes to infinity.

Recent work has shown that tight concentration of the entire spectrum of singular values of a deep network's input-output Jacobian around one at initialization can speed up learning by orders of magnitude. Therefore, to guide important design choices, it is important to build a full theoretical understanding of the spectra of Jacobians at initialization. To this end, we leverage powerful tools from free probability theory to provide a detailed analytic understanding of how a deep network's Jacobian spectrum depends on various hyperparameters including the nonlinearity, the weight and bias distributions, and the depth. For a variety of nonlinearities, our work reveals the emergence of new universal limiting spectral distributions that remain concentrated around one even as the depth goes to infinity.
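To make the object of study concrete, the following is a minimal numerical sketch of the input-output Jacobian spectrum at initialization. It is not the paper's method, which is analytic and based on free probability; it only samples one random network and computes the singular values directly. The choice of tanh, the function names, and the parameters `width`, `depth`, `sigma_w`, and `sigma_b` are assumptions made for illustration.

```python
import numpy as np

def random_orthogonal(n, rng):
    # QR of a Gaussian matrix; the sign fix makes the result Haar-distributed.
    q, r = np.linalg.qr(rng.standard_normal((n, n)))
    return q * np.sign(np.diag(r))

def jacobian_singular_values(width, depth, sigma_w, sigma_b, orthogonal, rng):
    """Singular values of the input-output Jacobian of one random tanh network.

    Illustrative assumptions: fully connected layers of equal width,
    Gaussian biases with standard deviation sigma_b, and weights that are
    either scaled orthogonal or scaled Gaussian with variance sigma_w^2 / width.
    """
    x = rng.standard_normal(width)   # random input
    jac = np.eye(width)              # Jacobian of the identity map (depth 0)
    for _ in range(depth):
        if orthogonal:
            w = sigma_w * random_orthogonal(width, rng)
        else:
            w = sigma_w * rng.standard_normal((width, width)) / np.sqrt(width)
        b = sigma_b * rng.standard_normal(width)
        x = np.tanh(w @ x + b)
        d = 1.0 - x**2               # tanh'(h) evaluated at the pre-activations
        jac = (d[:, None] * w) @ jac # chain rule: J <- D_l W_l J
    return np.linalg.svd(jac, compute_uv=False)

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    for orthogonal in (False, True):
        s = jacobian_singular_values(200, 20, 1.0, 0.05, orthogonal, rng)
        label = "orthogonal" if orthogonal else "Gaussian"
        print(f"{label:>10}: min={s.min():.3e}  mean={s.mean():.3e}  max={s.max():.3e}")
```

Running the script prints summary statistics of the singular values for Gaussian and orthogonal weights, which gives a rough feel for how the initialization choice changes the spread of the spectrum. A single sample at finite width is only a noisy proxy for the limiting distributions the abstract refers to.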

Code Implementations: 1 repo
