ML · DIS-NN · LG · Jun 4, 2018

Universal Statistics of Fisher Information in Deep Neural Networks: Mean Field Approach

arXiv:1806.01316v3 · 177 citations
Originality: Incremental advance
AI Analysis

This work provides foundational insights into the geometry of deep neural network loss landscapes, with implications for generalization and optimization, though it is incremental in extending mean field theories to Fisher information analysis.

The study reveals universal statistics of the Fisher information matrix in deep neural networks, showing that most eigenvalues are near zero while the maximum eigenvalue is huge, indicating a locally flat but strongly distorted parameter landscape. It demonstrates applications in connecting small eigenvalues to generalization ability and using the maximum eigenvalue to estimate optimal learning rates for gradient convergence.

The Fisher information matrix (FIM) is a fundamental quantity to represent the characteristics of a stochastic model, including deep neural networks (DNNs). The present study reveals novel statistics of FIM that are universal among a wide class of DNNs. To this end, we use random weights and large width limits, which enables us to utilize mean field theories. We investigate the asymptotic statistics of the FIM's eigenvalues and reveal that most of them are close to zero while the maximum eigenvalue takes a huge value. Because the landscape of the parameter space is defined by the FIM, it is locally flat in most dimensions, but strongly distorted in others. Moreover, we demonstrate the potential usage of the derived statistics in learning strategies. First, small eigenvalues that induce flatness can be connected to a norm-based capacity measure of generalization ability. Second, the maximum eigenvalue that induces the distortion enables us to quantitatively estimate an appropriately sized learning rate for gradient methods to converge.
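As a rough numerical illustration of the statistics described in the abstract, the sketch below builds the empirical FIM of a small random-weight network and inspects its eigenvalue spectrum. It is not the paper's derivation or code. The following are assumptions made for the example only: a tanh MLP with a linear output layer, standard normal inputs, a unit-variance Gaussian output model (so the FIM is the average of the outer products of the output gradients), the specific widths and sample count, and the helper names `init_mlp`, `output_gradients`, and `gd_on_quadratic`. The learning-rate check uses the standard stability condition for gradient descent on a quadratic, eta < 2 / lambda_max, as a stand-in for the learning-rate estimate the abstract mentions.

```python
import numpy as np


def init_mlp(widths, rng, sigma_w=1.0, sigma_b=0.1):
    """Random Gaussian parameters: W ~ N(0, sigma_w^2 / fan_in), b ~ N(0, sigma_b^2)."""
    params = []
    for fan_in, fan_out in zip(widths[:-1], widths[1:]):
        W = rng.normal(0.0, sigma_w / np.sqrt(fan_in), size=(fan_out, fan_in))
        b = rng.normal(0.0, sigma_b, size=fan_out)
        params.append((W, b))
    return params


def output_gradients(params, x):
    """Return a (C, P) array whose k-th row is d f_k / d theta at input x."""
    # Forward pass: hs[i] is the input to layer i (tanh hidden units, linear output).
    hs = [x]
    for i, (W, b) in enumerate(params):
        u = W @ hs[-1] + b
        hs.append(u if i == len(params) - 1 else np.tanh(u))
    n_out = hs[-1].shape[0]
    rows = []
    for k in range(n_out):
        delta = np.zeros(n_out)
        delta[k] = 1.0  # d f_k / d (output pre-activation)
        grads = []
        for i in reversed(range(len(params))):
            W, _ = params[i]
            # Gradient w.r.t. (W_i, b_i), flattened into one vector.
            grads.append(np.concatenate([np.outer(delta, hs[i]).ravel(), delta]))
            if i > 0:
                # Back-propagate through tanh: phi'(u) = 1 - tanh(u)^2 = 1 - hs[i]^2.
                delta = (W.T @ delta) * (1.0 - hs[i] ** 2)
        rows.append(np.concatenate(grads[::-1]))
    return np.stack(rows)


def gd_on_quadratic(F, eta, steps=200, seed=0):
    """Gradient descent on 0.5 * theta^T F theta; returns final loss / initial loss."""
    theta = np.random.default_rng(seed).normal(size=F.shape[0])
    loss0 = 0.5 * theta @ F @ theta
    for _ in range(steps):
        theta = theta - eta * (F @ theta)
    return (0.5 * theta @ F @ theta) / loss0


rng = np.random.default_rng(0)
widths = [10, 20, 20, 2]   # assumed sizes, chosen only to keep the example fast
n_samples = 500            # assumed; gives more gradient rows than parameters

params = init_mlp(widths, rng)
X = rng.normal(size=(n_samples, widths[0]))  # assumed standard normal inputs

# Stack per-sample output gradients: J has shape (n_samples * C, P).
J = np.concatenate([output_gradients(params, x) for x in X])
F = J.T @ J / n_samples    # empirical FIM under the Gaussian-output assumption
P = F.shape[0]

eigs = np.linalg.eigvalsh(F)  # ascending order
lam_max = eigs[-1]

print("number of parameters:", P)
print("mean eigenvalue:", eigs.mean())
print("max eigenvalue:", lam_max)
print("fraction of eigenvalues below 1% of max:", np.mean(eigs < 0.01 * lam_max))

# Learning rates just below and just above 2 / lambda_max.
print("loss ratio at eta = 1.9 / lam_max:", gd_on_quadratic(F, 1.9 / lam_max))
print("loss ratio at eta = 2.1 / lam_max:", gd_on_quadratic(F, 2.1 / lam_max))
```

Comparing the printed mean and maximum eigenvalues gives a feel for the "mostly flat, strongly distorted in a few directions" picture, and the two loss ratios show gradient descent on the FIM-defined quadratic contracting below the 2 / lambda_max threshold and blowing up above it. The network here is far from the large-width limit analysed in the paper, so the numbers are only indicative.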

