LG · MATH-PH · ML · Feb 12, 2021

Appearance of Random Matrix Theory in Deep Learning

arXiv:2102.06740v3 · 13 citations
Originality: Incremental advance
AI Analysis

This work helps researchers understand loss surfaces in deep learning, offering insights into Hessian properties and optimization challenges, though it is incremental in that it builds on existing applications of Random Matrix Theory.

The paper finds that the local spectral statistics of loss surface Hessians in neural networks agree with Gaussian Orthogonal Ensemble statistics, pointing to a previously unrecognised role for Random Matrix Theory. It also finds that the exponential hardness of locating the global minimum has practical consequences for achieving state-of-the-art performance.

We investigate the local spectral statistics of the loss surface Hessians of artificial neural networks, where we discover excellent agreement with Gaussian Orthogonal Ensemble statistics across several network architectures and datasets. These results shed new light on the applicability of Random Matrix Theory to modelling neural networks and suggest a previously unrecognised role for it in the study of loss surfaces in deep learning. Inspired by these observations, we propose a novel model for the true loss surfaces of neural networks, consistent with our observations, which allows for Hessian spectral densities with rank degeneracy and outliers, extensively observed in practice, and predicts a growing independence of loss gradients as a function of distance in weight-space. We further investigate the importance of the true loss surface in neural networks and find, in contrast to previous work, that the exponential hardness of locating the global minimum has practical consequences for achieving state-of-the-art performance.
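
To make "local spectral statistics" concrete, here is a minimal sketch of one common diagnostic, the mean ratio of consecutive eigenvalue spacings. It is an illustration under stated assumptions, not the paper's pipeline: the matrices are synthetic GOE samples rather than neural network Hessians, and the helper names `sample_goe` and `mean_spacing_ratio` are hypothetical. The ratio statistic is chosen here because it needs no unfolding of the spectrum. For GOE spectra the mean ratio is roughly 0.53, while for uncorrelated (Poisson) levels it is 2 ln 2 - 1, roughly 0.39.

```python
import numpy as np


def sample_goe(n, rng):
    """Draw an n x n real symmetric Gaussian matrix (GOE up to normalisation)."""
    a = rng.standard_normal((n, n))
    # Symmetrising gives a real symmetric matrix; the scaling only affects
    # the global spectrum, not the spacing ratios computed below.
    return (a + a.T) / np.sqrt(2 * n)


def mean_spacing_ratio(levels):
    """Mean of min(s_i, s_{i+1}) / max(s_i, s_{i+1}) over consecutive spacings."""
    s = np.diff(np.sort(levels))
    r = np.minimum(s[:-1], s[1:]) / np.maximum(s[:-1], s[1:])
    return r.mean()


rng = np.random.default_rng(0)
n, trials = 400, 20

# GOE: eigenvalues repel each other, so small spacing ratios are suppressed.
goe_ratio = np.mean(
    [mean_spacing_ratio(np.linalg.eigvalsh(sample_goe(n, rng))) for _ in range(trials)]
)

# Poisson baseline: independent uniform levels with no repulsion.
poisson_ratio = np.mean(
    [mean_spacing_ratio(rng.uniform(size=n)) for _ in range(trials)]
)

print(f"GOE mean spacing ratio:     {goe_ratio:.3f}")
print(f"Poisson mean spacing ratio: {poisson_ratio:.3f}")
```

To apply the same check to a real network, one would replace `sample_goe` with an explicitly computed Hessian of the loss, which is only feasible for small models or Hessian sub-blocks.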
