PR · LG · ML · Aug 11, 2019

Almost Sure Asymptotic Freeness of Neural Network Jacobian with Orthogonal Weights

arXiv:1908.03901v4
AI Analysis

This work strengthens the theoretical foundation for gradient stability in deep learning, which matters for training efficiency. The contribution is incremental, since it builds on existing free probability theory.

The paper studies the Jacobian spectrum of deep neural networks, which governs whether gradients explode or vanish and how quickly learning proceeds. It rigorously proves almost sure asymptotic freeness of the layer-wise Jacobians in the wide limit under orthogonal weight initialization.

A well-conditioned Jacobian spectrum plays a vital role in preventing exploding or vanishing gradients and in speeding up the learning of deep neural networks. Free probability theory helps us understand and handle the Jacobian spectrum. We rigorously show almost sure asymptotic freeness of the layer-wise Jacobians of deep neural networks in the wide limit. In particular, we treat the case where the weights are initialized as Haar distributed orthogonal matrices.
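To make the objects in the abstract concrete, here is a minimal NumPy sketch of the input-output Jacobian of a fully connected network whose weights are Haar distributed orthogonal matrices. The tanh activation, the width, the depth, and the function names `haar_orthogonal` and `input_output_jacobian` are assumptions chosen for illustration and are not taken from the paper. The sketch only builds the product of layer-wise Jacobians and inspects its singular values; it does not test asymptotic freeness.

```python
import numpy as np


def haar_orthogonal(n, rng):
    """Sample an n x n orthogonal matrix from the Haar measure.

    Uses the QR decomposition of a Gaussian matrix, then fixes the sign
    ambiguity of QR so that the result is Haar distributed.
    """
    z = rng.standard_normal((n, n))
    q, r = np.linalg.qr(z)
    signs = np.sign(np.diag(r))
    return q * signs  # scales column j of q by signs[j]


def input_output_jacobian(x0, depth, rng):
    """Jacobian of x -> tanh(W_L ... tanh(W_1 x)) evaluated at x0.

    Each layer contributes D_l W_l, where W_l is Haar orthogonal and
    D_l = diag(tanh'(h_l)) with pre-activation h_l = W_l x_{l-1}.
    The tanh activation is an illustrative assumption.
    """
    n = x0.shape[0]
    x = x0
    jac = np.eye(n)
    for _ in range(depth):
        w = haar_orthogonal(n, rng)
        h = w @ x
        d = 1.0 - np.tanh(h) ** 2          # diagonal of D_l
        jac = (d[:, None] * w) @ jac        # left-multiply by D_l W_l
        x = np.tanh(h)
    return jac


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    width, depth = 256, 4                   # hypothetical sizes
    x0 = rng.standard_normal(width)
    jac = input_output_jacobian(x0, depth, rng)
    sv = np.linalg.svd(jac, compute_uv=False)
    print("largest singular value:", sv.max())
    print("smallest singular value:", sv.min())
```

Because each `W_l` is orthogonal and the derivative of tanh is at most 1, every singular value of this Jacobian is at most 1. How the singular values spread below that bound as the width grows is the kind of question the free probability analysis addresses.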

