LG · AI · Nov 30, 2021

Embedding Principle: a hierarchical structure of loss landscape of deep neural networks

arXiv:2111.15527v1 · 26 citations
Originality Highly original
AI Analysis

This work gives a theoretical account of optimization in deep learning, addressing why wide neural networks are easier to train. The step is incremental but clarifies a key bottleneck.

The authors prove the Embedding Principle: the loss landscape of a deep neural network contains all critical points of all narrower networks. They also derive a stringent necessary condition for a critical point to stay non-strict-saddle under every critical embedding, which implies that strict-saddle points are common in wide networks and may help explain why such networks are easy to optimize in practice.

We prove a general Embedding Principle of loss landscape of deep neural networks (NNs) that unravels a hierarchical structure of the loss landscape of NNs, i.e., loss landscape of an NN contains all critical points of all the narrower NNs. This result is obtained by constructing a class of critical embeddings which map any critical point of a narrower NN to a critical point of the target NN with the same output function. By discovering a wide class of general compatible critical embeddings, we provide a gross estimate of the dimension of critical submanifolds embedded from critical points of narrower NNs. We further prove an irreversibility property of any critical embedding that the number of negative/zero/positive eigenvalues of the Hessian matrix of a critical point may increase but never decrease as an NN becomes wider through the embedding. Using a special realization of general compatible critical embedding, we prove a stringent necessary condition for being a "truly-bad" critical point that never becomes a strict-saddle point through any critical embedding. This result implies the commonplace of strict-saddle points in wide NNs, which may be an important reason underlying the easy optimization of wide NNs widely observed in practice.
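To make the idea of a critical embedding concrete, here is a minimal sketch in plain Python. It is an illustration under stated assumptions, not the paper's general construction. It assumes a two-layer tanh network with scalar input, a mean-squared-error loss, and a hypothetical neuron-splitting map `split_neuron` that duplicates one hidden neuron and divides its output weight in the ratio alpha : (1 - alpha). Under these assumptions the wider network computes the same output function, and each gradient component of the wide network is a fixed multiple of a gradient component of the narrow one. A zero gradient in the narrow network therefore maps to a zero gradient in the wide one.

```python
import math

def forward(params, x):
    """Two-layer tanh network: f(x) = sum_k a_k * tanh(w_k * x + b_k)."""
    return sum(a * math.tanh(w * x + b) for (a, w, b) in params)

def grad(params, data):
    """Gradient of 0.5 * mean((f(x) - y)^2) with respect to each (a, w, b)."""
    n = len(data)
    errs = [forward(params, x) - y for x, y in data]
    grads = []
    for (a, w, b) in params:
        ga = gw = gb = 0.0
        for (x, _), err in zip(data, errs):
            h = math.tanh(w * x + b)
            dh = 1.0 - h * h          # derivative of tanh at the pre-activation
            ga += err * h / n
            gw += err * a * dh * x / n
            gb += err * a * dh / n
        grads.append((ga, gw, gb))
    return grads

def split_neuron(params, j, alpha):
    """Hypothetical embedding: copy neuron j and share its output weight
    as alpha * a_j (original slot) and (1 - alpha) * a_j (new last slot)."""
    a, w, b = params[j]
    wide = list(params)
    wide[j] = (alpha * a, w, b)
    wide.append(((1.0 - alpha) * a, w, b))
    return wide

# Arbitrary illustrative parameters and data (not taken from the paper).
narrow = [(0.7, -1.2, 0.3), (-0.5, 0.8, -0.1)]
data = [(-1.0, 0.2), (0.0, -0.4), (0.5, 0.9), (2.0, 0.1)]
alpha = 0.3

wide = split_neuron(narrow, j=0, alpha=alpha)
g_narrow = grad(narrow, data)
g_wide = grad(wide, data)

# Same output function on every input.
max_output_gap = max(abs(forward(narrow, x) - forward(wide, x)) for x, _ in data)
```

Comparing `g_wide` with `g_narrow` shows the pattern: the output-weight gradients of both copies equal the narrow neuron's, while the input-weight and bias gradients are scaled by alpha and 1 - alpha. The point used here need not be critical; the linear relation between the two gradients is what carries criticality from the narrow network to the wide one.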

Foundations

The foundational work for this paper's niche, ranked by how specifically the neighbourhood builds on it — not by global fame.
