LG · NE · OC · ML · Feb 27, 2017

Depth Creates No Bad Local Minima

arXiv:1702.08580v2 · 127 citations
AI Analysis

This paper addresses a theoretical problem in deep learning by clarifying the role of depth in optimization landscapes, though the contribution is incremental because it builds on prior work on linear networks.

The paper proves that depth alone, without nonlinearity, does not create bad local minima in deep linear neural networks, showing all local minima are global minima, generalizing previous results with fewer assumptions.

In deep learning, depth, as well as nonlinearity, creates non-convex loss surfaces. Then, does depth alone create bad local minima? In this paper, we prove that without nonlinearity, depth alone does not create bad local minima, although it induces a non-convex loss surface. Using this insight, we greatly simplify a recently proposed proof to show that all of the local minima of feedforward deep linear neural networks are global minima. Our theoretical results generalize previous results with fewer assumptions, and this analysis provides a method to show similar results beyond square loss in deep linear models.
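The claim can be made concrete with a toy case that is not taken from the paper: a depth-2 scalar linear network fitting a single target of 1 under square loss, so the loss is (w1 * w2 - 1)^2. The sketch below, with hypothetical helper names `loss`, `hessian`, and `symmetric_eigenvalues`, checks that this surface is non-convex (the midpoint of two global minima has higher loss) and that the critical point at the origin is a saddle rather than a bad local minimum.

```python
import math


def loss(w1, w2, target=1.0):
    """Square loss of a depth-2 scalar linear network with input 1."""
    return (w2 * w1 - target) ** 2


def hessian(w1, w2, target=1.0):
    """Analytic 2x2 Hessian of loss() with respect to (w1, w2)."""
    h11 = 2.0 * w2 ** 2
    h22 = 2.0 * w1 ** 2
    h12 = 4.0 * w1 * w2 - 2.0 * target
    return [[h11, h12], [h12, h22]]


def symmetric_eigenvalues(h):
    """Eigenvalues (smallest, largest) of a symmetric 2x2 matrix."""
    a, b, c = h[0][0], h[0][1], h[1][1]
    mean = (a + c) / 2.0
    radius = math.sqrt(((a - c) / 2.0) ** 2 + b ** 2)
    return mean - radius, mean + radius


# Two distinct global minima, both with zero loss.
min_a = (1.0, 1.0)
min_b = (-1.0, -1.0)

# Their midpoint is the origin. A convex function could not be larger
# there than at both endpoints, so a positive loss shows non-convexity.
midpoint = ((min_a[0] + min_b[0]) / 2.0, (min_a[1] + min_b[1]) / 2.0)
midpoint_loss = loss(*midpoint)

# The origin is a critical point (the gradient vanishes there), but the
# Hessian has one negative and one positive eigenvalue: a saddle.
low, high = symmetric_eigenvalues(hessian(*midpoint))

print("loss at minima:", loss(*min_a), loss(*min_b))
print("loss at midpoint:", midpoint_loss)
print("Hessian eigenvalues at origin:", low, high)
```

In this toy case the only critical point that is not a global minimum is a saddle, which is consistent with the abstract's statement; it is an illustration of the phenomenon, not a substitute for the paper's proof.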

