LG · CO · ML · May 31, 2023

Mildly Overparameterized ReLU Networks Have a Favorable Loss Landscape

arXiv:2305.19510v3 · 12 citations
Originality: Highly original
AI Analysis

This paper helps researchers understand the optimization challenges of neural networks by giving theoretical insight into why overparameterization can lead to favorable loss landscapes, though it builds incrementally on existing loss landscape studies.

The paper investigates the loss landscape of mildly overparameterized ReLU neural networks, showing that most activation patterns lead to parameter regions without bad local minima, and for one-dimensional data, many regions contain high-dimensional sets of global minima. Experimental results confirm a phase transition in Jacobian rank based on overparameterization levels.
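The link between Jacobian rank and the absence of bad local minima can be sketched with the standard least-squares argument. The notation here is an assumption made for illustration and is not taken from this page: $F(\theta) \in \mathbb{R}^n$ stacks the network outputs on the $n$ data points, $y$ holds the targets, and $p$ is the number of parameters.

```latex
L(\theta) = \tfrac{1}{2}\,\lVert F(\theta) - y \rVert_2^2,
\qquad
\nabla L(\theta) = J(\theta)^{\top}\bigl(F(\theta) - y\bigr),
\qquad
J(\theta) = \frac{\partial F}{\partial \theta} \in \mathbb{R}^{n \times p}.

\operatorname{rank} J(\theta) = n
\;\Longrightarrow\;
\bigl(\nabla L(\theta) = 0 \;\Rightarrow\; F(\theta) = y\bigr).
```

If $J(\theta)$ has rank $n$, then $J(\theta)^{\top}$ has a trivial kernel, so any differentiable critical point with full-rank Jacobian already fits the data exactly and is a global minimum.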

We study the loss landscape of both shallow and deep, mildly overparameterized ReLU neural networks on a generic finite input dataset for the squared error loss. We show both by count and volume that most activation patterns correspond to parameter regions with no bad local minima. Furthermore, for one-dimensional input data, we show most activation regions realizable by the network contain a high dimensional set of global minima and no bad local minima. We experimentally confirm these results by finding a phase transition from most regions having full rank Jacobian to many regions having deficient rank depending on the amount of overparameterization.
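The kind of rank experiment described above can be illustrated with a minimal NumPy sketch. This is not the authors' code. The shallow architecture with biases and trainable output weights, the Gaussian sampling of data and parameters, and the function names `jacobian_shallow_relu` and `full_rank_fraction` are all assumptions made for illustration.

```python
import numpy as np

def jacobian_shallow_relu(W, b, v, X):
    """Jacobian of the outputs on X with respect to (W, b, v).

    Assumed model: f(x) = sum_i v[i] * relu(W[i] @ x + b[i]).
    Returns an array of shape (n, d1 * d0 + 2 * d1).
    """
    pre = X @ W.T + b                      # pre-activations, shape (n, d1)
    act = (pre > 0).astype(float)          # activation pattern, shape (n, d1)
    # d f(x_k) / d W[i, j] = v[i] * act[k, i] * X[k, j]
    dW = (act * v)[:, :, None] * X[:, None, :]   # shape (n, d1, d0)
    db = act * v                           # d f / d b[i]
    dv = act * pre                         # d f / d v[i] = relu(pre)
    n = X.shape[0]
    return np.concatenate([dW.reshape(n, -1), db, dv], axis=1)

def full_rank_fraction(n, d0, d1, trials=200, seed=0):
    """Fraction of randomly sampled parameters whose Jacobian has rank n."""
    rng = np.random.default_rng(seed)
    X = rng.standard_normal((n, d0))       # generic data (assumed Gaussian)
    hits = 0
    for _ in range(trials):
        W = rng.standard_normal((d1, d0))
        b = rng.standard_normal(d1)
        v = rng.standard_normal(d1)
        J = jacobian_shallow_relu(W, b, v, X)
        hits += int(np.linalg.matrix_rank(J) == n)
    return hits / trials

if __name__ == "__main__":
    # Sweep the hidden width for a fixed dataset size and input dimension.
    for d1 in (2, 5, 10, 20, 50):
        print(d1, full_rank_fraction(n=20, d0=2, d1=d1))
```

Sampling parameters at random probes activation regions in proportion to their volume, which is only one of the two notions ("by count and volume") mentioned in the abstract. When the parameter count `d1 * (d0 + 2)` is below `n`, the Jacobian cannot have rank `n`, so the reported fraction is zero by construction.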

