LG CO ML | May 31, 2023

Mildly Overparameterized ReLU Networks Have a Favorable Loss Landscape

Kedar Karhadkar, Michael Murray, Hanna Tseran, Guido Montúfar

arXiv:2305.19510v313.712 citations

Originality: Highly original

AI Analysis

The paper addresses optimization challenges in neural networks, giving researchers theoretical insight into why overparameterization can lead to favorable loss landscapes, though it builds incrementally on existing landscape studies.

The paper investigates the loss landscape of mildly overparameterized ReLU neural networks, showing that most activation patterns lead to parameter regions without bad local minima, and for one-dimensional data, many regions contain high-dimensional sets of global minima. Experimental results confirm a phase transition in Jacobian rank based on overparameterization levels.

We study the loss landscape of both shallow and deep, mildly overparameterized ReLU neural networks on a generic finite input dataset for the squared error loss. We show both by count and volume that most activation patterns correspond to parameter regions with no bad local minima. Furthermore, for one-dimensional input data, we show most activation regions realizable by the network contain a high dimensional set of global minima and no bad local minima. We experimentally confirm these results by finding a phase transition from most regions having full rank Jacobian to many regions having deficient rank depending on the amount of overparameterization.
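The Jacobian-rank experiment mentioned in the abstract can be illustrated with a small sketch. It is not the authors' code and rests on simplifying assumptions chosen here: a shallow network f(x) = sum_j v[j] * relu(W[:, j] @ x) without biases, output weights v fixed to random signs, Gaussian data and parameters, and a Jacobian taken with respect to the first-layer weights W only. The function names are hypothetical. The link to the loss landscape is that the gradient of the squared error is J^T (F - y), so wherever J has rank n (the number of data points), a critical point must satisfy F = y and is therefore a global minimum.

```python
import numpy as np

def jacobian_first_layer(X, W, v):
    """Jacobian of the outputs (f(x_1), ..., f(x_n)) with respect to W.

    Assumed model: f(x) = sum_j v[j] * relu(W[:, j] @ x).
    Rows index data points; columns index the d * m entries of W.
    """
    n, d = X.shape
    m = W.shape[1]
    A = (X @ W > 0).astype(float)            # n x m activation pattern
    # J[i, j, :] = v[j] * A[i, j] * X[i, :]
    J = (A * v)[:, :, None] * X[:, None, :]  # n x m x d
    return J.reshape(n, m * d)

def fraction_full_rank(n, d, m, trials=200, seed=0):
    """Fraction of random parameter draws whose Jacobian has rank n."""
    rng = np.random.default_rng(seed)
    X = rng.standard_normal((n, d))          # generic dataset (assumption)
    hits = 0
    for _ in range(trials):
        W = rng.standard_normal((d, m))      # random first-layer weights
        v = rng.choice([-1.0, 1.0], size=m)  # fixed random-sign output weights
        J = jacobian_first_layer(X, W, v)
        hits += np.linalg.matrix_rank(J) == n
    return hits / trials

if __name__ == "__main__":
    n, d = 20, 3
    for m in (2, 5, 10, 20, 40):
        print(f"width m={m:3d}  params={m * d:4d}  "
              f"full-rank fraction={fraction_full_rank(n, d, m):.2f}")
```

When m * d < n the Jacobian has fewer than n columns, so it cannot have rank n and the reported fraction is zero. Sweeping the width m upward from there is one way to look for the kind of transition the abstract describes, under the assumptions above.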
