Understanding Global Loss Landscape of One-hidden-layer ReLU Networks, Part 1: Theory
This paper provides theoretical insight into the loss landscape of one-hidden-layer ReLU networks, which is foundational for understanding optimization challenges in machine learning.
The paper proves that, for one-hidden-layer ReLU networks, every differentiable local minimum is a global minimum within its differentiable region. It characterizes the locations and losses of these minima and shows that they can be isolated points or continuous hyperplanes, depending on the data, the activation pattern of the hidden neurons, and the network size.
For one-hidden-layer ReLU networks, we prove that all differentiable local minima are global inside differentiable regions. We give the locations and losses of differentiable local minima, and show that these local minima can be isolated points or continuous hyperplanes, depending on the interplay among the data, the activation pattern of the hidden neurons, and the network size. Furthermore, we give necessary and sufficient conditions for the existence of saddle points as well as non-differentiable local minima, and their locations if they exist.
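One way to build intuition for the first claim is the sketch below. It is an illustration under stated assumptions, not the paper's construction: it assumes a bias-free network f(x) = sum_k z_k ReLU(w_k . x) trained with squared loss, neither of which is spelled out in the abstract, and the function names are hypothetical. Under those assumptions, once the activation pattern is held fixed the output is linear in the products R_k = z_k w_k, so the loss restricted to that region is an ordinary least-squares problem in R, and any stationary point there is a global minimizer of the restricted problem.

```python
import numpy as np


def activation_pattern(W, X):
    """0/1 matrix: entry (i, k) is 1 when hidden neuron k is active on sample i."""
    return (X @ W.T > 0).astype(float)


def region_design_matrix(X, pattern):
    """Row i is the concatenation over k of pattern[i, k] * x_i."""
    n, d = X.shape
    K = pattern.shape[1]
    return (pattern[:, :, None] * X[:, None, :]).reshape(n, K * d)


def region_least_squares(X, y, pattern):
    """Minimize the squared loss over R_k = z_k * w_k with the pattern held fixed.

    Assumption (not from the source): squared loss and a bias-free network.
    """
    A = region_design_matrix(X, pattern)
    R, *_ = np.linalg.lstsq(A, y, rcond=None)  # minimum-norm least-squares solution
    loss = 0.5 * np.sum((A @ R - y) ** 2)
    return R.reshape(pattern.shape[1], X.shape[1]), loss


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    X = rng.normal(size=(20, 3))   # synthetic inputs, for illustration only
    y = rng.normal(size=20)        # synthetic targets
    W = rng.normal(size=(4, 3))    # hidden weights that select one region
    P = activation_pattern(W, X)
    R_star, loss_star = region_least_squares(X, y, P)
    rank = np.linalg.matrix_rank(region_design_matrix(X, P))
    print("restricted loss:", loss_star)
    print("design-matrix rank:", rank, "of", R_star.size, "unknowns")
```

Two caveats apply. The least-squares solution only corresponds to a differentiable local minimum of the network if some weights realizing R_star reproduce the same activation pattern, which this sketch does not check. And when the design matrix is rank-deficient the restricted problem has a continuum of minimizers in R rather than a single point, which loosely mirrors the point-versus-hyperplane distinction in the abstract without being the paper's exact statement.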