A Geometric Characterization of the Stationary Plateau for Two-Layer Neural Networks
This work provides theoretical insight into how width expansion affects the loss landscape, which is relevant to understanding optimization and overparameterization in neural networks.
The paper characterizes the geometric structure of stationary plateaus in two-layer neural networks, showing that splitting a local minimum can yield either a mixture of minima and saddles or all saddles, while splitting a saddle always produces saddles. The analysis uses a per-neuron 'inner Hessian' to classify stationary points.
We investigate the geometric structure of stationary plateaus that arise in the loss landscape of two-layer neural networks with smooth activation functions. We focus on "neuron splitting," in which duplicating a hidden neuron yields an affine set of stationary points in a wider network. We classify all stationary points on these plateaus, determining the conditions under which they are local minima or saddle points. Our characterization hinges on a per-neuron curvature object that we term the "inner Hessian." The definiteness of the inner Hessian and the choice of splitting coefficients jointly dictate the local geometry of the plateau. We show that splitting a local minimum can yield either a mixture of local minima and saddles or an all-saddle plateau, and we identify a concrete sure-saddle region under mild assumptions. In contrast, splitting a saddle point always produces a plateau of saddle points. Our results unify and extend prior landscape analyses, elucidating when and how model expansion preserves or alters the nature of stationary points. These findings offer new geometric insights into the effects of width expansion and reparameterization in neural networks.
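To make the neuron-splitting construction concrete, the following is a minimal NumPy sketch, not the paper's own code. It assumes a bias-free network f(x) = sum_i a_i tanh(w_i . x), a mean squared loss, and a split that keeps the incoming weights w_j in both copies while dividing the outgoing weight into lam * a_j and (1 - lam) * a_j. The function names (`forward`, `grads`, `split_neuron`) and the choice of tanh are illustrative assumptions; the paper's parameterization may differ.

```python
import numpy as np

def forward(W, a, X):
    """Assumed model: f(x) = sum_i a_i * tanh(w_i . x).
    W is (m, d), a is (m,), X is (n, d); returns (n,)."""
    return np.tanh(X @ W.T) @ a

def grads(W, a, X, y):
    """Gradients of the assumed loss 0.5 * mean((f(x) - y)^2) w.r.t. W and a."""
    H = np.tanh(X @ W.T)                  # hidden activations, (n, m)
    r = (H @ a - y) / len(y)              # scaled residuals, (n,)
    grad_a = H.T @ r                      # (m,)
    # dL/dw_i = sum_n r_n * a_i * (1 - H_ni^2) * x_n
    grad_W = ((r[:, None] * (1.0 - H**2)) * a).T @ X   # (m, d)
    return grad_W, grad_a

def split_neuron(W, a, j, lam):
    """Hypothetical helper: duplicate neuron j. Both copies keep w_j;
    the outgoing weights become lam * a_j and (1 - lam) * a_j."""
    W_new = np.vstack([W, W[j]])          # new copy appended as the last row
    a_new = np.append(a, (1.0 - lam) * a[j])
    a_new[j] = lam * a[j]
    return W_new, a_new

# Small demonstration on random data (values are arbitrary).
rng = np.random.default_rng(0)
X, y = rng.normal(size=(20, 3)), rng.normal(size=20)
W, a = rng.normal(size=(4, 3)), rng.normal(size=4)

j, lam = 1, 0.3
W2, a2 = split_neuron(W, a, j, lam)
gW, ga = grads(W, a, X, y)
gW2, ga2 = grads(W2, a2, X, y)

# The wider network computes the same function for every lam.
same_output = np.allclose(forward(W, a, X), forward(W2, a2, X))
# Gradients w.r.t. the copies' incoming weights are scaled copies of the original.
scaled_grads = (np.allclose(gW2[j], lam * gW[j])
                and np.allclose(gW2[-1], (1.0 - lam) * gW[j]))
print(same_output, scaled_grads)
```

Under these assumptions, the gradients with respect to the two copies are the original neuron's gradients, rescaled by lam and 1 - lam for the incoming weights and unchanged for the outgoing weights. If the narrower network sits at a stationary point, every choice of lam therefore gives a stationary point of the wider network, which is the line of stationary points parameterized by the splitting coefficient. Whether such a point is a minimum or a saddle depends on second-order information, which this sketch does not examine.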