Algebraic Approach to Ridge-Regularized Mean Squared Error Minimization in Minimal ReLU Neural Network
This provides a theoretical proof-of-concept for exhaustively analyzing optimization landscapes in simple neural networks, but it is incremental and limited to minimal cases.
The paper tackles the problem of finding all local minima in ridge-regularized mean squared error for ReLU perceptrons by using an algebraic approach, resulting in a method that can identify both isolated points and higher-dimensional connected sets as minima, demonstrated on minimal perceptrons with a few hidden units.
This paper investigates a perceptron, a simple neural network model, with ReLU activation and a ridge-regularized mean squared error (RR-MSE). Our approach leverages the fact that the RR-MSE for ReLU perceptron is piecewise polynomial, enabling a systematic analysis using tools from computational algebra. In particular, we develop a Divide-Enumerate-Merge strategy that exhaustively enumerates all local minima of the RR-MSE. By virtue of the algebraic formulation, our approach can identify not only the typical zero-dimensional minima (i.e., isolated points) obtained by numerical optimization, but also higher-dimensional minima (i.e., connected sets such as curves, surfaces, or hypersurfaces). Although computational algebraic methods are computationally very intensive for perceptrons of practical size, as a proof of concept, we apply the proposed approach in practice to minimal perceptrons with a few hidden units.