OC LG NA ML | Feb 28, 2023

On the existence of minimizers in shallow residual ReLU neural network optimization landscapes

Steffen Dereich, Arnulf Jentzen, Sebastian Kassing

arXiv:2302.14690v2 | 10.19 citations | h-index: 50

Originality: Incremental advance

AI Analysis

The paper addresses a theoretical gap in neural network optimization: it provides existence guarantees for minimizers that can inform training and analysis. The advance is incremental because it builds on prior work.

The paper proves that minimizers exist in the loss landscape for shallow residual ReLU neural networks, in contrast to earlier results showing non-existence for common smooth activation functions even with polynomial targets.

In this article, we show existence of minimizers in the loss landscape for residual artificial neural networks (ANNs) with multi-dimensional input layer and one hidden layer with ReLU activation. Our work contrasts earlier results in [D. Gallon, A. Jentzen, and F. Lindner, preprint, arXiv:2211.15641, 2022] and [P. Petersen, M. Raslan, and F. Voigtlaender, Found. Comput. Math., 21 (2021), pp. 375-444] which showed that in many situations minimizers do not exist for common smooth activation functions even in the case where the target functions are polynomials. The proof of the existence property makes use of a closure of the search space containing all functions generated by ANNs and additional discontinuous generalized responses. As we will show, the additional generalized responses in this larger space are suboptimal so that the minimum is attained in the original function class.
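
To make the role of the discontinuous generalized responses more concrete, here is a minimal NumPy sketch. It assumes a shallow residual ReLU network has the form of an affine skip term plus one hidden ReLU layer; the function names, argument names, and this exact parameterization are illustrative assumptions, not taken from the paper. The sketch shows one way a discontinuous limit can arise: a two-neuron network whose weights grow with n converges pointwise to a step function, which no network with finite weights can represent. It does not reproduce the paper's proof.

```python
import numpy as np

def relu(z):
    return np.maximum(z, 0.0)

def shallow_residual_relu(x, A, c, V, b, w):
    """Affine skip term plus one hidden ReLU layer (assumed parameterization).

    x: (n, d_in) inputs, A: (d_in,) skip weights, c: scalar skip bias,
    V: (width, d_in) inner weights, b: (width,) inner biases,
    w: (width,) outer weights.
    """
    hidden = relu(x @ V.T + b)      # shape (n, width)
    return x @ A + c + hidden @ w   # shape (n,)

def steep_ramp(x, n):
    """Two-neuron network n*relu(x) - n*relu(x - 1/n) on 1-d inputs.

    It is 0 for x <= 0, 1 for x >= 1/n, and linear in between, so as n grows
    it approaches a step function while the outer weights diverge.
    """
    V = np.array([[1.0], [1.0]])
    b = np.array([0.0, -1.0 / n])
    w = np.array([float(n), -float(n)])
    return shallow_residual_relu(x, np.zeros(1), 0.0, V, b, w)

x = np.array([[-0.5], [0.0], [0.25], [1.0]])
for n in (1, 10, 1000):
    print(n, steep_ramp(x, n))
```

In this illustration every finite n gives a continuous network response, yet the pointwise limit jumps from 0 to 1 at the origin. Limits of this kind belong to a closure of the function class but not to the class itself, which is why an existence proof has to rule them out as optimal.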
