On the existence of optimal shallow feedforward networks with ReLU activation
This paper addresses a foundational theoretical question in machine learning: whether the optimization problem for shallow ReLU networks admits a global minimizer at all. The contribution is incremental but important for understanding how the loss landscape of such networks behaves.
The paper proves the existence of global minima in the loss landscape for approximating continuous functions using shallow feedforward neural networks with ReLU activation, establishing that optimal solutions are representable by ReLU networks under mild assumptions.
We prove existence of global minima in the loss landscape for the approximation of continuous target functions using shallow feedforward artificial neural networks with ReLU activation. This property is one of the fundamental artifacts separating ReLU from other commonly used activation functions. We propose a kind of closure of the search space so that in the extended space minimizers exist. In a second step, we show under mild assumptions that the newly added functions in the extension perform worse than appropriate representable ReLU networks. This then implies that the optimal response in the extended target space is indeed the response of a ReLU network.
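The setting of the abstract can be illustrated with a small sketch. It is not the paper's construction: it assumes one-dimensional input, replaces the loss by a mean squared error on a finite grid, and uses hypothetical helper names (`shallow_relu`, `grid_loss`, `steep_pair`). It shows a continuous target that a two-neuron ReLU network represents exactly, and a sequence of networks with growing weights whose responses approach a discontinuous step, which is the kind of limit a closure of the search space would have to account for.

```python
def relu(z):
    """ReLU activation: max(0, z)."""
    return max(0.0, z)

def shallow_relu(x, hidden, out_bias=0.0):
    """Response of a one-hidden-layer ReLU network at a scalar input x.

    hidden is a list of (a, b, w) triples, one per neuron, giving
    out_bias + sum_j w_j * relu(a_j * x + b_j).
    """
    return out_bias + sum(w * relu(a * x + b) for a, b, w in hidden)

def grid_loss(hidden, target, xs, out_bias=0.0):
    """Mean squared error on a grid (an assumed stand-in for the paper's loss)."""
    errors = [(shallow_relu(x, hidden, out_bias) - target(x)) ** 2 for x in xs]
    return sum(errors) / len(errors)

# Grid on [-1, 1].
xs = [i / 50 - 1.0 for i in range(101)]

# |x| = relu(x) + relu(-x): a continuous target with an exact ReLU minimizer.
abs_net = [(1.0, 0.0, 1.0), (-1.0, 0.0, 1.0)]
exact_loss = grid_loss(abs_net, abs, xs)

def steep_pair(n):
    """Two neurons whose response is relu(n*x + 1) - relu(n*x).

    The response is 0 for x <= -1/n and 1 for x >= 0, so as n grows the
    weights diverge while the response approaches a discontinuous step.
    """
    return [(n, 1.0, 1.0), (n, 0.0, -1.0)]

left_value = shallow_relu(-0.5, steep_pair(1000.0))
right_value = shallow_relu(0.5, steep_pair(1000.0))
```

In this toy example the step-like limit is not itself the response of any finite ReLU network, since every such response is continuous. The abstract's second step argues that, under mild assumptions, limits of this kind perform worse than suitable representable networks.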