Optimal Function Approximation with Relu Neural Networks
This work addresses a foundational problem in neural network theory for researchers in approximation theory and machine learning, but it is incremental as it builds on existing knowledge of ReLU networks and convex functions.
The paper tackles the problem of optimally approximating convex univariate functions using ReLU neural networks by determining the minimal error given a number of linear pieces, establishing necessary and sufficient conditions, uniqueness, and bounds for these approximations, and presenting architectures and an algorithm with proven convergence and experimental validation.
We consider in this paper the optimal approximations of convex univariate functions with feed-forward Relu neural networks. We are interested in the following question: what is the minimal approximation error given the number of approximating linear pieces? We establish the necessary and sufficient conditions and uniqueness of optimal approximations, and give lower and upper bounds of the optimal approximation errors. Relu neural network architectures are then presented to generate these optimal approximations. Finally, we propose an algorithm to find the optimal approximations, as well as prove its convergence and validate it with experimental results.