Quantified advantage of discontinuous weight selection in approximations with deep neural networks
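This work quantifies a theoretical limitation in neural network approximation theory, namely the cost of requiring network weights to depend continuously on the target function. It is an incremental result, of interest mainly to researchers in the mathematical analysis of deep learning.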
The paper studies the approximation of 1D Lipschitz functions by deep ReLU networks of fixed width. It shows that allowing discontinuous weight selection reduces the uniform approximation error by at least a factor logarithmic in the network size, compared with continuous weight selection.
We consider approximations of 1D Lipschitz functions by deep ReLU networks of a fixed width. We prove that, without the assumption of continuous weight selection, the uniform approximation error is lower than with this assumption by at least a factor logarithmic in the size of the network.
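
To make the phrase "continuous weight selection" concrete, here is a minimal Python sketch of one such scheme. It is not the construction from the paper: it uses a shallow (one-hidden-layer) ReLU network rather than a deep fixed-width one, and the names `fit_relu_interpolant`, `evaluate`, `uniform_error`, and `target` are hypothetical, chosen only for illustration. The network reproduces the piecewise-linear interpolant of f on a uniform grid of [0, 1], so every weight is a linear, and therefore continuous, function of the samples of f. For an L-Lipschitz f and n hidden units, the uniform error of this interpolant is at most L/(2n), that is, inversely proportional to the network size.

```python
import math


def relu(z):
    return z if z > 0.0 else 0.0


def fit_relu_interpolant(f, n_pieces):
    """Return (bias, knots, coeffs) of a one-hidden-layer ReLU net on [0, 1].

    The net computes bias + sum_k coeffs[k] * relu(x - knots[k]) and equals
    the piecewise-linear interpolant of f on a uniform grid. Each weight is
    a linear (hence continuous) function of the sampled values of f.
    """
    h = 1.0 / n_pieces
    grid = [k * h for k in range(n_pieces + 1)]
    values = [f(x) for x in grid]
    slopes = [(values[k + 1] - values[k]) / h for k in range(n_pieces)]
    # The output weight of hidden unit k is the change of slope at knot k.
    coeffs = [slopes[0]] + [slopes[k] - slopes[k - 1] for k in range(1, n_pieces)]
    return values[0], grid[:-1], coeffs


def evaluate(net, x):
    bias, knots, coeffs = net
    return bias + sum(c * relu(x - t) for c, t in zip(coeffs, knots))


def uniform_error(f, net, n_test=2001):
    """Estimate the sup-norm error on [0, 1] from a fine test grid."""
    xs = [i / (n_test - 1) for i in range(n_test)]
    return max(abs(f(x) - evaluate(net, x)) for x in xs)


def target(x):
    # Hypothetical 1-Lipschitz test function (an assumption for this demo):
    # each term has Lipschitz constant at most 0.5.
    return 0.25 * math.sin(2.0 * x) + 0.5 * abs(x - 0.3)


if __name__ == "__main__":
    for n in (8, 16, 32, 64):
        net = fit_relu_interpolant(target, n)
        print(n, uniform_error(target, net), "bound:", 0.5 / n)
```

Running the script prints, for each network size, the measured uniform error next to the bound 1/(2n). The abstract's claim is that, for deep fixed-width ReLU networks, dropping the continuity requirement on the map from f to the weights improves on what continuous selection can achieve by at least a logarithmic factor in the network size; the sketch above only illustrates the continuous side of that comparison.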