On the Optimal Expressive Power of ReLU DNNs and Its Application in Approximation with Kolmogorov Superposition Theorem
This work addresses foundational theoretical limits in neural network design for researchers in machine learning and approximation theory, providing optimal bounds and improved approximation rates.
This paper tackles the problem of understanding the expressive power of ReLU deep neural networks by proving that any continuous piecewise linear function with O(N^2L) segments can be represented by such networks with L hidden layers and N neurons per layer, and shows this construction is optimal in parameter count. It also applies the Kolmogorov Superposition Theorem to achieve an enhanced approximation rate for continuous functions in high-dimensional spaces.
This paper is devoted to studying the optimal expressive power of ReLU deep neural networks (DNNs) and its application in approximation via the Kolmogorov Superposition Theorem. We first constructively prove that any continuous piecewise linear functions on $[0,1]$, comprising $O(N^2L)$ segments, can be represented by ReLU DNNs with $L$ hidden layers and $N$ neurons per layer. Subsequently, we demonstrate that this construction is optimal regarding the parameter count of the DNNs, achieved through investigating the shattering capacity of ReLU DNNs. Moreover, by invoking the Kolmogorov Superposition Theorem, we achieve an enhanced approximation rate for ReLU DNNs of arbitrary width and depth when dealing with continuous functions in high-dimensional spaces.