Approximation in $L^p(μ)$ with deep ReLU neural networks
This work provides a theoretical extension of approximation results in $L^p$ spaces that is relevant for machine learning practitioners working with probability measures; it is incremental, however, as it builds on prior fixed-depth results.
The paper generalizes existing approximation results for deep ReLU neural networks with fixed depth from the Lebesgue measure to any finite Borel measure, which makes them applicable in statistical learning theory, where the measure describes the distribution of the data.
We discuss the expressive power of neural networks which use the non-smooth ReLU activation function $\varrho(x) = \max\{0,x\}$ by analyzing the approximation-theoretic properties of such networks. The existing results mainly fall into two categories: approximation using ReLU networks with a fixed depth, or using ReLU networks whose depth increases with the approximation accuracy. After reviewing these findings, we show that the results concerning networks with fixed depth, which up to now have only considered approximation in $L^p(λ)$ for the Lebesgue measure $λ$, can be generalized to approximation in $L^p(μ)$ for any finite Borel measure $μ$. In particular, the generalized results apply in the usual setting of statistical learning theory, where one is interested in approximation in $L^2(\mathbb{P})$, with the probability measure $\mathbb{P}$ describing the distribution of the data.
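To make the notion of approximation in $L^2(\mathbb{P})$ concrete, the following is a minimal sketch and not the construction used in the paper. It assumes a one-dimensional target $f(x) = x^2$ on $[0,1]$, a one-hidden-layer ReLU network that interpolates $f$ at equispaced knots, and a Beta(2, 5) distribution standing in for the data distribution $\mathbb{P}$. The helper names `relu_interpolant`, `lp_error`, and `beta_sampler` are hypothetical and chosen only for illustration; the $L^p(\mathbb{P})$ error is estimated by Monte Carlo sampling from $\mathbb{P}$.

```python
import random


def relu(x):
    """ReLU activation: max{0, x}."""
    return max(0.0, x)


def relu_interpolant(f, n):
    """One-hidden-layer ReLU network that interpolates f at n + 1
    equispaced knots on [0, 1] (illustrative assumption, not the
    paper's construction)."""
    knots = [k / n for k in range(n + 1)]
    # Slope of the piecewise linear interpolant on each subinterval.
    slopes = [(f(knots[k + 1]) - f(knots[k])) * n for k in range(n)]
    # The output weight of neuron k is the change in slope at knot k.
    weights = [slopes[0]] + [slopes[k] - slopes[k - 1] for k in range(1, n)]
    bias = f(0.0)

    def g(x):
        # zip stops after n pairs, so the last knot (x = 1) carries no neuron.
        return bias + sum(w * relu(x - t) for w, t in zip(weights, knots))

    return g


def lp_error(f, g, sampler, p=2, num_samples=20000, seed=0):
    """Monte Carlo estimate of the L^p(P) distance between f and g,
    where sampler(rng) draws one sample from P."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(num_samples):
        x = sampler(rng)
        total += abs(f(x) - g(x)) ** p
    return (total / num_samples) ** (1.0 / p)


# Assumed target function and assumed data distribution P = Beta(2, 5).
f = lambda x: x * x
beta_sampler = lambda rng: rng.betavariate(2.0, 5.0)

for n in (4, 16, 64):
    print(n, lp_error(f, relu_interpolant(f, n), beta_sampler))
```

In this sketch the estimated error shrinks as the number of hidden neurons grows. Replacing `beta_sampler` with a uniform sampler on $[0,1]$ recovers the Lebesgue setting, which illustrates that only the measure used to weight the error changes, not the network.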