On best approximation by multivariate ridge functions with applications to generalized translation networks
This work establishes sharp theoretical bounds in approximation theory that are relevant to machine learning, particularly neural networks. Its contribution is incremental in that it generalizes known results for univariate ridge functions to the multivariate case.
The paper studies the approximation of Sobolev functions by sums of $n$ multivariate ridge functions. It proves sharp upper and lower bounds showing that the approximation error behaves asymptotically as $n^{-r/(d-\ell)}$, where $r$ is the regularity of the target function, $d$ is the input dimension, and $\ell$ is the dimension of the domain of each ridge profile. It then applies these results to derive sharp bounds for generalized translation networks and complex-valued neural networks.
In this paper, we prove sharp upper and lower bounds for the approximation of Sobolev functions by sums of multivariate ridge functions, i.e., for approximation by functions of the form $\mathbb{R}^d \ni x \mapsto \sum_{k=1}^n \varrho_k(A_k x) \in \mathbb{R}$ with $\varrho_k : \mathbb{R}^\ell \to \mathbb{R}$ and $A_k \in \mathbb{R}^{\ell \times d}$. We show that the order of approximation asymptotically behaves as $n^{-r/(d-\ell)}$, where $r$ is the regularity (order of differentiability) of the Sobolev functions to be approximated. Our lower bound even holds when approximating $L^\infty$-Sobolev functions of regularity $r$ with error measured in $L^1$, while our upper bound applies to the approximation of $L^p$-Sobolev functions in $L^p$ for any $1 \leq p \leq \infty$. These bounds generalize well-known results regarding the approximation properties of univariate ridge functions to the multivariate case. We use our results to obtain sharp asymptotic bounds for the approximation of Sobolev functions using generalized translation networks and complex-valued neural networks.
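To make the approximating class concrete, the following minimal sketch evaluates a sum of multivariate ridge functions $x \mapsto \sum_{k=1}^n \varrho_k(A_k x)$ and computes the exponent $r/(d-\ell)$ from the stated rate. It is an illustration only, not code from the paper: the names `ridge_sum` and `rate_exponent`, the random matrices, and the sine/cosine profiles are assumptions chosen for the example.

```python
import numpy as np


def ridge_sum(x, mats, profiles):
    """Evaluate sum_k rho_k(A_k x) at m points x of shape (m, d).

    mats:     list of n arrays A_k of shape (ell, d)
    profiles: list of n callables rho_k mapping (m, ell) arrays to (m,) arrays
    """
    x = np.atleast_2d(x)
    total = np.zeros(x.shape[0])
    for A, rho in zip(mats, profiles):
        z = x @ A.T        # project the inputs to R^ell, shape (m, ell)
        total += rho(z)    # apply the ell-variate profile rho_k
    return total


def rate_exponent(r, d, ell):
    """Exponent r / (d - ell) in the rate n^{-r/(d-ell)} (requires 0 < ell < d)."""
    if not 0 < ell < d:
        raise ValueError("expected 0 < ell < d")
    return r / (d - ell)


# Hypothetical example: d = 5 inputs, ell = 2 dimensional profiles, n = 3 terms.
rng = np.random.default_rng(0)
d, ell, n = 5, 2, 3
mats = [rng.standard_normal((ell, d)) for _ in range(n)]
# Illustrative smooth profiles; k=k binds the loop variable in each lambda.
profiles = [
    lambda z, k=k: np.sin((k + 1) * z[:, 0]) * np.cos(z[:, 1]) for k in range(n)
]

x = rng.standard_normal((4, d))
values = ridge_sum(x, mats, profiles)   # one value per input point
exponent = rate_exponent(r=2, d=d, ell=ell)
```

Setting `ell=1` in `rate_exponent` gives the exponent $r/(d-1)$, the univariate ridge function case that the abstract describes as being generalized.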