Approximation of Smoothness Classes by Deep Rectifier Networks
The work provides theoretical guarantees for deep learning models approximating smooth functions. The contribution is incremental in that it extends existing approximation theory to broader smoothness classes.
The paper tackles the problem of approximating functions in Besov spaces using deep rectifier networks, showing that these networks achieve optimal or near-optimal approximation rates for arbitrary smoothness orders and dimensions.
We consider approximation rates of sparsely connected deep rectified linear unit (ReLU) and rectified power unit (RePU) neural networks for functions in Besov spaces $B^α_{q}(L^p)$ in arbitrary dimension $d$, on general domains. We show that deep rectifier networks with a fixed activation function attain optimal or near-optimal approximation rates for functions in the Besov space $B^α_{τ}(L^τ)$ on the critical embedding line $1/τ=α/d+1/p$ for \emph{arbitrary} smoothness order $α>0$. Using interpolation theory, this implies that the entire range of smoothness classes at or above the critical line is (near-)optimally approximated by deep ReLU/RePU networks.
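To make the objects in the abstract concrete, the following is a minimal illustrative sketch and not code from the paper. It defines the two rectifier activations in their standard form (ReLU as $\max(0,x)$, RePU of order $r$ as $\max(0,x)^r$) and solves the critical-line relation $1/τ=α/d+1/p$ for $τ$. The function names `relu`, `repu`, and `critical_tau` are hypothetical, and the sketch assumes a finite $p>0$.

```python
from fractions import Fraction


def relu(x: float) -> float:
    """Rectified linear unit: max(0, x)."""
    return max(0.0, x)


def repu(x: float, r: int = 2) -> float:
    """Rectified power unit of order r: max(0, x) ** r (r = 1 gives ReLU)."""
    return max(0.0, x) ** r


def critical_tau(alpha, d: int, p) -> Fraction:
    """Solve 1/tau = alpha/d + 1/p for tau, using exact rational arithmetic.

    Assumption of this sketch: alpha > 0, d >= 1, and p is finite and positive.
    """
    inv_tau = Fraction(alpha) / d + 1 / Fraction(p)
    return 1 / inv_tau


# Smoothness alpha = 2 in dimension d = 3, error measured in L^2:
# 1/tau = 2/3 + 1/2 = 7/6, so tau = 6/7.
tau = critical_tau(2, 3, 2)
print(tau)
```

In this example $τ<1$, so $L^τ$ is only a quasi-normed space. This is typical of the critical line once $α/d$ is large relative to $1-1/p$.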