MLST · Jul 26, 2016

Approximation by Combinations of ReLU and Squared ReLU Ridge Functions with $ \ell^1 $ and $ \ell^0 $ Controls

arXiv:1607.07819v3 · 174 citations
Originality: Incremental advance
AI Analysis

This theoretical work is aimed at researchers studying neural network approximation. It offers incremental improvements in the error analysis of sparse combinations of ridge functions.

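The paper studies the approximation of multivariate functions by linear combinations of ReLU and squared ReLU ridge functions under $ \ell^1 $ and $ \ell^0 $ controls on the inner and outer parameters. It establishes $ L^{\infty} $ and $ L^2 $ error bounds. For the squared ReLU, the $ L^2 $ error is inversely proportional to the inner layer $ \ell^0 $ sparsity and need only be sublinear in the outer layer $ \ell^0 $ sparsity. Companion lower bounds show that the constructions are near optimal.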

We establish $ L^{\infty} $ and $ L^2 $ error bounds for functions of many variables that are approximated by linear combinations of ReLU (rectified linear unit) and squared ReLU ridge functions with $ \ell^1 $ and $ \ell^0 $ controls on their inner and outer parameters. With the squared ReLU ridge function, we show that the $ L^2 $ approximation error is inversely proportional to the inner layer $ \ell^0 $ sparsity and it need only be sublinear in the outer layer $ \ell^0 $ sparsity. Our constructions are obtained using a variant of the Jones-Barron probabilistic method, which can be interpreted as either stratified sampling with proportionate allocation or two-stage cluster sampling. We also provide companion error lower bounds that reveal near optimality of our constructions. Despite the sparsity assumptions, we showcase the richness and flexibility of these ridge combinations by defining a large family of functions, in terms of certain spectral conditions, that are particularly well approximated by them.
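The probabilistic method mentioned in the abstract can be illustrated with a small numerical sketch. This is not the paper's construction: it uses plain i.i.d. sampling (the classical Jones-Barron argument) rather than the stratified variant the authors describe, and the target function, the dictionary size, the input domain $[-1, 1]^d$, and all names (`relu_ridge`, `sample_sparse_approx`, `demo`) are assumptions made up for illustration. The target is a dense combination of ridge units with outer weights of unit $ \ell^1 $ norm; sampling $m$ units with probability proportional to $|c_k|$ gives an approximant with at most $m$ non-zero outer weights whose mean squared error should shrink roughly like $1/m$.

```python
import numpy as np


def relu_ridge(X, W, b, power=1):
    """Evaluate ridge units max(0, <w_k, x> - b_k)**power.

    power=1 gives ReLU units, power=2 gives squared ReLU units.
    Returns an array of shape (n_points, n_units).
    """
    return np.maximum(X @ W.T - b, 0.0) ** power


def sample_sparse_approx(c, m, rng):
    """Draw m units i.i.d. with probability |c_k| / ||c||_1.

    Returns outer weights with at most m non-zero entries whose
    expectation equals c (plain i.i.d. sampling, an illustrative
    stand-in for the paper's stratified scheme).
    """
    v = np.abs(c).sum()
    idx = rng.choice(len(c), size=m, p=np.abs(c) / v)
    c_hat = np.zeros_like(c)
    # Repeated indices accumulate, so each draw contributes v * sign / m.
    np.add.at(c_hat, idx, v * np.sign(c[idx]) / m)
    return c_hat


def demo(seed=0, d=5, n_units=2000, n_points=500, power=2):
    """Empirical mean squared error of the sampled approximant for several m."""
    rng = np.random.default_rng(seed)

    # Hypothetical dictionary: inner weights with unit l1 norm, shifts in [-1, 1].
    W = rng.normal(size=(n_units, d))
    W /= np.abs(W).sum(axis=1, keepdims=True)
    b = rng.uniform(-1.0, 1.0, size=n_units)

    # Hypothetical dense target: outer weights with unit l1 norm.
    c = rng.normal(size=n_units)
    c /= np.abs(c).sum()

    X = rng.uniform(-1.0, 1.0, size=(n_points, d))
    Phi = relu_ridge(X, W, b, power)
    f = Phi @ c

    errors = {}
    for m in (10, 100, 1000):
        trials = [
            np.mean((Phi @ sample_sparse_approx(c, m, rng) - f) ** 2)
            for _ in range(20)
        ]
        errors[m] = float(np.mean(trials))
    return errors


if __name__ == "__main__":
    for m, err in demo().items():
        print(f"m = {m:5d}   mean squared error = {err:.2e}")
```

Under these assumptions each unit is bounded by 2 (by 4 when squared) on the sampled domain, so the expected squared error at any point is at most $16/m$ for `power=2`. The sketch only checks this $1/m$ behaviour empirically and does not reproduce the sharper rates or the inner layer sparsity controls established in the paper.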

Foundations

The foundational work for this paper's niche, ranked by how specifically the neighbourhood builds on it — not by global fame.
