MLCGLGFASTAug 9, 2017

Universal Function Approximation by Deep Neural Nets with Bounded Width and ReLU Activations

arXiv:1708.02691v3390 citations
Originality Highly original
AI Analysis

This addresses a foundational theoretical question in machine learning about the expressive power of neural network architecture, with implications for model design and efficiency.

The paper tackles the problem of determining the minimal width required for deep ReLU neural networks to universally approximate continuous functions on the unit cube, proving that width d+1 suffices for convex functions and providing quantitative depth estimates for width d+3.

This article concerns the expressive power of depth in neural nets with ReLU activations and bounded width. We are particularly interested in the following questions: what is the minimal width $w_{\text{min}}(d)$ so that ReLU nets of width $w_{\text{min}}(d)$ (and arbitrary depth) can approximate any continuous function on the unit cube $[0,1]^d$ aribitrarily well? For ReLU nets near this minimal width, what can one say about the depth necessary to approximate a given function? Our approach to this paper is based on the observation that, due to the convexity of the ReLU activation, ReLU nets are particularly well-suited for representing convex functions. In particular, we prove that ReLU nets with width $d+1$ can approximate any continuous convex function of $d$ variables arbitrarily well. These results then give quantitative depth estimates for the rate of approximation of any continuous scalar function on the $d$-dimensional cube $[0,1]^d$ by ReLU nets with width $d+3.$

Foundations

The foundational work for this paper's niche, ranked by how specifically the neighbourhood builds on it — not by global fame.

Your Notes