ML, CC, LG, CO, ST | Oct 31, 2017

Approximating Continuous Functions by ReLU Nets of Minimal Width

arXiv:1710.11278v2 | 276 citations
Originality: Highly original
AI Analysis

The paper gives a foundational theoretical result on neural network expressivity, relevant to both researchers and practitioners in deep learning.

The paper determines the minimal width required for ReLU neural nets of arbitrary depth to approximate any continuous real-valued function of $d_{in}$ variables: it is exactly $d_{in}+1$. It also constructs nets of width $d_{in}+d_{out}$ that approximate any continuous $f:[0,1]^{d_{in}}\to\mathbb R^{d_{out}}$, with quantitative depth estimates in terms of the modulus of continuity of $f$.

This article concerns the expressive power of depth in deep feed-forward neural nets with ReLU activations. Specifically, we answer the following question: for a fixed $d_{in}\geq 1,$ what is the minimal width $w$ so that neural nets with ReLU activations, input dimension $d_{in}$, hidden layer widths at most $w,$ and arbitrary depth can approximate any continuous, real-valued function of $d_{in}$ variables arbitrarily well? It turns out that this minimal width is exactly equal to $d_{in}+1.$ That is, if all the hidden layer widths are bounded by $d_{in}$, then even in the infinite depth limit, ReLU nets can only express a very limited class of functions, and, on the other hand, any continuous function on the $d_{in}$-dimensional unit cube can be approximated to arbitrary precision by ReLU nets in which all hidden layers have width exactly $d_{in}+1.$ Our construction in fact shows that any continuous function $f:[0,1]^{d_{in}}\to\mathbb R^{d_{out}}$ can be approximated by a net of width $d_{in}+d_{out}$. We obtain quantitative depth estimates for such an approximation in terms of the modulus of continuity of $f$.
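To make the width bound concrete, here is a minimal sketch of how one neuron beyond the $d_{in}$ input coordinates can be enough for a restricted class of targets: the maximum of finitely many affine functions on $[0,1]^{d_{in}}$. It uses the identity $\max(u,v)=v+\mathrm{ReLU}(u-v)$ and the fact that $\mathrm{ReLU}(x)=x$ on the unit cube. The function name `narrow_max_net` and the arrays `A`, `b` are hypothetical, chosen for illustration. This is a standard trick shown under those assumptions; the abstract does not spell out the paper's construction, and this sketch is not presented as that construction.

```python
import numpy as np


def relu(z):
    return np.maximum(z, 0.0)


def narrow_max_net(x, A, b):
    """Evaluate max_k (A[k] @ x + b[k]) for x in [0, 1]^d_in.

    Every hidden layer has width d_in + 1:
      - d_in neurons carry x forward (ReLU is the identity on [0, 1]);
      - one neuron stores t = ReLU(running_max - g_k(x)).
    The running max is recovered as g_k(x) + t, which is affine in the
    hidden state, so it can be folded into the next layer's weights.
    """
    A = np.asarray(A, dtype=float)
    b = np.asarray(b, dtype=float)
    carry = np.asarray(x, dtype=float)  # layer input; equals x throughout
    n_pieces, d_in = A.shape
    t = 0.0   # extra neuron's value (no hidden layer applied yet)
    prev = 0  # index of the affine piece used in the last layer

    for k in range(1, n_pieces):
        running_max = A[prev] @ carry + b[prev] + t  # affine in (carry, t)
        g_k = A[k] @ carry + b[k]                    # affine in carry
        pre_activation = np.concatenate([carry, [running_max - g_k]])
        hidden = relu(pre_activation)                # width d_in + 1
        carry, t = hidden[:d_in], hidden[d_in]
        prev = k

    # Affine output layer: g_prev(x) + t equals the max over all pieces.
    return A[prev] @ carry + b[prev] + t


# Usage: compare against a direct evaluation at a random point of the cube.
rng = np.random.default_rng(0)
A_demo = rng.normal(size=(4, 2))
b_demo = rng.normal(size=4)
x_demo = rng.uniform(size=2)
net_value = narrow_max_net(x_demo, A_demo, b_demo)
direct_value = np.max(A_demo @ x_demo + b_demo)
```

The depth grows with the number of affine pieces while the width stays fixed at $d_{in}+1$, which mirrors the trade-off described in the abstract: narrow nets pay for expressivity with depth.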
