On Enhancing Expressive Power via Compositions of Single Fixed-Size ReLU Network
This work offers researchers in machine learning theory insight into the approximation capabilities of continuous-depth networks, though the contribution is incremental because it builds on existing function-composition frameworks.
The paper studies the expressive power of deep neural networks. It shows that repeated compositions of a single fixed-size ReLU network can approximate 1-Lipschitz continuous functions on [0,1]^d with an error of O(r^{-1/d}), where r is the number of compositions, and it extends this result to generic continuous functions.
This paper explores the expressive power of deep neural networks through the framework of function compositions. We demonstrate that the repeated compositions of a single fixed-size ReLU network exhibit surprising expressive power, despite the limited expressive capabilities of the individual network itself. Specifically, we prove by construction that $\mathcal{L}_2\circ \boldsymbol{g}^{\circ r}\circ \boldsymbol{\mathcal{L}}_1$ can approximate $1$-Lipschitz continuous functions on $[0,1]^d$ with an error $\mathcal{O}(r^{-1/d})$, where $\boldsymbol{g}$ is realized by a fixed-size ReLU network, $\boldsymbol{\mathcal{L}}_1$ and $\mathcal{L}_2$ are two affine linear maps matching the dimensions, and $\boldsymbol{g}^{\circ r}$ denotes the $r$-times composition of $\boldsymbol{g}$. Furthermore, we extend such a result to generic continuous functions on $[0,1]^d$ with the approximation error characterized by the modulus of continuity. Our results reveal that a continuous-depth network generated via a dynamical system has immense approximation power even if its dynamics function is time-independent and realized by a fixed-size ReLU network.
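To make the architecture $\mathcal{L}_2\circ \boldsymbol{g}^{\circ r}\circ \boldsymbol{\mathcal{L}}_1$ concrete, the following is a minimal NumPy sketch of its structure only. It is not the construction used in the proof: the weights are random, and the hidden dimension `n`, the width and depth of `g`, and all function names are hypothetical choices made for illustration. The point it shows is that the same block `g` is reused at every step, so the parameter count does not depend on the number of compositions `r`.

```python
import numpy as np


def relu(x):
    return np.maximum(x, 0.0)


def make_affine(rng, n_in, n_out):
    """Return (W, b) for the affine map z -> W @ z + b (random W, zero b; illustrative only)."""
    W = rng.standard_normal((n_out, n_in)) / np.sqrt(n_in)
    b = np.zeros(n_out)
    return W, b


def make_g(rng, n, width, depth):
    """Build a fixed-size ReLU network g: R^n -> R^n with `depth` hidden layers."""
    sizes = [n] + [width] * depth + [n]
    return [make_affine(rng, a, b) for a, b in zip(sizes[:-1], sizes[1:])]


def apply_g(g_layers, z):
    """Apply g once: ReLU after every layer except the last."""
    for W, b in g_layers[:-1]:
        z = relu(W @ z + b)
    W, b = g_layers[-1]
    return W @ z + b


def composed_network(x, L1, g_layers, L2, r):
    """Evaluate L2(g(g(...g(L1(x))...))) with g applied r times."""
    W1, b1 = L1
    z = W1 @ x + b1          # L1 lifts the input from R^d to R^n
    for _ in range(r):       # the same g (same weights) at every step
        z = apply_g(g_layers, z)
    W2, b2 = L2
    return W2 @ z + b2       # L2 maps R^n down to a scalar output


def num_params(L1, g_layers, L2):
    """Total number of trainable parameters; note that r does not appear."""
    return sum(W.size + b.size for W, b in [L1, *g_layers, L2])


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    d, n = 2, 8              # hypothetical input and hidden dimensions
    L1 = make_affine(rng, d, n)
    g_layers = make_g(rng, n, width=16, depth=2)
    L2 = make_affine(rng, n, 1)
    x = np.array([0.3, 0.7])  # a point in [0,1]^d
    for r in (1, 5, 20):
        print(r, composed_network(x, L1, g_layers, L2, r), num_params(L1, g_layers, L2))
```

The loop over `r` is also a discrete-time view of the dynamical-system interpretation: the state `z` evolves under a time-independent map `g`. Under the stated rate $\mathcal{O}(r^{-1/d})$, taking $r$ on the order of $\varepsilon^{-d}$ (up to constants) suffices for error $\varepsilon$ on $1$-Lipschitz targets, so accuracy is bought with more compositions rather than more parameters.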