Minimum Width of Deep Narrow Networks for Universal Approximation
This work addresses a fundamental theoretical problem in neural network design, providing specific bounds that are incremental to existing knowledge in the field.
The paper tackles the problem of determining the minimum width of fully connected neural networks for universal approximation, showing bounds such as w_min ≤ max(2d_x+1, d_y) for networks with ELU and SELU, and d_x+1 ≤ w_min ≤ d_x+d_y for networks with LeakyReLU, ELU, CELU, SELU, and Softplus.
Determining the minimum width of fully connected neural networks has become a fundamental problem in recent theoretical studies of deep neural networks. In this paper, we study the lower bounds and upper bounds of the minimum width required for fully connected neural networks in order to have universal approximation capability, which is important in network design and training. We show that $w_{min}\leq\max(2d_x+1, d_y)$ for networks with ELU, SELU, and the upper bound of this inequality is attained when $d_y=2d_x$, where $d_x$, $d_y$ denote the input and output dimensions, respectively. Besides, we show that $d_x+1\leq w_{min}\leq d_x+d_y$ for networks with LeakyReLU, ELU, CELU, SELU, Softplus, by proving that ReLU can be approximated by these activation functions. In addition, in the case that the activation function is injective or can be uniformly approximated by a sequence of injective functions (e.g., ReLU), we present a new proof of the inequality $w_{min}\ge d_y+\mathbf{1}_{d_x<d_y\leq2d_x}$ by constructing a more intuitive example via a new geometric approach based on Poincar$\acute{\text{e}}$-Miranda Theorem.