Uniform Convergence of Deep Neural Networks with Lipschitz Continuous Activation Functions and Variable Widths
This work addresses theoretical convergence questions relevant to deep learning practitioners and offers a framework applicable to common activation functions. It is incremental, however, building on existing convergence analysis for networks with variable widths.
The paper studies when deep neural networks with Lipschitz continuous activation functions and variable widths converge uniformly as the number of layers increases. It provides sufficient conditions on the weights, the biases, and the Lipschitz constant that guarantee convergence to a meaningful function. Specific results cover fixed, bounded, and unbounded widths, as well as convolutional neural networks.
We consider deep neural networks with a Lipschitz continuous activation function and with weight matrices of variable widths. We establish a uniform convergence analysis framework in which sufficient conditions on weight matrices and bias vectors, together with the Lipschitz constant, are provided to ensure uniform convergence of the deep neural networks to a meaningful function as the number of their layers tends to infinity. Within this framework, special results on uniform convergence of deep neural networks with a fixed width, bounded widths, and unbounded widths are presented. In particular, as convolutional neural networks are special deep neural networks with weight matrices of increasing widths, we put forward conditions on the mask sequence which lead to uniform convergence of the resulting convolutional neural networks. The Lipschitz continuity assumption on the activation functions allows us to include in our theory most of the commonly used activation functions in applications.
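To make the notion of convergence in depth concrete, the following is a minimal numerical sketch for the fixed-width case. The abstract does not state the sufficient conditions themselves, so the setting here is an assumption chosen for illustration only: ReLU as the activation (Lipschitz constant 1), weight matrices of the form identity plus a perturbation of size 0.1/n^2, and biases of size 0.1/n^2, so that both perturbation sequences are summable. The helper names `make_layers`, `forward`, and `sup_gap` are hypothetical.

```python
import random


def relu(v):
    # ReLU is Lipschitz continuous with constant 1.
    return [max(0.0, t) for t in v]


def make_layers(depth, width, seed=0):
    # Assumed setting (not taken from the abstract): layer n has
    # W_n = I + P_n and bias b_n with entries bounded by 0.1 / n^2,
    # so the perturbations are summable over n.
    rng = random.Random(seed)
    layers = []
    for n in range(1, depth + 1):
        scale = 0.1 / n**2
        W = [[(1.0 if i == j else 0.0) + scale * rng.uniform(-1.0, 1.0)
              for j in range(width)] for i in range(width)]
        b = [scale * rng.uniform(-1.0, 1.0) for _ in range(width)]
        layers.append((W, b))
    return layers


def forward(layers, x, depth):
    # Output of the network truncated to its first `depth` layers.
    y = list(x)
    for W, b in layers[:depth]:
        y = relu([sum(w * t for w, t in zip(row, y)) + bi
                  for row, bi in zip(W, b)])
    return y


def sup_gap(layers, points, d1, d2):
    # Largest componentwise difference between the depth-d1 and
    # depth-d2 networks over the sample points (a proxy for the
    # uniform norm on the input domain).
    return max(
        max(abs(a - c) for a, c in zip(forward(layers, x, d1),
                                       forward(layers, x, d2)))
        for x in points
    )


layers = make_layers(200, 3, seed=1)
rng = random.Random(2)
points = [[rng.uniform(-1.0, 1.0) for _ in range(3)] for _ in range(20)]
for d in (10, 50, 100):
    print(d, sup_gap(layers, points, d, 2 * d))
```

Under this assumed choice, the gap between the depth-d and depth-2d networks is controlled by the tail of the summable series, so the printed values can be expected to shrink as d grows. This illustrates the kind of behaviour the framework is meant to guarantee; it does not reproduce the paper's conditions or proofs.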