Quasi-Equivalence of Width and Depth of Neural Networks
This work addresses a foundational question in neural network theory: how the width and depth of a network interact, and whether architecture design should prefer one over the other.
The authors investigate whether neural network design should favor width or depth. They establish a quasi-equivalence between the two for ReLU networks by developing transforms that map an arbitrary network to a wide or a deep version with essentially the same capability. They then extend the result to quadratic neurons, using the factorization and continued fraction representations of the same polynomial to show that deep and wide networks are interchangeable up to an arbitrarily small error.
While classic studies proved that wide networks allow universal approximation, recent research and the successes of deep learning demonstrate the power of deep networks. Based on a symmetric consideration, we investigate whether the design of artificial neural networks should have a directional preference, and what the mechanism of interaction is between the width and depth of a network. Inspired by De Morgan's law, we address this fundamental question by establishing a quasi-equivalence between the width and depth of ReLU networks in two aspects. First, we formulate two transforms for mapping an arbitrary ReLU network to a wide network and a deep network, respectively, for either regression or classification, so that essentially the same capability of the original network can be implemented. Then, we replace the mainstream artificial neuron type with a quadratic counterpart, and utilize the factorization and continued fraction representations of the same polynomial function to construct a wide network and a deep network, respectively. Based on our findings, a deep network has a wide equivalent, and vice versa, subject to an arbitrarily small error.
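As a toy illustration of the width-depth trade-off for ReLU networks, the sketch below builds the same piecewise-linear function on [0, 1] in two ways: as a deep, narrow network (two stacked layers of two ReLU units each) and as a wide, shallow network (one layer of four ReLU units). This is not the transform formulated in the paper; the function names `hat`, `deep_narrow`, and `wide_shallow` and the chosen weights are assumptions made purely for the example.

```python
def relu(z):
    """Rectified linear unit."""
    return max(0.0, z)


def hat(x):
    """One hidden layer with two ReLU units.

    On [0, 1] this is a tent: it rises from 0 to 1 at x = 0.5, then falls back to 0.
    """
    return 2.0 * relu(x) - 4.0 * relu(x - 0.5)


def deep_narrow(x):
    """Depth 2, width 2: compose the tent with itself to get two teeth."""
    return hat(hat(x))


def wide_shallow(x):
    """Depth 1, width 4: the same two-tooth function written as a single layer."""
    return (4.0 * relu(x)
            - 8.0 * relu(x - 0.25)
            + 8.0 * relu(x - 0.5)
            - 8.0 * relu(x - 0.75))


# Compare the two networks on a grid over [0, 1] (illustrative check only).
grid = [i / 64 for i in range(65)]
max_gap = max(abs(deep_narrow(x) - wide_shallow(x)) for x in grid)
print("largest difference on the grid:", max_gap)
```

In this hand-built case the two networks agree on [0, 1]. The paper's claim is more general: it concerns arbitrary ReLU networks and an equivalence that holds up to an arbitrarily small error.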