ML LG PR May 15, 2024

Spectral complexity of deep neural networks

Simmaco Di Lillo, Domenico Marinucci, Michele Salvi, Stefano Vigogna

arXiv:2405.09541v410.74 citations h-index: 7 SIAM J Math Data Sci

Originality: Incremental advance

AI Analysis

This work provides a theoretical framework for understanding architectural complexity in deep learning. The advance is incremental, as it builds on known Gaussian process limits.

The authors characterize the complexity of deep neural network architectures by analyzing the angular power spectrum of the infinite-width limiting field as the depth diverges. This yields a classification of networks as low-disorder, sparse, or high-disorder, which is validated by numerical simulations.

It is well-known that randomly initialized, push-forward, fully-connected neural networks weakly converge to isotropic Gaussian processes, in the limit where the width of all layers goes to infinity. In this paper, we propose to use the angular power spectrum of the limiting field to characterize the complexity of the network architecture. In particular, we define sequences of random variables associated with the angular power spectrum, and provide a full characterization of the network complexity in terms of the asymptotic distribution of these sequences as the depth diverges. On this basis, we classify neural networks as low-disorder, sparse, or high-disorder; we show how this classification highlights a number of distinct features for standard activation functions, and in particular, sparsity properties of ReLU networks. Our theoretical results are also validated by numerical simulations.
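To make the idea of an angular power spectrum concrete, the following is a minimal numerical sketch, not the authors' code. It rests on several assumptions that are not stated in the abstract: inputs live on the unit sphere S^2, the activation is ReLU with the usual variance-preserving normalization and no bias (so the infinite-width kernel is the normalized arc-cosine kernel composed once per layer), and the spectrum is turned into a probability distribution over multipoles by taking the Legendre weights of the kernel. The names `relu_kernel` and `spectral_weights` are hypothetical, and the paper's exact definitions may differ.

```python
import numpy as np
from numpy.polynomial import legendre


def relu_kernel(u):
    """Normalized ReLU (arc-cosine) kernel on [-1, 1], scaled so kappa(1) = 1.

    Assumption: variance-preserving initialization and no bias terms.
    """
    u = np.clip(u, -1.0, 1.0)  # guard against round-off outside [-1, 1]
    return (np.sqrt(1.0 - u * u) + u * (np.pi - np.arccos(u))) / np.pi


def spectral_weights(depth, ell_max=40, n_nodes=1000):
    """Legendre weights p_ell of the depth-`depth` kernel on S^2.

    The kernel is expanded as K(u) = sum_ell p_ell * P_ell(u), so that
    p_ell = (2*ell + 1) / 2 * integral of K(u) * P_ell(u) over [-1, 1].
    Since P_ell(1) = 1 and K(1) = 1, the full sequence sums to 1 and can
    be read as a probability distribution over multipoles ell.
    """
    # Gauss-Legendre nodes and weights for integration over [-1, 1].
    nodes, quad_w = legendre.leggauss(n_nodes)

    # Infinite-width kernel of a depth-`depth` network: iterate the map.
    k = nodes.copy()
    for _ in range(depth):
        k = relu_kernel(k)

    weights = np.empty(ell_max + 1)
    for ell in range(ell_max + 1):
        coeffs = np.zeros(ell + 1)
        coeffs[ell] = 1.0  # selects the Legendre polynomial P_ell
        p_ell = legendre.legval(nodes, coeffs)
        weights[ell] = 0.5 * (2 * ell + 1) * np.sum(quad_w * k * p_ell)
    return weights


if __name__ == "__main__":
    for depth in (1, 2, 5):
        w = spectral_weights(depth)
        print(depth, np.round(w[:6], 4))
```

For a single layer under these assumptions, the weights at multipoles 0 and 1 work out to 3/8 and 1/2, and every odd multipole from 3 upward vanishes, because the odd part of the normalized arc-cosine kernel is exactly u/2. This is a property of the kernel used in the sketch and only illustrates the kind of spectral structure the abstract refers to. Increasing `depth` shows how the weights redistribute across multipoles.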
