PR · ML · Mar 6

Large deviation principles for convolutional Bayesian neural networks

Federico Bassetti, Vassili De Palma, Lucia Ladelli

arXiv:2603.06023v1 · 8.3 · h-index: 17

Predicted impact: top 32% in PR · last 90 days · Originality: Highly original

AI Analysis

This work provides a foundational theoretical understanding of the non-Gaussian behavior of infinite-channel CNNs, which is important for researchers studying the theoretical underpinnings of neural networks.

This paper establishes large deviation principles (LDPs) for convolutional neural networks (CNNs) in the infinite-channel regime: one for the sequence of conditional covariance matrices under a Gaussian prior on the weights, and one for the posterior distribution. These results describe CNN behavior beyond the known Gaussian limit.

While suitably scaled CNNs with Gaussian initialization are known to converge to Gaussian processes as the number of channels diverges, little is known beyond this Gaussian limit. We establish a large deviation principle (LDP) for convolutional neural networks in the infinite-channel regime. We consider a broad class of multidimensional CNN architectures characterized by general receptive fields encoded through a patch-extractor function satisfying mild structural assumptions. Our main result is an LDP for the sequence of conditional covariance matrices under a Gaussian prior distribution on the weights. We further derive an LDP for the posterior distribution obtained by conditioning on a finite number of observations. In addition, we provide a streamlined proof of the concentration of the conditional covariances and of the Gaussian equivalence of the network. To the best of our knowledge, this is the first large deviation principle established for convolutional neural networks.
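
For readers unfamiliar with the term, the standard textbook definition of a large deviation principle may help. The symbols below (the measures \mu_n, the speed a_n, the rate function I, the space \mathcal{X}) are generic notation assumed here for illustration; the abstract does not state the speed or the rate function obtained in the paper. A sequence of probability measures (\mu_n) on a topological space \mathcal{X} satisfies an LDP with speed a_n \to \infty and lower semicontinuous rate function I : \mathcal{X} \to [0, \infty] if

\[
\limsup_{n \to \infty} \frac{1}{a_n} \log \mu_n(F) \le - \inf_{x \in F} I(x)
\quad \text{for every closed set } F \subseteq \mathcal{X},
\]
\[
\liminf_{n \to \infty} \frac{1}{a_n} \log \mu_n(G) \ge - \inf_{x \in G} I(x)
\quad \text{for every open set } G \subseteq \mathcal{X}.
\]

Informally, \mu_n(A) decays like \exp(-a_n \inf_{x \in A} I(x)), so the rate function quantifies how unlikely atypical outcomes are. In the setting of the abstract, \mu_n would presumably be the law of the conditional covariance matrices (or the posterior), indexed by the number of channels.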
