ML, LG | Oct 16, 2020

The Ridgelet Prior: A Covariance Function Approach to Prior Specification for Bayesian Neural Networks

Takuo Matsubara, Chris J. Oates, François-Xavier Briol

arXiv:2010.08488v4 | 13.024 citations | Has Code

Originality: Incremental advance

AI Analysis

The paper addresses uncertainty quantification in Bayesian neural networks. The advance is incremental, since it builds on existing connections between neural networks and Gaussian processes.

The paper tackles the problem of specifying meaningful prior distributions for Bayesian neural networks. It proposes the ridgelet prior, a prior on the network parameters that approximates a user-specified Gaussian process covariance function in the output space, and it provides non-asymptotic error bounds for that approximation. In proof-of-concept experiments, the ridgelet prior can outperform an unstructured prior on regression tasks for which a suitable Gaussian process prior is available.

Bayesian neural networks attempt to combine the strong predictive performance of neural networks with formal quantification of uncertainty associated with the predictive output in the Bayesian framework. However, it remains unclear how to endow the parameters of the network with a prior distribution that is meaningful when lifted into the output space of the network. A possible solution is proposed that enables the user to posit an appropriate Gaussian process covariance function for the task at hand. Our approach constructs a prior distribution for the parameters of the network, called a ridgelet prior, that approximates the posited Gaussian process in the output space of the network. In contrast to existing work on the connection between neural networks and Gaussian processes, our analysis is non-asymptotic, with finite sample-size error bounds provided. This establishes the universality property that a Bayesian neural network can approximate any Gaussian process whose covariance function is sufficiently regular. Our experimental assessment is limited to a proof-of-concept, where we demonstrate that the ridgelet prior can out-perform an unstructured prior on regression problems for which a suitable Gaussian process prior can be provided.
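
To make the phrase "lifted into the output space" concrete, the following is a minimal sketch, not the ridgelet construction itself. It draws functions from a one-hidden-layer tanh network whose parameters have an unstructured i.i.d. standard Gaussian prior, estimates the covariance that this prior induces on the network outputs by Monte Carlo, and compares it with a squared-exponential Gaussian process covariance. The network width, activation, lengthscale, input grid, and all function names are assumptions chosen for illustration; none of them come from the paper.

```python
import numpy as np


def sample_network_outputs(x, n_hidden=200, n_draws=1000, seed=0):
    """Sample outputs of a one-hidden-layer tanh network at inputs x (shape (n,)).

    All parameters are i.i.d. N(0, 1); output weights are scaled by
    1/sqrt(n_hidden). This is an unstructured prior, used only for illustration.
    """
    rng = np.random.default_rng(seed)
    w = rng.standard_normal((n_draws, n_hidden))  # input-to-hidden weights
    b = rng.standard_normal((n_draws, n_hidden))  # hidden biases
    v = rng.standard_normal((n_draws, n_hidden)) / np.sqrt(n_hidden)  # output weights
    # hidden activations, shape (n_draws, n_hidden, n)
    hidden = np.tanh(w[:, :, None] * x[None, None, :] + b[:, :, None])
    # network outputs, shape (n_draws, n)
    return np.einsum("dh,dhn->dn", v, hidden)


def squared_exponential(x, lengthscale=1.0):
    """Squared-exponential covariance matrix on the 1-D inputs x."""
    diff = x[:, None] - x[None, :]
    return np.exp(-0.5 * (diff / lengthscale) ** 2)


x = np.linspace(-3.0, 3.0, 15)
outputs = sample_network_outputs(x)

# Covariance induced in the output space by the parameter prior.
induced_cov = np.cov(outputs, rowvar=False)
# Covariance a user might actually want to posit.
target_cov = squared_exponential(x)

gap = np.max(np.abs(induced_cov - target_cov))
print(f"largest entrywise gap between induced and target covariance: {gap:.3f}")
```

The gap printed here depends on the arbitrary choices above. The point of the sketch is only that an unstructured parameter prior induces some covariance in the output space that the user did not choose, which is the mismatch the ridgelet prior is designed to remove.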
