Function-Space Empirical Bayes Regularisation with Large Vision-Language Model Priors
This work addresses the problem of scaling Bayesian deep learning to high-dimensional data for researchers and practitioners needing reliable uncertainty quantification.
The paper tackles the challenge of designing informative prior distributions in Bayesian deep learning by proposing VLM-FS-EB, a function-space empirical Bayes regularisation framework that uses large vision-language models to generate semantically meaningful context points for constructing expressive functional priors. In the reported experiments, the method consistently improves predictive performance over the baselines and yields more reliable uncertainty estimates, particularly in out-of-distribution detection tasks and data-scarce regimes.
Bayesian deep learning (BDL) provides a principled framework for reliable uncertainty quantification by combining deep neural networks with Bayesian inference. A central challenge in BDL lies in the design of informative prior distributions that scale effectively to high-dimensional data. Recent functional variational inference (VI) approaches address this issue by imposing priors directly in function space; however, most existing methods rely on Gaussian process (GP) priors, whose expressiveness and generalisation capabilities become limited in high-dimensional regimes. In this work, we propose VLM-FS-EB, a novel function-space empirical Bayes regularisation framework that leverages large vision-language models (VLMs) to generate semantically meaningful context points. These synthetic samples are then embedded using the VLMs, and the resulting embeddings are used to construct expressive functional priors. We evaluate the proposed method against various baselines; the experimental results show that it consistently improves predictive performance and yields more reliable uncertainty estimates, particularly in out-of-distribution (OOD) detection tasks and data-scarce regimes.
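
To make the idea of a functional prior built from embeddings of context points more concrete, the following is a minimal sketch and not the paper's implementation. It assumes a zero-mean Gaussian prior over the network outputs at the context points, with covariance K = tau * E E^T + jitter * I, where E holds one embedding per context point. The function name `functional_prior_penalty`, the parameters `tau` and `jitter`, and the random arrays standing in for VLM embeddings and network outputs are all hypothetical.

```python
import numpy as np


def functional_prior_penalty(f_ctx, emb_ctx, tau=1.0, jitter=1e-3):
    """Return 0.5 * f^T K^{-1} f, summed over output dimensions.

    f_ctx:   network outputs at the context points, shape (n,) or (n, k).
    emb_ctx: embeddings of the same context points, shape (n, d).
    The covariance K = tau * E E^T + jitter * I is an assumed form,
    chosen for illustration only.
    """
    n = emb_ctx.shape[0]
    K = tau * emb_ctx @ emb_ctx.T + jitter * np.eye(n)
    # K is symmetric positive definite because jitter > 0.
    L = np.linalg.cholesky(K)
    # Solve L a = f, so that sum(a^2) equals f^T K^{-1} f.
    a = np.linalg.solve(L, f_ctx)
    return 0.5 * float(np.sum(a ** 2))


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    # Placeholder data: 8 context points, 16-dim embeddings, 3 outputs.
    emb = rng.normal(size=(8, 16))
    outputs = rng.normal(size=(8, 3))
    penalty = functional_prior_penalty(outputs, emb)
    # In training, a term like this would be added to the data-fit loss.
    print(f"regularisation penalty: {penalty:.4f}")
```

Under this assumed covariance, outputs that vary across context points in the way their embeddings do are penalised less than outputs that disagree with the embedding geometry, which is the sense in which the embeddings make the prior informative.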