Marginalising over Stationary Kernels with Bayesian Quadrature
This work addresses the computational cost of kernel marginalisation for Gaussian Processes, making it more practical for larger datasets, though the contribution appears incremental because it builds on existing Bayesian Quadrature and kernel methods.
The paper tackles the computational expense of marginalising over Gaussian Process kernels, a step that makes models more flexible. It proposes a Bayesian Quadrature scheme that uses maximum mean discrepancies to define a kernel over kernels and selects kernel samples with an information-theoretic acquisition function. The reported result is more accurate predictions with better-calibrated uncertainty than state-of-the-art baselines, particularly under limited wall-clock time budgets.
Marginalising over families of Gaussian Process kernels produces flexible model classes with well-calibrated uncertainty estimates. Existing approaches require likelihood evaluations of many kernels, rendering them prohibitively expensive for larger datasets. We propose a Bayesian Quadrature scheme to make this marginalisation more efficient and thereby more practical. Using the maximum mean discrepancy between distributions, we define a kernel over kernels that captures invariances between Spectral Mixture (SM) kernels. Kernel samples are selected by generalising an information-theoretic acquisition function for warped Bayesian Quadrature. We show that our framework achieves more accurate predictions with better-calibrated uncertainty than state-of-the-art baselines, especially when given limited (wall-clock) time budgets.
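
To make the "kernel over kernels" idea concrete, here is a minimal sketch, not the paper's implementation. It assumes a one-dimensional SM kernel whose normalised spectral density is treated as a Gaussian mixture (the symmetrisation about zero is ignored), an RBF base kernel for the maximum mean discrepancy (MMD), and an exponentiated-MMD form for the kernel over kernels. The function names and the parameters `ell` and `scale` are hypothetical. The sketch shows one invariance that an MMD-based construction gives for free: reordering the mixture components leaves the kernel value unchanged.

```python
import numpy as np

def expected_rbf(m1, v1, m2, v2, ell):
    """Closed-form E[k(x, y)] for x ~ N(m1, v1), y ~ N(m2, v2),
    where k is an RBF kernel with lengthscale ell."""
    s = ell**2 + v1 + v2
    return ell / np.sqrt(s) * np.exp(-0.5 * (m1 - m2) ** 2 / s)

def mixture_inner(a, b, ell):
    """Inner product of the mean embeddings of two Gaussian mixtures.
    Each mixture is a tuple (weights, means, variances)."""
    wa, ma, va = (np.asarray(x, dtype=float) for x in a)
    wb, mb, vb = (np.asarray(x, dtype=float) for x in b)
    # Pairwise expected kernel values between all components.
    K = expected_rbf(ma[:, None], va[:, None], mb[None, :], vb[None, :], ell)
    return wa @ K @ wb

def mmd_squared(a, b, ell=1.0):
    """Squared MMD between two Gaussian mixtures under an RBF base kernel."""
    return (mixture_inner(a, a, ell)
            - 2.0 * mixture_inner(a, b, ell)
            + mixture_inner(b, b, ell))

def kernel_over_kernels(a, b, ell=1.0, scale=1.0):
    """Assumed form: exponentiated negative squared MMD.
    The clamp guards against tiny negative values from round-off."""
    return np.exp(-0.5 * max(mmd_squared(a, b, ell), 0.0) / scale**2)

# Two parameterisations of the same spectral density (components swapped),
# plus a genuinely different one.
sm_a = ([0.3, 0.7], [1.0, 2.0], [0.1, 0.2])
sm_a_permuted = ([0.7, 0.3], [2.0, 1.0], [0.2, 0.1])
sm_b = ([0.5, 0.5], [0.5, 3.0], [0.1, 0.1])

print(kernel_over_kernels(sm_a, sm_a_permuted))  # approximately 1: same kernel
print(kernel_over_kernels(sm_a, sm_b))           # strictly below 1
```

Because the MMD compares the spectral densities themselves rather than the raw hyperparameter vectors, two hyperparameter settings that describe the same SM kernel are treated as identical inputs by the surrogate model.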