Monotonicity and Double Descent in Uncertainty Estimation with Gaussian Processes
This work addresses uncertainty quantification in machine learning, revealing counterintuitive effects of input dimension that could affect how model reliability is assessed, though it builds incrementally on existing UQ research.
The paper proves that for Gaussian processes, tuning hyperparameters to maximize marginal likelihood yields performance, as measured by the marginal likelihood, that improves monotonically with input dimension, whereas cross-validation metrics show double descent behavior. Cold posteriors appear to exacerbate both effects, and the results are verified empirically on real data.
Despite their importance for assessing reliability of predictions, uncertainty quantification (UQ) measures for machine learning models have only recently begun to be rigorously characterized. One prominent issue is the curse of dimensionality: it is commonly believed that the marginal likelihood should be reminiscent of cross-validation metrics and that both should deteriorate with larger input dimensions. We prove that by tuning hyperparameters to maximize marginal likelihood (the empirical Bayes procedure), the performance, as measured by the marginal likelihood, improves monotonically with the input dimension. On the other hand, we prove that cross-validation metrics exhibit qualitatively different behavior that is characteristic of double descent. Cold posteriors, which have recently attracted interest due to their improved performance in certain settings, appear to exacerbate these phenomena. We verify empirically that our results hold for real data, beyond our considered assumptions, and we explore consequences involving synthetic covariates.
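For readers unfamiliar with the empirical Bayes procedure mentioned above, the following is a minimal sketch of what maximizing the marginal likelihood of a Gaussian process can look like in practice. It is not the paper's setup: the squared-exponential kernel, the zero prior mean, the grid search (standing in for a gradient-based optimizer), and all function and parameter names (`rbf_kernel`, `log_marginal_likelihood`, `empirical_bayes`, `lengthscale`, `noise`) are assumptions made purely for illustration.

```python
import numpy as np

def rbf_kernel(X, lengthscale):
    # Squared-exponential kernel matrix for the rows of X
    # (kernel choice is an assumption, not taken from the paper).
    sq = np.sum(X**2, axis=1)
    d2 = sq[:, None] + sq[None, :] - 2.0 * X @ X.T
    return np.exp(-0.5 * np.maximum(d2, 0.0) / lengthscale**2)

def log_marginal_likelihood(X, y, lengthscale, noise):
    # log N(y | 0, K + noise * I), evaluated through a Cholesky factor.
    n = X.shape[0]
    K = rbf_kernel(X, lengthscale) + noise * np.eye(n)
    L = np.linalg.cholesky(K)
    alpha = np.linalg.solve(L.T, np.linalg.solve(L, y))
    return (-0.5 * y @ alpha
            - np.sum(np.log(np.diag(L)))
            - 0.5 * n * np.log(2.0 * np.pi))

def empirical_bayes(X, y, lengthscales, noises):
    # Hypothetical helper: pick the hyperparameters on a grid that
    # maximize the log marginal likelihood.
    candidates = (
        (log_marginal_likelihood(X, y, l, s), l, s)
        for l in lengthscales
        for s in noises
    )
    return max(candidates)

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    X = rng.normal(size=(30, 5))   # 30 points, input dimension 5
    y = np.sin(X[:, 0]) + 0.1 * rng.normal(size=30)
    value, lengthscale, noise = empirical_bayes(
        X, y, lengthscales=[0.5, 1.0, 2.0, 4.0], noises=[0.01, 0.1, 1.0]
    )
    print(value, lengthscale, noise)
```

Repeating the selection while varying the number of columns in `X` is one way to observe how the optimized marginal likelihood changes with input dimension, though such an experiment on toy data would not by itself reproduce the paper's theoretical results.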