ME · CO · ML · Jan 8, 2021

Fast calculation of Gaussian Process multiple-fold cross-validation residuals and their covariances

arXiv:2101.03108v3 · 2 citations
Originality: Incremental advance
AI Analysis

This work gives practitioners using Gaussian Processes faster cross-validation-based model diagnostics and parameter fitting, illustrated on a contaminant localization test case.

This paper generalizes fast Gaussian process leave-one-out formulae to multiple-fold cross-validation, enabling faster calculation of cross-validation residuals and their covariances. Numerical experiments demonstrate substantial speed-ups compared to a naive implementation, although the speed-ups decline steeply as the number of folds decreases.

We generalize fast Gaussian process leave-one-out formulae to multiple-fold cross-validation, highlighting in turn the covariance structure of cross-validation residuals in both Simple and Universal Kriging frameworks. We illustrate how the resulting covariances affect model diagnostics. We further establish, in the case of noiseless observations, that correcting for covariances between residuals in cross-validation-based estimation of the scale parameter leads back to MLE. Also, we highlight in broader settings how differences between pseudo-likelihood and likelihood methods boil down to whether or not residual covariances are accounted for. The proposed fast calculation of cross-validation residuals is implemented and benchmarked against a naive implementation. Numerical experiments highlight the accuracy and substantial speed-ups that our approach enables. However, as supported by a discussion of the main drivers of computational costs and by a numerical benchmark, speed-ups steeply decline as the number of folds (say, all sharing the same size) decreases. An application to a contaminant localization test case illustrates that grouping clustered observations in folds may help improve model assessment and parameter fitting compared to Leave-One-Out. Overall, our results enable fast multiple-fold cross-validation, have direct consequences in model diagnostics, and pave the way to future work on hyperparameter fitting and on the promising field of goal-oriented fold design.
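
To make the idea in the abstract concrete, here is a minimal sketch of fast multiple-fold cross-validation residuals in the Simple Kriging case (zero-mean GP with known covariance matrix K). It is not the authors' implementation. It relies on the standard block identity e_f = (Q_ff)^{-1} (Q y)_f with Q = K^{-1}, which also gives Cov(e_f, e_g) = (Q_ff)^{-1} Q_fg (Q_gg)^{-1}. The function names, the exponential kernel, the fold layout, and the test data are all assumptions chosen for illustration.

```python
import numpy as np

def fast_cv_residuals(K, y, folds):
    """Fold-wise CV residuals and their joint covariance (Simple Kriging sketch).

    Assumes a zero-mean GP with covariance matrix K and folds that
    partition the indices 0..n-1. Names are hypothetical.
    """
    n = len(y)
    Q = np.linalg.inv(K)                      # precision matrix, computed once
    D_inv = np.zeros((n, n))                  # block-diagonal matrix of Q_ff^{-1}
    for f in folds:
        D_inv[np.ix_(f, f)] = np.linalg.inv(Q[np.ix_(f, f)])
    residuals = D_inv @ (Q @ y)               # e_f = Q_ff^{-1} (Q y)_f
    covariance = D_inv @ Q @ D_inv            # Cov(e_f, e_g) = Q_ff^{-1} Q_fg Q_gg^{-1}
    return residuals, covariance

def naive_cv_residuals(K, y, folds):
    """Reference version: refit the kriging predictor once per fold."""
    n = len(y)
    residuals = np.empty(n)
    for f in folds:
        train = np.setdiff1d(np.arange(n), f)
        # Kriging weights solve K_tt W = K_tf, so the prediction is W^T y_t.
        weights = np.linalg.solve(K[np.ix_(train, train)], K[np.ix_(train, f)])
        residuals[f] = y[f] - weights.T @ y[train]
    return residuals

# Illustrative data (assumed): 12 points on [0, 1], exponential kernel, 4 folds of 3.
x = np.linspace(0.0, 1.0, 12)
K = np.exp(-np.abs(x[:, None] - x[None, :]) / 0.3)
y = np.random.default_rng(0).standard_normal(12)
folds = np.array_split(np.arange(12), 4)

e_fast, cov_fast = fast_cv_residuals(K, y, folds)
e_naive = naive_cv_residuals(K, y, folds)
print(np.allclose(e_fast, e_naive))
```

In this sketch the fast version inverts K once and then inverts one small block per fold, whereas the naive version solves a near-full-size system for every fold. With fewer, larger folds the per-fold blocks grow and the naive loop shortens, which is in line with the abstract's remark that speed-ups decline as the number of folds decreases. The off-diagonal blocks of `cov_fast` are the residual covariances that the abstract says matter for model diagnostics.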
