ML LG ME Nov 15, 2022

Provably Reliable Large-Scale Sampling from Gaussian Processes

Anthony Stephenson, Robert Allison, Edward Pyzer-Knapp

arXiv:2211.08036v35.32 citations h-index: 23 Has Code

Originality: Incremental advance

AI Analysis

The work enables efficient benchmarking of GP approximations, but the advance is incremental: it improves synthetic data generation rather than core GP methods.

The paper tackles the problem of generating large-scale synthetic datasets from Gaussian processes (GPs) for evaluating approximate methods, achieving scalability to large n while providing provable guarantees that samples are indistinguishable from the desired GP.

When comparing approximate Gaussian process (GP) models, it can be helpful to be able to generate data from any GP. If we are interested in how approximate methods perform at scale, we may wish to generate very large synthetic datasets to evaluate them. Naïvely doing so would cost \(\mathcal{O}(n^3)\) flops and \(\mathcal{O}(n^2)\) memory to generate a size \(n\) sample. We demonstrate how to scale such data generation to large \(n\) whilst still providing guarantees that, with high probability, the sample is indistinguishable from a sample from the desired GP.
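To make the cost of the naïve approach concrete, here is a minimal sketch of exact GP sampling via a Cholesky factorisation. It illustrates only the baseline that the abstract describes as expensive, not the authors' scalable method. The squared-exponential kernel, the noise variance, and the names `rbf_kernel` and `naive_gp_sample` are assumptions chosen for illustration.

```python
import numpy as np


def rbf_kernel(x1, x2, lengthscale=1.0, variance=1.0):
    """Squared-exponential kernel (an assumed choice; any valid kernel works)."""
    sq_dists = np.sum((x1[:, None, :] - x2[None, :, :]) ** 2, axis=-1)
    return variance * np.exp(-0.5 * sq_dists / lengthscale**2)


def naive_gp_sample(x, noise_variance=0.1, seed=0):
    """Draw one sample y ~ N(0, K + noise_variance * I) at inputs x of shape (n, d)."""
    rng = np.random.default_rng(seed)
    n = x.shape[0]
    # Building the full covariance matrix needs O(n^2) memory.
    cov = rbf_kernel(x, x) + noise_variance * np.eye(n)
    # The Cholesky factorisation costs O(n^3) flops; this is the bottleneck.
    chol = np.linalg.cholesky(cov)
    # If z ~ N(0, I), then chol @ z has covariance chol @ chol.T = cov.
    return chol @ rng.standard_normal(n)


if __name__ == "__main__":
    x = np.linspace(0.0, 1.0, 500)[:, None]
    y = naive_gp_sample(x)
    print(y.shape)
```

Doubling \(n\) in this sketch roughly quadruples the memory for `cov` and multiplies the factorisation time by about eight, which is why this route becomes impractical for very large synthetic datasets.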
