ML | LG | Jan 30, 2022

Scaling Gaussian Process Optimization by Evaluating a Few Unique Candidates Multiple Times

arXiv:2201.12909v1 | 19 citations
Originality: Incremental advance
AI Analysis

This is an incremental improvement for real-world applications of GP-Opt with high switch costs, such as wet labs or hyperparameter optimization.

The paper tackles the computational bottleneck of Gaussian process optimization (GP-Opt) by showing that evaluating a few unique candidates multiple times reduces the number of unique historical points, enabling efficient exact posterior computation. This approach preserves theoretical guarantees while improving runtime, memory complexity, and parallel evaluation capabilities.

Computing a Gaussian process (GP) posterior has a computational cost cubic in the number of historical points. A reformulation of the same GP posterior highlights that this complexity mainly depends on how many unique historical points are considered. This can have important implications in active learning settings, where the set of historical points is constructed sequentially by the learner. We show that sequential black-box optimization based on GPs (GP-Opt) can be made efficient by sticking to a candidate solution for multiple evaluation steps and switching only when necessary. Limiting the number of switches also limits the number of unique points in the history of the GP. Thus, the efficient GP reformulation can be used to exactly and cheaply compute the posteriors required to run the GP-Opt algorithms. This approach is especially useful in real-world applications of GP-Opt with high switch costs (e.g., switching chemicals in wet labs, data/model loading in hyperparameter optimization). As examples of this meta-approach, we modify two well-established GP-Opt algorithms, GP-UCB and GP-EI, to switch candidates as infrequently as possible, adapting rules from batched GP-Opt. These versions preserve all the theoretical no-regret guarantees while improving practical aspects of the algorithms such as runtime, memory complexity, and the ability to batch candidates and evaluate them in parallel.
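
The claim that the posterior cost depends on the number of unique points can be illustrated with a small sketch. It relies on a standard identity for GPs with i.i.d. Gaussian noise: n repeated observations at one point carry the same information as their average observed once with noise variance divided by n. This is not necessarily the exact reformulation used in the paper; the RBF kernel, the lengthscale, and the function names `rbf`, `posterior_naive`, and `posterior_unique` are assumptions made for illustration.

```python
import numpy as np

def rbf(a, b, lengthscale=0.5):
    # Squared-exponential kernel between the rows of a and b (assumed kernel).
    d2 = ((a[:, None, :] - b[None, :, :]) ** 2).sum(-1)
    return np.exp(-0.5 * d2 / lengthscale**2)

def posterior_naive(X, y, Xq, noise_var):
    # Standard GP posterior: solves a t x t system, t = number of evaluations.
    K = rbf(X, X) + noise_var * np.eye(len(X))
    Kq = rbf(Xq, X)
    mean = Kq @ np.linalg.solve(K, y)
    # Prior variance is 1 because rbf(x, x) = 1.
    var = 1.0 - np.einsum("ij,ji->i", Kq, np.linalg.solve(K, Kq.T))
    return mean, var

def posterior_unique(X, y, Xq, noise_var):
    # Collapse repeated rows into unique points, counts, and averaged observations.
    U, inv, counts = np.unique(X, axis=0, return_inverse=True, return_counts=True)
    inv = np.ravel(inv)  # guards against shape differences across NumPy versions
    ybar = np.bincount(inv, weights=y, minlength=len(U)) / counts
    # Averaging n noisy observations divides the noise variance by n.
    K = rbf(U, U) + np.diag(noise_var / counts)
    Kq = rbf(Xq, U)
    mean = Kq @ np.linalg.solve(K, ybar)
    var = 1.0 - np.einsum("ij,ji->i", Kq, np.linalg.solve(K, Kq.T))
    return mean, var

# Toy history: 4 unique candidates evaluated 19 times in total.
rng = np.random.default_rng(0)
candidates = rng.uniform(size=(4, 2))
X = np.repeat(candidates, [5, 1, 10, 3], axis=0)
y = np.sin(3 * X.sum(axis=1)) + 0.1 * rng.standard_normal(len(X))
Xq = rng.uniform(size=(6, 2))

m_naive, v_naive = posterior_naive(X, y, Xq, noise_var=0.01)
m_uniq, v_uniq = posterior_unique(X, y, Xq, noise_var=0.01)
print(np.allclose(m_naive, m_uniq), np.allclose(v_naive, v_uniq))
```

In this toy setup the naive version solves a 19 x 19 system while the collapsed version solves a 4 x 4 one, and both return the same posterior mean and variance up to floating-point error. The fewer the switches between candidates, the smaller the collapsed system stays.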

