LG · ML · Mar 27, 2020

Incorporating Expert Prior in Bayesian Optimisation via Space Warping

arXiv:2003.12250v1 · 45 citations
AI Analysis

The paper addresses the cost of expensive function evaluations in optimization, a concern for both researchers and practitioners. The contribution is incremental, as it builds on existing Bayesian optimization methods.

The paper tackles the cold-start phase of Bayesian optimization in large search spaces. It encodes expert prior knowledge about the optimum as a distribution and uses that distribution to warp the search space, expanding high-probability regions and shrinking low-probability ones. The authors report better performance than standard Bayesian optimization on benchmark functions and on hyperparameter tuning for SVM and Random Forest.

Bayesian optimisation is a well-known sample-efficient method for the optimisation of expensive black-box functions. However, when dealing with large search spaces, the algorithm passes through several regions of low function value before reaching the optimum of the function. Since function evaluations are expensive in terms of both money and time, it is desirable to alleviate this problem. One approach to shortening this cold start phase is to use prior knowledge that can accelerate the optimisation. In its standard form, Bayesian optimisation assumes that every point in the search space is equally likely to be the optimum. Therefore, any prior knowledge that provides information about the optimum of the function would improve the optimisation performance. In this paper, we represent the prior knowledge about the function optimum through a prior distribution. The prior distribution is then used to warp the search space in such a way that the space expands around the high-probability region of the function optimum and shrinks around the low-probability region. We incorporate this prior directly in the function model (Gaussian process) by redefining the kernel matrix, which allows the method to work with any acquisition function, i.e. it is an acquisition-agnostic approach. We show the superiority of our method over the standard Bayesian optimisation method through the optimisation of several benchmark functions and the hyperparameter tuning of two algorithms: Support Vector Machine (SVM) and Random Forest.
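
To make the idea of warping the search space through the kernel more concrete, here is a minimal one-dimensional sketch. It is not the authors' formulation: the abstract does not give the warping function, so this sketch assumes a Gaussian prior over the optimum location and warps each input through that prior's CDF before applying a standard RBF kernel. The names `prior_cdf`, `warp`, `rbf` and `warped_kernel_matrix`, and all parameter values, are hypothetical.

```python
import math

def prior_cdf(x, mean, std):
    # CDF of an assumed Gaussian prior over the location of the optimum.
    return 0.5 * (1.0 + math.erf((x - mean) / (std * math.sqrt(2.0))))

def warp(x, mean, std):
    # Assumed warping: map x through the prior CDF. The slope of the CDF is
    # the prior density, so distances are stretched where the prior is high
    # and compressed where it is low.
    return prior_cdf(x, mean, std)

def rbf(a, b, lengthscale):
    # Standard squared-exponential kernel on scalar inputs.
    return math.exp(-0.5 * ((a - b) / lengthscale) ** 2)

def warped_kernel_matrix(xs, mean, std, lengthscale=0.1):
    # Kernel matrix computed on warped inputs instead of the raw inputs.
    ws = [warp(x, mean, std) for x in xs]
    return [[rbf(wi, wj, lengthscale) for wj in ws] for wi in ws]

# Two points far from the prior mean (0.0, 0.1) and two near it (0.5, 0.6).
xs = [0.0, 0.1, 0.5, 0.6]
K = warped_kernel_matrix(xs, mean=0.5, std=0.1)

# Same raw spacing (0.1) in both pairs, but very different correlations.
low_prob_corr = K[0][1]   # pair in the low-probability region
high_prob_corr = K[2][3]  # pair in the high-probability region
```

Under these assumptions, the pair in the low-probability region ends up almost perfectly correlated (the region has been shrunk, so the model treats it as nearly a single point), while the pair near the prior mean is almost uncorrelated (the region has been expanded, so the model can resolve finer structure there). Because only the kernel matrix changes, any acquisition function can be applied on top of the resulting Gaussian process without modification, which matches the acquisition-agnostic property described in the abstract.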

