Language Model Embeddings Can Be Sufficient for Bayesian Optimization
Representing inputs as strings enables more flexible, general-purpose Bayesian Optimization across diverse domains (synthetic, combinatorial, and hyperparameter optimization), which benefits researchers and practitioners in experimental design and black-box optimization.
The paper addresses Bayesian Optimization's reliance on regression models that require fixed search spaces and structured inputs. It instead uses LLM embeddings of string inputs for in-context regression, and achieves optimization performance comparable to state-of-the-art Gaussian Process-based methods such as Google Vizier.
Bayesian Optimization is ubiquitous in experimental design and black-box optimization for improving search efficiency. However, most existing approaches rely on regression models which are limited to fixed search spaces and structured, tabular input features. This paper explores the use of LLM embeddings over string inputs for in-context regression in Bayesian Optimization. Our results show that representing inputs as strings enables general-purpose regression across diverse domains, including synthetic, combinatorial, and hyperparameter optimization. Furthermore, our approach achieves optimization performance comparable to state-of-the-art Gaussian Process-based methods such as Google Vizier, and demonstrates potential for broader and more flexible applications.
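To make the idea of regressing over string inputs concrete, the following is a minimal sketch of such an optimization loop, not the paper's implementation. Every name in it is hypothetical. Two pieces are deliberately swapped in so the example runs with the standard library alone: `embed` hashes character trigrams as a stand-in for a real LLM embedding, and `predict` is a simple kernel-weighted regressor with an ad hoc uncertainty proxy rather than the in-context regressor the abstract describes. The candidate pool, the toy objective, and the upper-confidence-bound selection rule are likewise illustrative assumptions.

```python
import hashlib
import itertools
import math


def serialize(params):
    """Render a parameter dict as a string, e.g. 'layers:2,lr:0.1,opt:adam'."""
    return ",".join(f"{k}:{params[k]}" for k in sorted(params))


def embed(text, dim=64):
    """Stand-in for an LLM embedding (assumption): hashed character
    trigram counts, L2-normalized. A real system would call a model here."""
    vec = [0.0] * dim
    padded = f"  {text} "
    for i in range(len(padded) - 2):
        digest = hashlib.md5(padded[i:i + 3].encode("utf-8")).digest()
        vec[int.from_bytes(digest[:4], "big") % dim] += 1.0
    norm = math.sqrt(sum(v * v for v in vec)) or 1.0
    return [v / norm for v in vec]


def predict(query, xs, ys, length_scale=0.5):
    """Kernel-weighted mean of observed values, plus an uncertainty proxy
    that shrinks as the query gets closer to already-evaluated points."""
    weights = [
        math.exp(-sum((a - b) ** 2 for a, b in zip(query, x))
                 / (2 * length_scale ** 2))
        for x in xs
    ]
    total = sum(weights)
    mean = sum(w * y for w, y in zip(weights, ys)) / total if total > 0 else 0.0
    return mean, 1.0 / (1.0 + total)


def optimize(candidates, objective, budget=10, beta=2.0):
    """Repeatedly evaluate the unevaluated candidate with the highest
    upper confidence bound (mean + beta * uncertainty)."""
    embeddings = [embed(serialize(c)) for c in candidates]
    seen_x, seen_y, history = [], [], []
    remaining = list(range(len(candidates)))
    for _ in range(min(budget, len(candidates))):
        def ucb(i):
            mean, uncertainty = predict(embeddings[i], seen_x, seen_y)
            return mean + beta * uncertainty

        best = max(remaining, key=ucb)
        remaining.remove(best)
        value = objective(candidates[best])
        seen_x.append(embeddings[best])
        seen_y.append(value)
        history.append((candidates[best], value))
    return history


def toy_objective(p):
    """Made-up hyperparameter response surface, for illustration only."""
    bonus = 0.5 if p["opt"] == "adam" else 0.0
    return bonus - (math.log10(p["lr"]) + 2) ** 2 - 0.1 * abs(p["layers"] - 3)


# Mixed categorical and numeric parameters, all handled through one string format.
candidates = [
    {"opt": o, "lr": lr, "layers": n}
    for o, lr, n in itertools.product(["adam", "sgd"], [0.1, 0.01, 0.001], [1, 3, 5])
]
history = optimize(candidates, toy_objective, budget=8)
best_params, best_value = max(history, key=lambda item: item[1])
```

Because the regressor only ever sees `embed(serialize(params))`, the same loop would accept any search space that can be written as text, which is the flexibility the abstract points to. How well the loop optimizes depends on the quality of the embedding, so the hashed-trigram stand-in should not be read as indicative of the reported results.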