LG | May 28, 2025

BOFormer: Learning to Solve Multi-Objective Bayesian Optimization via Non-Markovian RL

NVIDIA
arXiv:2505.21974v2 | 7 citations | h-index: 14 | ICLR
Originality: Highly original
AI Analysis

BOFormer addresses a key bottleneck in MOBO for applications such as hyperparameter optimization, offering a novel way to improve efficiency.

The paper tackles the hypervolume identifiability issue in multi-objective Bayesian optimization (MOBO) by proposing BOFormer, a non-Markovian reinforcement learning framework using Transformers, which consistently outperforms benchmark algorithms in synthetic and real-world problems.

Bayesian optimization (BO) offers an efficient pipeline for optimizing black-box functions with the help of a Gaussian process prior and an acquisition function (AF). Recently, in the context of single-objective BO, learning-based AFs have shown promising empirical results given their favorable non-myopic nature. Despite this, the direct extension of these approaches to multi-objective Bayesian optimization (MOBO) suffers from the hypervolume identifiability issue, which results from the non-Markovian nature of MOBO problems. To tackle this, inspired by the non-Markovian RL literature and the success of Transformers in language modeling, we present a generalized deep Q-learning framework and propose BOFormer, which substantiates this framework for MOBO via sequence modeling. Through extensive evaluation, we demonstrate that BOFormer consistently outperforms the benchmark rule-based and learning-based algorithms in various synthetic MOBO and real-world multi-objective hyperparameter optimization problems. We have made the source code publicly available to encourage further research in this direction.
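
To make the history dependence behind the hypervolume identifiability issue concrete, the following is a minimal sketch, not the paper's implementation. It assumes two objectives, both maximized, and a reference point at the origin. The helper names `hypervolume_2d` and `hv_gain` are hypothetical and introduced only for illustration. It shows one reading of the issue: the hypervolume gain from evaluating a candidate depends on the points observed so far, not on the candidate alone.

```python
def hypervolume_2d(points, ref=(0.0, 0.0)):
    """Area dominated by `points` relative to `ref` (both objectives maximized)."""
    # Keep only points that improve on the reference in both objectives,
    # then sweep from the largest first objective to the smallest.
    pts = sorted(
        (p for p in points if p[0] > ref[0] and p[1] > ref[1]),
        key=lambda p: p[0],
        reverse=True,
    )
    area, best_y = 0.0, ref[1]
    for x, y in pts:
        if y > best_y:
            # Add the horizontal strip that this point newly dominates.
            area += (x - ref[0]) * (y - best_y)
            best_y = y
    return area


def hv_gain(history, candidate, ref=(0.0, 0.0)):
    """Hypervolume improvement from adding `candidate` to `history`."""
    return hypervolume_2d(history + [candidate], ref) - hypervolume_2d(history, ref)


# The same candidate yields different gains under different histories.
candidate = (2.0, 2.0)
print(hv_gain([(1.0, 3.0)], candidate))  # 2.0: candidate extends the front
print(hv_gain([(3.0, 3.0)], candidate))  # 0.0: candidate is already dominated
```

Because a reward of this kind depends on the whole set of past observations, a policy that sees only the most recent observation cannot tell these two cases apart, which is consistent with the abstract's motivation for sequence modeling over the query history.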

