Feb 27, 2024

Reinforced In-Context Black-Box Optimization

arXiv:2402.17423v3 · 12 citations · h-index: 13 · IJCAI
Originality: Incremental advance
AI Analysis

This work addresses the need for flexible, labor-saving optimization methods in fields such as science and engineering, though it builds incrementally on existing meta-learning approaches.

The paper tackles the problem of automating black-box optimization by learning entire algorithms from data. It proposes RIBBO, which uses sequence models and regret-to-go tokens to generate query points, and reports strong performance across diverse benchmarks.

Black-Box Optimization (BBO) has found successful applications in many fields of science and engineering. Recently, there has been a growing interest in meta-learning particular components of BBO algorithms to speed up optimization and get rid of tedious hand-crafted heuristics. As an extension, learning the entire algorithm from data requires the least labor from experts and can provide the most flexibility. In this paper, we propose RIBBO, a method to reinforce-learn a BBO algorithm from offline data in an end-to-end fashion. RIBBO employs expressive sequence models to learn the optimization histories produced by multiple behavior algorithms and tasks, leveraging the in-context learning ability of large models to extract task information and make decisions accordingly. Central to our method is to augment the optimization histories with regret-to-go tokens, which are designed to represent the performance of an algorithm based on cumulative regret over the future part of the histories. The integration of regret-to-go tokens enables RIBBO to automatically generate sequences of query points that satisfy the user-desired regret, which is verified by its universally good empirical performance on diverse problems, including BBO benchmark functions, hyper-parameter optimization and robot control problems.
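
The regret-to-go idea in the abstract can be illustrated with a small sketch. This is one plausible reading of "cumulative regret over the future part of the histories", not the authors' implementation: it assumes a maximization task, a known optimum `y_star`, and that the token at step t sums the regret of the steps after t. The function name `regret_to_go` is hypothetical.

```python
from typing import List, Sequence


def regret_to_go(ys: Sequence[float], y_star: float) -> List[float]:
    """Return one token per step: the summed regret of all later steps.

    Assumptions (not from the source): maximization, known optimum y_star,
    and instantaneous regret defined as y_star - y.
    """
    tokens = [0.0] * len(ys)
    running = 0.0  # regret accumulated over the steps after the current one
    # Walk the history backwards so each token sees only future regret.
    for t in range(len(ys) - 1, -1, -1):
        tokens[t] = running
        running += y_star - ys[t]
    return tokens


# A three-step history whose observed values approach the optimum of 5.0.
print(regret_to_go([1.0, 3.0, 4.0], y_star=5.0))  # [3.0, 1.0, 0.0]
```

Under this reading, the token reaches zero at the end of a history, and conditioning a sequence model on a small initial token asks it for query points that incur little regret over the remaining steps.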

Code Implementations: 1 repo
