ML | LG | Jun 9, 2025

Constrained Pareto Set Identification with Bandit Feedback

arXiv:2506.08127v1 | 1 citation | h-index: 3 | ICML
Originality: Highly original
AI Analysis

This paper addresses the challenge of efficiently identifying optimal arms in bandit problems with multiple objectives and feasibility constraints. The contribution is incremental in scope but improves upon prior approaches.

The paper tackles the problem of identifying the Pareto Set under feasibility constraints in a multivariate bandit setting. It introduces a fixed-confidence algorithm that significantly outperforms racing-like algorithms and the two-stage approach, and it proves an information-theoretic lower bound showing that the algorithm's sample complexity is near-optimal.

In this paper, we address the problem of identifying the Pareto Set under feasibility constraints in a multivariate bandit setting. Specifically, given a $K$-armed bandit with unknown means $\mu_1, \dots, \mu_K \in \mathbb{R}^d$, the goal is to identify the set of arms whose mean is not uniformly worse than that of another arm (i.e., not smaller for all objectives), while satisfying some known set of linear constraints, expressing, for example, some minimal performance on each objective. Our focus lies in fixed-confidence identification, for which we introduce an algorithm that significantly outperforms racing-like algorithms and the intuitive two-stage approach that first identifies feasible arms and then their Pareto Set. We further prove an information-theoretic lower bound on the sample complexity of any algorithm for constrained Pareto Set identification, showing that the sample complexity of our approach is near-optimal. Our theoretical results are supported by an extensive empirical evaluation on a series of benchmarks.
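To make the target of identification concrete, here is a minimal sketch of the constrained Pareto Set when the means are known exactly. It is not the paper's algorithm, which has to estimate the means from noisy samples. The function name `constrained_pareto_set`, the constraint form $A\mu_i \ge b$, and the convention that an arm is dominated when it is strictly smaller on every objective are assumptions chosen for illustration; the paper's exact definitions may differ.

```python
import numpy as np

def constrained_pareto_set(means, A, b):
    """Return indices of feasible arms that no other feasible arm dominates.

    Assumptions (illustrative, not taken from the paper):
      - an arm i is feasible when A @ mu_i >= b holds componentwise;
      - arm i is dominated by arm j when mu_i < mu_j on every objective.
    """
    means = np.asarray(means, dtype=float)
    A = np.asarray(A, dtype=float)
    b = np.asarray(b, dtype=float)

    # Step 1: keep only arms that satisfy the linear constraints.
    feasible = [i for i in range(len(means)) if np.all(A @ means[i] >= b)]

    # Step 2: among feasible arms, drop those dominated by another feasible arm.
    pareto = []
    for i in feasible:
        dominated = any(
            np.all(means[i] < means[j]) for j in feasible if j != i
        )
        if not dominated:
            pareto.append(i)
    return pareto

# Five arms with two objectives; require at least 0.1 on each objective.
means = [[0.9, 0.2], [0.5, 0.5], [0.4, 0.4], [0.15, 0.95], [0.95, 0.05]]
A = np.eye(2)
b = np.array([0.1, 0.1])
print(constrained_pareto_set(means, A, b))  # [0, 1, 3]
```

Arm 4 is not dominated by any other arm but violates the second constraint, so it is excluded. Arm 2 is feasible but dominated by arm 1. In the bandit setting the means are unknown, so the same set must be recovered from samples with a prescribed confidence level.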

