Bandit Pareto Set Identification: the Fixed Budget Setting
This work addresses the Pareto set identification problem for decision-making under uncertainty, offering a novel algorithmic approach with theoretical guarantees, though it is incremental in extending bandit methods to multi-objective settings.
The paper tackles the problem of identifying the Pareto optimal set in multi-objective bandits under a fixed budget constraint, proposing the Empirical Gap Elimination algorithm family, with instances EGE-SR and EGE-SH shown to achieve exponentially decaying error probabilities supported by information-theoretic bounds.
We study a multi-objective pure exploration problem in a multi-armed bandit model. Each arm is associated to an unknown multi-variate distribution and the goal is to identify the distributions whose mean is not uniformly worse than that of another distribution: the Pareto optimal set. We propose and analyze the first algorithms for the \emph{fixed budget} Pareto Set Identification task. We propose Empirical Gap Elimination, a family of algorithms combining a careful estimation of the ``hardness to classify'' each arm in or out of the Pareto set with a generic elimination scheme. We prove that two particular instances, EGE-SR and EGE-SH, have a probability of error that decays exponentially fast with the budget, with an exponent supported by an information theoretic lower-bound. We complement these findings with an empirical study using real-world and synthetic datasets, which showcase the good performance of our algorithms.