GLISp-r: A preference-based optimization algorithm with convergence guarantees
This work addresses optimization problems in which no explicit objective function is available and the search must rely on human preferences. The contribution is incremental, as it builds on the existing GLISp method.
The authors tackle preference-based optimization, in which a decision-maker iteratively compares pairs of tunings and the goal is to find the most preferred calibration with as few comparisons as possible. They propose GLISp-r, an extension of GLISp with a new candidate selection criterion inspired by MSRS. The results indicate that GLISp-r is less likely than GLISp to get stuck in local optima, a claim supported by a proof of global convergence and by empirical comparisons on benchmark problems.
Preference-based optimization algorithms are iterative procedures that seek the optimal calibration of a decision vector based only on comparisons between pairs of different tunings. At each iteration, a human decision-maker expresses a preference between two calibrations (samples), highlighting which one, if any, is better than the other. The optimization procedure must use the observed preferences to find the tuning of the decision vector that is most preferred by the decision-maker, while also minimizing the number of comparisons. In this work, we formulate the preference-based optimization problem from a utility theory perspective. Then, we propose GLISp-r, an extension of a recent preference-based optimization procedure called GLISp. The latter uses a Radial Basis Function surrogate to describe the tastes of the decision-maker. Iteratively, GLISp proposes new samples to compare with the best calibration available by trading off exploitation of the surrogate model and exploration of the decision space. In GLISp-r, we propose a different criterion for selecting new candidate samples, inspired by MSRS, a popular procedure in the black-box optimization framework. Compared to GLISp, GLISp-r is less likely to get stuck in local optima of the preference-based optimization problem. We motivate this claim theoretically, with a proof of global convergence, and empirically, by comparing the performance of GLISp and GLISp-r on several benchmark optimization problems.
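
The two ingredients described above, a pairwise preference query and a candidate selection rule that trades off exploitation against exploration, can be illustrated with a minimal Python sketch. This is not the authors' implementation. The names `preference`, `minmax`, `select_candidate`, and the weight `delta` are hypothetical. The decision-maker is simulated through a latent function `f` (lower is better), the surrogate values are assumed to be given rather than fitted with a Radial Basis Function model, and exploration is measured by the distance to the nearest existing sample, which is an assumption made here in the spirit of MSRS-type rules and not necessarily the exploration term used by GLISp-r.

```python
import numpy as np


def preference(f, x1, x2, tol=1e-9):
    """Simulated decision-maker (assumption: tastes follow a latent f, lower is better).

    Returns -1 if x1 is preferred, +1 if x2 is preferred, 0 if neither is.
    """
    diff = f(x1) - f(x2)
    if abs(diff) <= tol:
        return 0
    return -1 if diff < 0 else 1


def minmax(values):
    """Rescale an array to [0, 1]; a constant array maps to all zeros."""
    values = np.asarray(values, dtype=float)
    span = values.max() - values.min()
    if span == 0:
        return np.zeros_like(values)
    return (values - values.min()) / span


def select_candidate(candidates, surrogate_vals, samples, delta):
    """Pick the candidate minimizing a weighted, rescaled score.

    delta = 1 is pure exploitation (lowest surrogate value);
    delta = 0 is pure exploration (farthest from the existing samples).
    """
    candidates = np.asarray(candidates, dtype=float)
    samples = np.asarray(samples, dtype=float)
    # Distance from every candidate to its nearest existing sample.
    pairwise = np.linalg.norm(candidates[:, None, :] - samples[None, :, :], axis=2)
    nearest = pairwise.min(axis=1)
    # Low surrogate value and large distance both lower the score.
    score = delta * minmax(surrogate_vals) + (1.0 - delta) * (1.0 - minmax(nearest))
    return candidates[np.argmin(score)]


# Usage on a one-dimensional toy problem.
samples = [[0.0], [1.0]]                 # calibrations already compared
candidates = [[0.1], [0.5], [0.9]]       # randomly generated in practice
surrogate_vals = [0.0, 1.0, 0.5]         # assumed surrogate predictions
exploit = select_candidate(candidates, surrogate_vals, samples, delta=1.0)
explore = select_candidate(candidates, surrogate_vals, samples, delta=0.0)
```

Rescaling both terms to [0, 1] before weighting makes the surrogate and the distance comparable, so the weight `delta` has a consistent meaning from one iteration to the next. Varying `delta` across iterations is one way to keep some iterations focused on exploring unsampled regions.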