LG AIFeb 8, 2016

Decoy Bandits Dueling on a Poset

arXiv:1602.02706v21.0

Originality Incremental advance

AI Analysis

This work addresses the challenge of decision-making under uncertainty in structured settings like rankings or preferences, offering incremental improvements to existing dueling bandit methods.

The paper tackles the problem of dueling bandits on partially ordered sets where arms may be incomparable and multiple optimal arms exist, proposing two algorithms: UnchainedBandits that finds optimal arms without distinguishing comparability, and SlicingBandits that uses incomparability information for significant performance gains, with theoretical guarantees and experimental evaluation provided.

We adress the problem of dueling bandits defined on partially ordered sets, or posets. In this setting, arms may not be comparable, and there may be several (incomparable) optimal arms. We propose an algorithm, UnchainedBandits, that efficiently finds the set of optimal arms of any poset even when pairs of comparable arms cannot be distinguished from pairs of incomparable arms, with a set of minimal assumptions. This algorithm relies on the concept of decoys, which stems from social psychology. For the easier case where the incomparability information may be accessible, we propose a second algorithm, SlicingBandits, which takes advantage of this information and achieves a very significant gain of performance compared to UnchainedBandits. We provide theoretical guarantees and experimental evaluation for both algorithms.

View on arXiv PDF

Similar