LG AIJun 16, 2017

Structured Best Arm Identification with Fixed Confidence

Ruitong Huang, Mohammad M. Ajallooeian, Csaba Szepesvári, Martin Müller

arXiv:1706.05198v210.928 citations

Originality Incremental advance

AI Analysis

This work addresses structured decision-making in AI, particularly for general minimax games, offering incremental improvements by extending analysis to non-uniform depths and transpositions.

The paper tackles the problem of identifying the best action in structured settings like minimax game search by introducing an abstract framework and the LUCB-micro algorithm, providing lower and upper sample complexity bounds that generalize previous limited results.

We study the problem of identifying the best action among a set of possible options when the value of each action is given by a mapping from a number of noisy micro-observables in the so-called fixed confidence setting. Our main motivation is the application to the minimax game search, which has been a major topic of interest in artificial intelligence. In this paper we introduce an abstract setting to clearly describe the essential properties of the problem. While previous work only considered a two-move game tree search problem, our abstract setting can be applied to the general minimax games where the depth can be non-uniform and arbitrary, and transpositions are allowed. We introduce a new algorithm (LUCB-micro) for the abstract setting, and give its lower and upper sample complexity results. Our bounds recover some previous results, which were only available in more limited settings, while they also shed further light on how the structure of minimax problems influence sample complexity.

View on arXiv PDF

Similar