LGAug 21, 2023

Cost-Efficient Online Decision Making: A Combinatorial Multi-Armed Bandit Approach

Arman Rahbar, Niklas Åkerblom, Morteza Haghir Chehreghani

arXiv:2308.10699v35.32 citationsh-index: 16Has Code

Originality Incremental advance

AI Analysis

This work addresses cost efficiency in online decision-making for applications where performing all tests is prohibitive, representing an incremental improvement with a novel formulation.

The paper tackles the problem of expensive online decision-making by formulating it as a combinatorial multi-armed bandit with test costs, resulting in a new framework that uses posterior sampling or BayesUCB for exploration and includes theoretical analysis and experimental validation.

Online decision making plays a crucial role in numerous real-world applications. In many scenarios, the decision is made based on performing a sequence of tests on the incoming data points. However, performing all tests can be expensive and is not always possible. In this paper, we provide a novel formulation of the online decision making problem based on combinatorial multi-armed bandits and take the (possibly stochastic) cost of performing tests into account. Based on this formulation, we provide a new framework for cost-efficient online decision making which can utilize posterior sampling or BayesUCB for exploration. We provide a theoretical analysis of Thompson Sampling for cost-efficient online decision making, and present various experimental results that demonstrate the applicability of our framework to real-world problems.

View on arXiv PDF Code

Similar