LG AISep 13, 2021

Machine Learning for Online Algorithm Selection under Censored Feedback

Alexander Tornede, Viktor Bengs, Eyke Hüllermeier

arXiv:2109.06234v15.54 citationsh-index: 57Has Code

Originality Incremental advance

AI Analysis

This work addresses a specific challenge in algorithm selection for decision problems like SAT, where runtime data is often incomplete due to time limits, making it incremental by adapting existing bandit methods to handle censored feedback.

The paper tackles the problem of online algorithm selection under censored feedback, where algorithms are stopped after a time limit, leading to incomplete runtime data. The authors adapt multi-armed bandit methods to handle this censored data and show that Thompson sampling-based approaches perform strongly, improving over existing methods in experiments on an adapted ASlib benchmark.

In online algorithm selection (OAS), instances of an algorithmic problem class are presented to an agent one after another, and the agent has to quickly select a presumably best algorithm from a fixed set of candidate algorithms. For decision problems such as satisfiability (SAT), quality typically refers to the algorithm's runtime. As the latter is known to exhibit a heavy-tail distribution, an algorithm is normally stopped when exceeding a predefined upper time limit. As a consequence, machine learning methods used to optimize an algorithm selection strategy in a data-driven manner need to deal with right-censored samples, a problem that has received little attention in the literature so far. In this work, we revisit multi-armed bandit algorithms for OAS and discuss their capability of dealing with the problem. Moreover, we adapt them towards runtime-oriented losses, allowing for partially censored data while keeping a space- and time-complexity independent of the time horizon. In an extensive experimental evaluation on an adapted version of the ASlib benchmark, we demonstrate that theoretically well-founded methods based on Thompson sampling perform specifically strong and improve in comparison to existing methods.

View on arXiv PDF Code

Similar