LinearAPT: An Adaptive Algorithm for the Fixed-Budget Thresholding Linear Bandit Problem
This work addresses a specific sequential decision-making problem in stochastic bandits, offering an incremental improvement for researchers and practitioners in optimization under resource constraints.
The paper tackled the Thresholding Linear Bandit problem by developing LinearAPT, an adaptive algorithm for fixed-budget settings, which achieved a theoretical upper bound for estimated loss and demonstrated robust performance on synthetic and real-world datasets.
In this study, we delve into the Thresholding Linear Bandit (TLB) problem, a nuanced domain within stochastic Multi-Armed Bandit (MAB) problems, focusing on maximizing decision accuracy against a linearly defined threshold under resource constraints. We present LinearAPT, a novel algorithm designed for the fixed budget setting of TLB, providing an efficient solution to optimize sequential decision-making. This algorithm not only offers a theoretical upper bound for estimated loss but also showcases robust performance on both synthetic and real-world datasets. Our contributions highlight the adaptability, simplicity, and computational efficiency of LinearAPT, making it a valuable addition to the toolkit for addressing complex sequential decision-making challenges.