LGAIMLMar 30, 2023

Contextual Combinatorial Bandits with Probabilistically Triggered Arms

UW
arXiv:2303.17110v328 citationsh-index: 64
Originality Incremental advance
AI Analysis

This work addresses a problem in online learning and decision-making for applications like recommendation systems, with incremental improvements over existing methods.

The paper tackles contextual combinatorial bandits with probabilistically triggered arms by proposing new algorithms and analysis to achieve improved regret bounds, such as removing an exponential factor and making bounds independent of batch-size, with experiments showing better performance on synthetic and real-world datasets.

We study contextual combinatorial bandits with probabilistically triggered arms (C$^2$MAB-T) under a variety of smoothness conditions that capture a wide range of applications, such as contextual cascading bandits and contextual influence maximization bandits. Under the triggering probability modulated (TPM) condition, we devise the C$^2$-UCB-T algorithm and propose a novel analysis that achieves an $\tilde{O}(d\sqrt{KT})$ regret bound, removing a potentially exponentially large factor $O(1/p_{\min})$, where $d$ is the dimension of contexts, $p_{\min}$ is the minimum positive probability that any arm can be triggered, and batch-size $K$ is the maximum number of arms that can be triggered per round. Under the variance modulated (VM) or triggering probability and variance modulated (TPVM) conditions, we propose a new variance-adaptive algorithm VAC$^2$-UCB and derive a regret bound $\tilde{O}(d\sqrt{T})$, which is independent of the batch-size $K$. As a valuable by-product, our analysis technique and variance-adaptive algorithm can be applied to the CMAB-T and C$^2$MAB setting, improving existing results there as well. We also include experiments that demonstrate the improved performance of our algorithms compared with benchmark algorithms on synthetic and real-world datasets.

Foundations

The foundational work for this paper's niche, ranked by how specifically the neighbourhood builds on it — not by global fame.

Your Notes