LG · ML · Nov 8, 2024

Variance-Aware Linear UCB with Deep Representation for Neural Contextual Bandits

arXiv:2411.05979v2 · 4 citations · h-index: 3 · AISTATS
AI Analysis

This work addresses the challenge of efficient decision-making in contextual bandits for applications like recommendation systems, though it appears incremental as it builds on existing neural-UCB methods.

The paper tackles the problem of balancing exploration and exploitation in neural contextual bandits by proposing Neural-σ²-LinearUCB, a variance-aware algorithm that improves uncertainty quantification. The result is a theoretically better regret guarantee than other neural-UCB algorithms and empirical outperformance of state-of-the-art techniques with lower regret across synthetic, UCI, MNIST, and CIFAR-10 datasets.

By leveraging the representation power of deep neural networks, neural upper confidence bound (UCB) algorithms have shown success in contextual bandits. To better balance exploration and exploitation, we propose Neural-$σ^2$-LinearUCB, a variance-aware algorithm that uses $σ^2_t$, an upper bound on the reward noise variance at round $t$, to improve the uncertainty quantification of the UCB and thereby reduce regret. We provide an oracle version of our algorithm, characterized by an oracle variance upper bound $σ^2_t$, and a practical version with a novel estimator for this variance bound. Theoretically, we provide rigorous regret analysis for both versions and prove that our oracle algorithm achieves a better regret guarantee than other neural-UCB algorithms in the neural contextual bandits setting. Empirically, our practical method has similar computational efficiency while outperforming state-of-the-art techniques, with better calibration and lower regret across multiple standard settings, including the synthetic, UCI, MNIST, and CIFAR-10 datasets.
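To make the idea of a variance-aware UCB concrete, here is a minimal sketch of a variance-weighted linear UCB on fixed feature vectors. It is an illustration under stated assumptions, not the paper's algorithm: the class name `VarianceWeightedLinUCB`, the parameters `lam` and `beta`, and the inverse-variance weighting rule are hypothetical choices, and the feature vectors merely stand in for a network's learned representation. The variance bound `sigma2` is supplied by the caller, loosely mirroring the oracle setting described above.

```python
import numpy as np


class VarianceWeightedLinUCB:
    """Illustrative variance-weighted linear UCB (not the paper's exact method).

    Assumption: each arm is described by a fixed feature vector, standing in
    for the last-layer representation of a neural network.
    """

    def __init__(self, dim, lam=1.0, beta=1.0):
        self.A = lam * np.eye(dim)  # regularized, variance-weighted Gram matrix
        self.b = np.zeros(dim)      # variance-weighted sum of reward * feature
        self.beta = beta            # exploration scale (hypothetical constant)

    def ucb_scores(self, feats):
        """Return mean estimate plus confidence width for each row of feats."""
        A_inv = np.linalg.inv(self.A)
        theta = A_inv @ self.b
        mean = feats @ theta
        # width_i = sqrt(x_i^T A^{-1} x_i)
        width = np.sqrt(np.einsum("ij,jk,ik->i", feats, A_inv, feats))
        return mean + self.beta * width

    def select(self, feats):
        """Pick the arm with the largest upper confidence bound."""
        return int(np.argmax(self.ucb_scores(feats)))

    def update(self, x, reward, sigma2):
        """Weight the observation by the inverse of its variance bound."""
        w = 1.0 / max(sigma2, 1e-8)  # guard against division by zero
        self.A += w * np.outer(x, x)
        self.b += w * reward * x


# Usage: two arms with one-hot features; arm 0 is observed once.
arms = np.eye(2)
low_noise = VarianceWeightedLinUCB(dim=2)
high_noise = VarianceWeightedLinUCB(dim=2)
low_noise.update(arms[0], reward=1.0, sigma2=0.25)
high_noise.update(arms[0], reward=1.0, sigma2=4.0)
```

In this sketch, an observation with a small variance bound shrinks the confidence width of the observed arm more than the same observation with a large bound, which is the intuition behind using $σ^2_t$ to tighten the UCB.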

Code Implementations: 1 repo
