LG | ML | Mar 31, 2024

Meta Learning in Bandits within Shared Affine Subspaces

arXiv:2404.00688v1 | 4 citations | h-index: 19 | AISTATS
Originality: Incremental advance
AI Analysis

This work addresses efficient learning across multiple bandit tasks, with applications such as recommendation systems and adaptive decision-making. It appears incremental, since it builds on existing meta-learning frameworks.

The paper tackles the problem of meta-learning multiple contextual bandit tasks by modeling them as concentrated around a low-dimensional affine subspace, using online PCA to reduce expected regret, with empirical results showing significant regret reduction.

We study the problem of meta-learning several contextual stochastic bandit tasks by leveraging their concentration around a low-dimensional affine subspace, which we learn via online principal component analysis to reduce the expected regret over the encountered bandits. We propose and theoretically analyze two strategies that solve the problem: one based on the principle of optimism in the face of uncertainty and the other via Thompson sampling. Our framework is generic and includes previously proposed approaches as special cases. In addition, the empirical results show that our methods significantly reduce the regret on several bandit tasks.
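To make the idea of a shared affine subspace concrete, the following is a minimal sketch and not the paper's algorithm. It assumes a one-dimensional subspace (a line) and assumes each finished task yields a parameter vector, here simulated directly instead of being estimated from bandit feedback. It keeps an incrementally updated mean and scatter matrix and extracts the leading direction by power iteration, a simple stand-in for the online PCA procedure the abstract mentions. All names (`AffineSubspaceEstimator`, `demo`, the offset and direction values) are hypothetical.

```python
import math
import random


def _norm(v):
    return math.sqrt(sum(x * x for x in v))


def _normalize(v):
    n = _norm(v)
    return [x / n for x in v]


class AffineSubspaceEstimator:
    """Tracks a running mean and scatter matrix of task parameter vectors."""

    def __init__(self, dim):
        self.dim = dim
        self.n = 0
        self.mean = [0.0] * dim
        self.scatter = [[0.0] * dim for _ in range(dim)]

    def update(self, theta):
        # Welford-style update: one pass, no stored history of past tasks.
        self.n += 1
        before = [t - m for t, m in zip(theta, self.mean)]
        self.mean = [m + b / self.n for m, b in zip(self.mean, before)]
        after = [t - m for t, m in zip(theta, self.mean)]
        for i in range(self.dim):
            for j in range(self.dim):
                self.scatter[i][j] += before[i] * after[j]

    def top_direction(self, iters=100, seed=0):
        # Power iteration on the scatter matrix from a seeded random start.
        rng = random.Random(seed)
        v = _normalize([rng.gauss(0.0, 1.0) for _ in range(self.dim)])
        for _ in range(iters):
            w = [sum(row[j] * v[j] for j in range(self.dim))
                 for row in self.scatter]
            if _norm(w) == 0.0:  # fewer than two distinct tasks seen so far
                break
            v = _normalize(w)
        return v

    def project(self, theta, direction):
        # Orthogonal projection of theta onto the line mean + span(direction).
        diff = [t - m for t, m in zip(theta, self.mean)]
        coef = sum(d * u for d, u in zip(diff, direction))
        return [m + coef * u for m, u in zip(self.mean, direction)]


def demo(num_tasks=200, seed=0):
    """Simulates task parameters near a line and measures recovery quality."""
    rng = random.Random(seed)
    offset = [1.0, 0.0, -1.0, 0.5, 2.0]                     # assumed offset
    direction = _normalize([1.0, 2.0, 0.0, -1.0, 1.0])      # assumed direction
    est = AffineSubspaceEstimator(dim=5)
    for _ in range(num_tasks):
        z = rng.gauss(0.0, 2.0)  # position of this task along the line
        theta = [o + z * u + rng.gauss(0.0, 0.05)
                 for o, u in zip(offset, direction)]
        est.update(theta)
    learned = est.top_direction()
    # Absolute cosine, since an eigenvector's sign is arbitrary.
    alignment = abs(sum(a * b for a, b in zip(learned, direction)))
    return alignment, est
```

In a bandit setting, the vectors passed to `update` would be noisy per-task estimates, and a new task could bias its own estimate toward `project(...)` of that estimate. How that bias enters the optimistic and Thompson sampling strategies is specified in the paper, not here.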
