LGMEMLFeb 1, 2025

High-Dimensional Linear Bandits under Stochastic Latent Heterogeneity

arXiv:2502.00423v22 citationsh-index: 5
Originality Highly original
AI Analysis

This addresses a fundamental barrier in online decision-making for applications like promotion targeting, revealing inherent limitations when latent subgroups are present.

The paper tackles the problem of online decision-making with stochastic latent heterogeneity, where unobserved subgroups affect responses to actions, and shows that while strong regret grows linearly due to irreducible classification uncertainty, regular regret achieves a minimax-optimal sublinear rate.

This paper addresses the critical challenge of stochastic latent heterogeneity in online decision-making, where individuals' responses to actions vary not only with observable contexts but also with unobserved, randomly realized subgroups. Existing data-driven approaches largely capture observable heterogeneity through contextual features but fail when the sources of variation are latent and stochastic. We propose a latent heterogeneous bandit framework that explicitly models probabilistic subgroup membership and group-specific reward functions, using promotion targeting as a motivating example. Our phased EM-greedy algorithm jointly learns latent group probabilities and reward parameters in high dimensions, achieving optimal estimation and classification guarantees. Our analysis reveals a new phenomenon unique to decision-making with stochastic latent subgroups: randomness in group realizations creates irreducible classification uncertainty, making sub-linear regret against a fully informed strong oracle fundamentally impossible. We establish matching upper and minimax lower bounds for both the strong and regular regrets, corresponding, respectively, to oracles with and without access to realized group memberships. The strong regret necessarily grows linearly, while the regular regret achieves a minimax-optimal sublinear rate. These findings uncover a fundamental stochastic barrier in online decision-making and point to potential remedies through simple strategic interventions and mechanism-design-based elicitation of latent information.

Foundations

The foundational work for this paper's niche, ranked by how specifically the neighbourhood builds on it — not by global fame.

Your Notes