LG · AI · ML · Feb 15, 2024

Diffusion Models Meet Contextual Bandits

arXiv:2402.10028v3 · 6 citations · h-index: 6
AI Analysis

This work addresses the computational and statistical inefficiencies of contextual bandits in applications that require efficient online decision-making. It is an incremental improvement that integrates diffusion models into existing frameworks.

The paper tackles efficient online decision-making in contextual bandits by using pre-trained diffusion models as expressive priors that capture complex action dependencies. The result is a practical algorithm that supports fast updates and sampling, and empirical results show it is effective across diverse settings.

Efficient online decision-making in contextual bandits is challenging, as methods without informative priors often suffer from computational or statistical inefficiencies. In this work, we leverage pre-trained diffusion models as expressive priors to capture complex action dependencies and develop a practical algorithm that efficiently approximates posteriors under such priors, enabling both fast updates and sampling. Empirical results demonstrate the effectiveness and versatility of our approach across diverse contextual bandit settings.
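
To make the role of an informative prior concrete, here is a minimal sketch of one round of posterior sampling (Thompson sampling) in a contextual bandit. It is not the paper's algorithm: a fixed Gaussian prior per action stands in for the pre-trained diffusion prior, and an exact closed-form Gaussian update stands in for the approximate posterior the abstract describes. The linear reward model and the names `posterior`, `thompson_step`, `prior_means`, and `history` are assumptions made only for illustration.

```python
import numpy as np

def posterior(prior_mean, prior_cov, X, y, noise_var=1.0):
    """Gaussian posterior for an assumed linear reward model y = X @ theta + noise."""
    prior_prec = np.linalg.inv(prior_cov)
    post_prec = prior_prec + X.T @ X / noise_var
    post_cov = np.linalg.inv(post_prec)
    post_mean = post_cov @ (prior_prec @ prior_mean + X.T @ y / noise_var)
    return post_mean, post_cov

def thompson_step(rng, context, prior_means, prior_cov, history, noise_var=1.0):
    """Sample one parameter vector per action from its posterior, then act greedily."""
    scores = []
    for a, mu0 in enumerate(prior_means):
        X, y = history[a]  # contexts (n, d) and rewards (n,) observed for action a
        mean, cov = posterior(mu0, prior_cov, X, y, noise_var)
        theta = rng.multivariate_normal(mean, cov)  # posterior sample
        scores.append(context @ theta)
    return int(np.argmax(scores))

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    d, n_actions = 2, 3
    # Hypothetical prior means; in the paper's setting the prior would come
    # from a pre-trained diffusion model rather than being fixed by hand.
    prior_means = [np.zeros(d) for _ in range(n_actions)]
    history = [(np.zeros((0, d)), np.zeros(0)) for _ in range(n_actions)]
    action = thompson_step(rng, np.array([1.0, 0.0]), prior_means, np.eye(d), history)
    print("chosen action:", action)
```

The update and the sampling step are both cheap here because the prior is Gaussian. With a more expressive prior no closed form is available in general, which is why the abstract emphasizes an efficient posterior approximation.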

