MLLGNov 23, 2020

Improved Confidence Bounds for the Linear Logistic Model and Applications to Linear Bandits

arXiv:2011.11222v233 citations
AI Analysis

This work addresses the challenge of confidence estimation in logistic bandits, offering incremental improvements for researchers in machine learning and optimization.

The paper tackles the problem of deriving confidence bounds for the linear logistic model, improving upon prior work by eliminating exponential dependence on a worst-case parameter and instead using arm-specific variance, with applications to logistic bandits that enhance performance guarantees.

We propose improved fixed-design confidence bounds for the linear logistic model. Our bounds significantly improve upon the state-of-the-art bound by Li et al. (2017) via recent developments of the self-concordant analysis of the logistic loss (Faury et al., 2020). Specifically, our confidence bound avoids a direct dependence on $1/κ$, where $κ$ is the minimal variance over all arms' reward distributions. In general, $1/κ$ scales exponentially with the norm of the unknown linear parameter $θ^*$. Instead of relying on this worst-case quantity, our confidence bound for the reward of any given arm depends directly on the variance of that arm's reward distribution. We present two applications of our novel bounds to pure exploration and regret minimization logistic bandits improving upon state-of-the-art performance guarantees. For pure exploration, we also provide a lower bound highlighting a dependence on $1/κ$ for a family of instances.

Foundations

The foundational work for this paper's niche, ranked by how specifically the neighbourhood builds on it — not by global fame.

Your Notes