ML, LG · Apr 27

Shuffle and Joint Differential Privacy for Generalized Linear Contextual Bandits

arXiv:2602.00417

AI Analysis

This work extends private contextual bandits from linear to generalized linear models, addressing a key bottleneck for practitioners who need privacy guarantees in more complex reward settings.

This paper presents the first algorithms for generalized linear contextual bandits under shuffle and joint differential privacy, achieving regret bounds that nearly match non-private rates. For stochastic contexts, the shuffle-DP algorithm achieves regret $\tilde{O}(d^{3/2}\sqrt{T \log T}/\sqrt{\varepsilon})$, and for adversarial contexts, the joint-DP algorithm achieves regret $\tilde{O}(d\sqrt{T} \log T + d^{3/4}\sqrt{T/\varepsilon}(\log T)(d+\log T)^{1/4})$.

We present the first algorithms for generalized linear contextual bandits under shuffle differential privacy and joint differential privacy. While prior work on private contextual bandits has been restricted to linear reward models -- which admit closed-form estimators -- generalized linear models (GLMs) pose fundamental new challenges: no closed-form estimator exists, requiring private convex optimization; privacy must be tracked across multiple evolving design matrices; and optimization error must be explicitly incorporated into the regret analysis. We address these challenges under two privacy models and context settings. For stochastic contexts, we design a shuffle-DP algorithm achieving $\tilde{O}(d^{3/2}\sqrt{T \log T}/\sqrt{\varepsilon})$ regret in the dominant term, differing from the non-private rate by a factor of $\sqrt{d/\varepsilon}$. For adversarial contexts, we provide a joint-DP algorithm with regret $\tilde{O}\!\big(d\sqrt{T} \log T + d^{3/4}\sqrt{T/\varepsilon}\,(\log T)\,(d + \log T)^{1/4}\big)$ -- matching the non-private rate $\tilde{O}(d\sqrt{T} \log T)$ in the leading term, with privacy contributing only an additive correction. Unlike prior work on locally private GLM bandits, our methods require no spectral assumptions on the context distribution beyond $\ell_2$ boundedness.
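To get a feel for how the stated bounds scale, the sketch below evaluates their dominant terms numerically. It is illustrative only: all constants and the additional logarithmic factors hidden by $\tilde{O}$ are dropped, the function names are hypothetical, and the non-private stochastic rate $d\sqrt{T \log T}$ is not quoted in the abstract but inferred by dividing the shuffle-DP bound by the stated factor $\sqrt{d/\varepsilon}$.

```python
import math


def shuffle_dp_bound(d: int, T: int, eps: float) -> float:
    """Dominant term d^{3/2} * sqrt(T log T) / sqrt(eps), constants dropped."""
    return d ** 1.5 * math.sqrt(T * math.log(T)) / math.sqrt(eps)


def nonprivate_stochastic_bound(d: int, T: int) -> float:
    """Assumed non-private rate d * sqrt(T log T).

    Inferred from the abstract: shuffle-DP bound divided by sqrt(d / eps).
    """
    return d * math.sqrt(T * math.log(T))


def joint_dp_bound(d: int, T: int, eps: float) -> tuple[float, float]:
    """Return (leading term, additive privacy term) of the joint-DP bound."""
    log_t = math.log(T)
    leading = d * math.sqrt(T) * log_t
    privacy = d ** 0.75 * math.sqrt(T / eps) * log_t * (d + log_t) ** 0.25
    return leading, privacy


if __name__ == "__main__":
    d, T, eps = 16, 10_000, 1.0  # hypothetical problem sizes
    ratio = shuffle_dp_bound(d, T, eps) / nonprivate_stochastic_bound(d, T)
    print(f"shuffle-DP / non-private ratio: {ratio:.3f}")
    print(f"sqrt(d / eps):                  {math.sqrt(d / eps):.3f}")
    leading, privacy = joint_dp_bound(d, T, eps)
    print(f"joint-DP leading term: {leading:.1f}, privacy term: {privacy:.1f}")
```

Because constants are omitted, only ratios and growth trends in $d$, $T$, and $\varepsilon$ are meaningful here, not the absolute values.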
