Dynamic Global Sensitivity for Differentially Private Contextual Bandits
This paper addresses privacy concerns for users of interactive recommendation systems, though it appears to be an incremental improvement over existing differentially private bandit methods.
The authors tackled the privacy problem in contextual bandit algorithms for interactive recommendation by proposing a differentially private algorithm that uses dynamic global sensitivity analysis to reduce noise injection. Their approach achieves (ε, δ)-differential privacy with added regret in Õ(log T√T/ε) and shows experimental advantages over existing solutions.
Bandit algorithms have become a reference solution for interactive recommendation. However, as such algorithms directly interact with users for improved recommendations, serious privacy concerns have been raised regarding their practical use. In this work, we propose a differentially private linear contextual bandit algorithm, via a tree-based mechanism to add Laplace or Gaussian noise to model parameters. Our key insight is that as the model converges during online update, the global sensitivity of its parameters shrinks over time (thus named dynamic global sensitivity). Compared with existing solutions, our dynamic global sensitivity analysis allows us to inject less noise to obtain $(\epsilon, \delta)$-differential privacy with added regret caused by noise injection in $\tilde O(\log{T}\sqrt{T}/\epsilon)$. We provide a rigorous theoretical analysis of the amount of noise added via dynamic global sensitivity and the corresponding regret upper bound of our proposed algorithm. Experimental results on both synthetic and real-world datasets confirmed the algorithm's advantage over existing solutions.
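To make the tree-based mechanism mentioned in the abstract more concrete, the following is a minimal sketch of the generic binary-tree (dyadic interval) mechanism for privately releasing running sums with Laplace noise. It is not the authors' algorithm: it works on a scalar stream rather than on bandit model parameters, and it assumes a fixed per-item sensitivity instead of the paper's dynamic global sensitivity. The names `dyadic_blocks`, `laplace`, and `private_prefix_sums`, as well as the `sensitivity` and `seed` parameters, are hypothetical and chosen only for illustration.

```python
import random
from itertools import accumulate


def dyadic_blocks(t):
    """Split [1, t] into disjoint aligned dyadic intervals (start, length),
    one interval per set bit of t, largest first."""
    blocks = []
    start = 1
    for bit in reversed(range(t.bit_length())):
        length = 1 << bit
        if t & length:
            blocks.append((start, length))
            start += length
    return blocks


def laplace(rng, scale):
    """Sample Laplace(0, scale) as the difference of two i.i.d.
    exponentials with mean `scale`."""
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)


def private_prefix_sums(values, epsilon, sensitivity=1.0, seed=0):
    """Release noisy running sums of `values` with the tree mechanism.

    Assumption (not from the paper): `sensitivity` is a fixed bound on
    how much one item can change any tree node, in absolute value.
    """
    rng = random.Random(seed)
    T = len(values)
    # Each item lies in at most one tree node per level, and there are
    # floor(log2 T) + 1 levels, so the budget is split across them.
    levels = T.bit_length()
    scale = levels * sensitivity / epsilon

    exact = [0.0] + list(accumulate(values))  # exact[i] = sum of first i items
    noisy_node = {}  # each tree node is noised once and then reused
    released = []
    for t in range(1, T + 1):
        total = 0.0
        for start, length in dyadic_blocks(t):
            key = (start, length)
            if key not in noisy_node:
                true_sum = exact[start + length - 1] - exact[start - 1]
                noisy_node[key] = true_sum + laplace(rng, scale)
            total += noisy_node[key]
        released.append(total)
    return released
```

Each released sum combines at most `levels` noisy nodes, so the number of noise terms per release grows logarithmically in the stream length rather than linearly. In this sketch the noise scale is proportional to `sensitivity`, which suggests why a sensitivity that shrinks over time, as the abstract describes, would allow less noise to be injected.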