MLLGMay 22, 2019

An Optimal Private Stochastic-MAB Algorithm Based on an Optimal Private Stopping Rule

arXiv:1905.09383v155 citations
Originality Highly original
AI Analysis

This provides a provably optimal solution for privacy-preserving decision-making in bandit problems, addressing a gap in prior work.

The authors tackled the problem of designing a differentially private algorithm for the stochastic multi-arm bandit that meets known lower bounds, achieving optimal regret of O(K log(T)/ε) as proven theoretically and demonstrated empirically.

We present a provably optimal differentially private algorithm for the stochastic multi-arm bandit problem, as opposed to the private analogue of the UCB-algorithm [Mishra and Thakurta, 2015; Tossou and Dimitrakakis, 2016] which doesn't meet the recently discovered lower-bound of $Ω\left(\frac{K\log(T)}ε \right)$ [Shariff and Sheffet, 2018]. Our construction is based on a different algorithm, Successive Elimination [Even-Dar et al. 2002], that repeatedly pulls all remaining arms until an arm is found to be suboptimal and is then eliminated. In order to devise a private analogue of Successive Elimination we visit the problem of private stopping rule, that takes as input a stream of i.i.d samples from an unknown distribution and returns a multiplicative $(1 \pm α)$-approximation of the distribution's mean, and prove the optimality of our private stopping rule. We then present the private Successive Elimination algorithm which meets both the non-private lower bound [Lai and Robbins, 1985] and the above-mentioned private lower bound. We also compare empirically the performance of our algorithm with the private UCB algorithm.

Foundations

The foundational work for this paper's niche, ranked by how specifically the neighbourhood builds on it — not by global fame.

Your Notes