LGMLJan 27, 2017

The Price of Differential Privacy For Online Learning

arXiv:1701.07953v2106 citations
Originality Highly original
AI Analysis

This work addresses privacy concerns in online learning for applications like recommendation systems, providing significant improvements in regret bounds for bandit settings.

The paper tackles the problem of online linear optimization with differential privacy, showing that in the full-information setting, privacy can be ensured for free with O(√T) regret, while in the bandit setting, it achieves a regret of Õ(1/ε √T), improving from the previous best of Õ(1/ε T^{2/3}).

We design differentially private algorithms for the problem of online linear optimization in the full information and bandit settings with optimal $\tilde{O}(\sqrt{T})$ regret bounds. In the full-information setting, our results demonstrate that $ε$-differential privacy may be ensured for free -- in particular, the regret bounds scale as $O(\sqrt{T})+\tilde{O}\left(\frac{1}ε\right)$. For bandit linear optimization, and as a special case, for non-stochastic multi-armed bandits, the proposed algorithm achieves a regret of $\tilde{O}\left(\frac{1}ε\sqrt{T}\right)$, while the previously known best regret bound was $\tilde{O}\left(\frac{1}εT^{\frac{2}{3}}\right)$.

Foundations

The foundational work for this paper's niche, ranked by how specifically the neighbourhood builds on it — not by global fame.

Your Notes