LGMLAug 16, 2017

Corrupt Bandits for Preserving Local Privacy

arXiv:1708.05033v228 citations
AI Analysis

This work addresses privacy preservation in online recommender systems, but it is incremental as it adapts existing bandit methods to a corrupted setting with known parameters.

The paper tackles the stochastic multi-armed bandit problem with corrupted rewards to preserve local privacy in online recommender systems, providing lower bounds on regret, devising algorithms (KLUCB-CF and TS-CF) with upper bounds, and analyzing the impact of corruption parameters on regret, with experimental results confirming the analysis.

We study a variant of the stochastic multi-armed bandit (MAB) problem in which the rewards are corrupted. In this framework, motivated by privacy preservation in online recommender systems, the goal is to maximize the sum of the (unobserved) rewards, based on the observation of transformation of these rewards through a stochastic corruption process with known parameters. We provide a lower bound on the expected regret of any bandit algorithm in this corrupted setting. We devise a frequentist algorithm, KLUCB-CF, and a Bayesian algorithm, TS-CF and give upper bounds on their regret. We also provide the appropriate corruption parameters to guarantee a desired level of local privacy and analyze how this impacts the regret. Finally, we present some experimental results that confirm our analysis.

Foundations

The foundational work for this paper's niche, ranked by how specifically the neighbourhood builds on it — not by global fame.

Your Notes