ML | AI | CR | LG | Oct 29, 2024

Privacy-Preserving Dynamic Assortment Selection

arXiv:2410.22488v1 | 1 citation | h-index: 1
Originality: Highly original
AI Analysis

The paper addresses privacy risks to both users and platforms in dynamic recommendation systems, offering a novel method with theoretical guarantees, though it builds on existing bandit models.

The paper tackles personalized assortment recommendation under privacy constraints by proposing a privacy-preserving dynamic assortment selection framework based on multinomial logit (MNL) bandits. It achieves a near-optimal regret bound of $\tilde{O}(\sqrt{T})$ and demonstrates performance improvements over a benchmark method in simulations and on the Expedia hotel dataset.

With the growing demand for personalized assortment recommendations, concerns over data privacy have intensified, highlighting the urgent need for effective privacy-preserving strategies. This paper presents a novel framework for privacy-preserving dynamic assortment selection using the multinomial logit (MNL) bandits model. Our approach employs a perturbed upper confidence bound method, integrating calibrated noise into user utility estimates to balance between exploration and exploitation while ensuring robust privacy protection. We rigorously prove that our policy satisfies Joint Differential Privacy (JDP), which better suits dynamic environments than traditional differential privacy, effectively mitigating inference attack risks. This analysis is built upon a novel objective perturbation technique tailored for MNL bandits, which is also of independent interest. Theoretically, we derive a near-optimal regret bound of $\tilde{O}(\sqrt{T})$ for our policy and explicitly quantify how privacy protection impacts regret. Through extensive simulations and an application to the Expedia hotel dataset, we demonstrate substantial performance enhancements over the benchmark method.
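
To make the idea of a perturbed upper confidence bound more concrete, here is a minimal sketch of how noisy, optimistic utility estimates could drive assortment selection under an MNL choice model. It is an illustration under stated assumptions, not the paper's algorithm: the linear utility model, the Gaussian noise, the form of the exploration bonus, the brute-force search, and all names (`mnl_choice_probs`, `perturbed_ucb_assortment`, `alpha`, `noise_scale`) are hypothetical, and adding noise this way does not by itself establish Joint Differential Privacy.

```python
import numpy as np
from itertools import combinations


def mnl_choice_probs(utilities):
    """MNL purchase probabilities for the offered items.

    The no-purchase option is assumed to have utility 0, so its
    probability is 1 / (1 + sum(exp(utilities))).
    """
    weights = np.exp(utilities)
    return weights / (1.0 + weights.sum())


def expected_revenue(utilities, revenues):
    """Expected revenue of an assortment under the MNL model."""
    return float(mnl_choice_probs(utilities) @ revenues)


def perturbed_ucb_assortment(features, revenues, theta_hat, V_inv,
                             alpha, noise_scale, K, rng):
    """Pick an assortment of at most K items using noisy optimistic utilities.

    Assumptions (illustrative only): utilities are linear in the item
    features, the bonus is alpha * sqrt(x^T V_inv x), and the perturbation
    is i.i.d. Gaussian with standard deviation noise_scale.
    """
    mean = features @ theta_hat
    # Per-item confidence width x_i^T V_inv x_i, computed for all items at once.
    width = np.sqrt(np.einsum("ij,jk,ik->i", features, V_inv, features))
    noise = rng.normal(0.0, noise_scale, size=mean.shape)
    optimistic = mean + alpha * width + noise

    best_set, best_value = (), 0.0
    # Brute-force search; only practical for a small number of items.
    for size in range(1, K + 1):
        for subset in combinations(range(len(revenues)), size):
            idx = list(subset)
            value = expected_revenue(optimistic[idx], revenues[idx])
            if value > best_value:
                best_set, best_value = subset, value
    return best_set, best_value


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    features = rng.normal(size=(5, 3))        # 5 items, 3 features (synthetic)
    revenues = rng.uniform(0.5, 1.5, size=5)  # synthetic per-item revenues
    chosen, value = perturbed_ucb_assortment(
        features, revenues, theta_hat=np.zeros(3), V_inv=np.eye(3),
        alpha=1.0, noise_scale=0.1, K=2, rng=rng)
    print(chosen, value)
```

In this sketch a larger `noise_scale` makes the selected assortment depend less on any individual's data but degrades the revenue of the chosen set, which mirrors the privacy-regret trade-off the abstract says the paper quantifies.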

