DSCRITLGApr 16, 2024

Private Vector Mean Estimation in the Shuffle Model: Optimal Rates Require Many Messages

Apple
arXiv:2404.10201v28 citationsh-index: 43ICML
Originality Incremental advance
AI Analysis

This work addresses privacy-preserving data analysis for distributed systems, providing fundamental limits and efficient protocols, though it is incremental in refining existing shuffle model techniques.

The paper tackles private vector mean estimation in the shuffle model, proposing a multi-message protocol that achieves optimal error with $ ilde{\mathcal{O}}(\min(n\varepsilon^2,d))$ messages per user and proving a lower bound of $\Omega(\min(n\varepsilon^2,d)/\log(n))$ messages for optimal error, establishing near-optimal message complexity.

We study the problem of private vector mean estimation in the shuffle model of privacy where $n$ users each have a unit vector $v^{(i)} \in\mathbb{R}^d$. We propose a new multi-message protocol that achieves the optimal error using $\tilde{\mathcal{O}}\left(\min(n\varepsilon^2,d)\right)$ messages per user. Moreover, we show that any (unbiased) protocol that achieves optimal error requires each user to send $Ω(\min(n\varepsilon^2,d)/\log(n))$ messages, demonstrating the optimality of our message complexity up to logarithmic factors. Additionally, we study the single-message setting and design a protocol that achieves mean squared error $\mathcal{O}(dn^{d/(d+2)}\varepsilon^{-4/(d+2)})$. Moreover, we show that any single-message protocol must incur mean squared error $Ω(dn^{d/(d+2)})$, showing that our protocol is optimal in the standard setting where $\varepsilon = Θ(1)$. Finally, we study robustness to malicious users and show that malicious users can incur large additive error with a single shuffler.

Foundations

The foundational work for this paper's niche, ranked by how specifically the neighbourhood builds on it — not by global fame.

Your Notes