DS CR IT LGApr 16, 2024

Private Vector Mean Estimation in the Shuffle Model: Optimal Rates Require Many Messages

Hilal Asi, Vitaly Feldman, Jelani Nelson, Huy L. Nguyen, Kunal Talwar, Samson Zhou

Apple

arXiv:2404.10201v25.98 citationsh-index: 43ICML

Originality Incremental advance

AI Analysis

This work addresses privacy-preserving data analysis for distributed systems, providing fundamental limits and efficient protocols, though it is incremental in refining existing shuffle model techniques.

The paper tackles private vector mean estimation in the shuffle model, proposing a multi-message protocol that achieves optimal error with $ ilde{\mathcal{O}}(\min(n\varepsilon^2,d))$ messages per user and proving a lower bound of $\Omega(\min(n\varepsilon^2,d)/\log(n))$ messages for optimal error, establishing near-optimal message complexity.

We study the problem of private vector mean estimation in the shuffle model of privacy where $n$ users each have a unit vector $v^{(i)} \in\mathbb{R}^d$. We propose a new multi-message protocol that achieves the optimal error using $\tilde{\mathcal{O}}\left(\min(n\varepsilon^2,d)\right)$ messages per user. Moreover, we show that any (unbiased) protocol that achieves optimal error requires each user to send $Ω(\min(n\varepsilon^2,d)/\log(n))$ messages, demonstrating the optimality of our message complexity up to logarithmic factors. Additionally, we study the single-message setting and design a protocol that achieves mean squared error $\mathcal{O}(dn^{d/(d+2)}\varepsilon^{-4/(d+2)})$. Moreover, we show that any single-message protocol must incur mean squared error $Ω(dn^{d/(d+2)})$, showing that our protocol is optimal in the standard setting where $\varepsilon = Θ(1)$. Finally, we study robustness to malicious users and show that malicious users can incur large additive error with a single shuffler.

View on arXiv PDF

Similar