DS CR LG STOct 22, 2021

Tight and Robust Private Mean Estimation with Few Users

Hossein Esfandiari, Vahab Mirrokni, Shyam Narayanan

arXiv:2110.11876v213.435 citations

Originality Highly original

AI Analysis

This resolves a problem in privacy-preserving machine learning by enabling efficient mean estimation with fewer users, impacting applications like private learning of distributions and stochastic optimization.

The paper tackles high-dimensional mean estimation under user-level differential privacy, achieving a nearly optimal trade-off between the number of users and samples per user, with the number of users as low as O(1/ε log 1/δ), independent of dimension, and robustness against up to 49% corruptions.

In this work, we study high-dimensional mean estimation under user-level differential privacy, and design an $(\varepsilon,δ)$-differentially private mechanism using as few users as possible. In particular, we provide a nearly optimal trade-off between the number of users and the number of samples per user required for private mean estimation, even when the number of users is as low as $O(\frac{1}{\varepsilon}\log\frac{1}δ)$. Interestingly, this bound on the number of \emph{users} is independent of the dimension (though the number of \emph{samples per user} is allowed to depend polynomially on the dimension), unlike the previous work that requires the number of users to depend polynomially on the dimension. This resolves a problem first proposed by Amin et al. Moreover, our mechanism is robust against corruptions in up to $49\%$ of the users. Finally, our results also apply to optimal algorithms for privately learning discrete distributions with few users, answering a question of Liu et al., and a broader range of problems such as stochastic convex optimization and a variant of stochastic gradient descent via a reduction to differentially private mean estimation.

View on arXiv PDF

Similar