LG CR IT MLFeb 18, 2021

Robust and Differentially Private Mean Estimation

Xiyang Liu, Weihao Kong, Sham Kakade, Sewoong Oh

arXiv:2102.09159v225.085 citationsh-index: 67Has Code

Originality Highly original

AI Analysis

This addresses the critical need for secure and reliable data analysis in federated and meta-learning systems, where privacy and robustness are often handled separately, making it a foundational advancement.

The paper tackles the problem of estimating the mean from i.i.d. samples while ensuring both differential privacy and robustness to malicious data, introducing PRIME as the first efficient algorithm for this dual guarantee across a wide range of distributions.

In statistical learning and analysis from shared data, which is increasingly widely adopted in platforms such as federated learning and meta-learning, there are two major concerns: privacy and robustness. Each participating individual should be able to contribute without the fear of leaking one's sensitive information. At the same time, the system should be robust in the presence of malicious participants inserting corrupted data. Recent algorithmic advances in learning from shared data focus on either one of these threats, leaving the system vulnerable to the other. We bridge this gap for the canonical problem of estimating the mean from i.i.d. samples. We introduce PRIME, which is the first efficient algorithm that achieves both privacy and robustness for a wide range of distributions. We further complement this result with a novel exponential time algorithm that improves the sample complexity of PRIME, achieving a near-optimal guarantee and matching a known lower bound for (non-robust) private mean estimation. This proves that there is no extra statistical cost to simultaneously guaranteeing privacy and robustness.

View on arXiv PDF Code

Similar