LG · CR · DC · Feb 9, 2023

On the Privacy-Robustness-Utility Trilemma in Distributed Learning

arXiv:2302.04787v2 · 34 citations · h-index: 70
AI Analysis

This paper addresses the challenge of designing secure distributed learning algorithms for sensitive applications. It is incremental in that it builds on prior work on privacy and robustness.

The paper tackles the problem of simultaneously ensuring privacy, robustness, and utility in distributed machine learning, proving a fundamental trade-off and presenting an algorithm with matching error bounds for mean estimation.

The ubiquity of distributed machine learning (ML) in sensitive public domain applications calls for algorithms that protect data privacy, while being robust to faults and adversarial behaviors. Although privacy and robustness have been extensively studied independently in distributed ML, their synthesis remains poorly understood. We present the first tight analysis of the error incurred by any algorithm ensuring robustness against a fraction of adversarial machines, as well as differential privacy (DP) for honest machines' data against any other curious entity. Our analysis exhibits a fundamental trade-off between privacy, robustness, and utility. To prove our lower bound, we consider the case of mean estimation, subject to distributed DP and robustness constraints, and devise reductions to centralized estimation of one-way marginals. We prove our matching upper bound by presenting a new distributed ML algorithm using a high-dimensional robust aggregation rule. The latter amortizes the dependence on the dimension in the error (caused by adversarial workers and DP), while being agnostic to the statistical properties of the data.
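To make the setting concrete, the following is a minimal sketch of distributed mean estimation in which honest workers privatize their vectors locally and the server aggregates robustly. It is an illustration under stated assumptions, not the paper's algorithm: the coordinate-wise trimmed mean is a simple stand-in for the high-dimensional aggregation rule the abstract mentions, the noise scale `sigma` is a placeholder that is not calibrated to any particular DP budget, and all function names and parameter values are hypothetical.

```python
import math
import random


def clip(vec, radius):
    """Rescale vec so that its L2 norm is at most radius."""
    norm = math.sqrt(sum(x * x for x in vec))
    if norm <= radius or norm == 0.0:
        return list(vec)
    return [x * radius / norm for x in vec]


def privatize(vec, radius, sigma, rng):
    """Clip to L2 norm `radius`, then add Gaussian noise to each coordinate.

    Assumption: sigma is an arbitrary placeholder; a real deployment would
    derive it from the clipping radius and the target privacy budget.
    """
    return [x + rng.gauss(0.0, sigma) for x in clip(vec, radius)]


def trimmed_mean(vectors, f):
    """Coordinate-wise trimmed mean (stand-in robust aggregator).

    For each coordinate, drop the f smallest and f largest values and
    average the rest. Requires more than 2f input vectors.
    """
    n = len(vectors)
    if n <= 2 * f:
        raise ValueError("need more than 2f vectors")
    dim = len(vectors[0])
    out = []
    for j in range(dim):
        column = sorted(v[j] for v in vectors)
        kept = column[f:n - f]
        out.append(sum(kept) / len(kept))
    return out


def demo(seed=0):
    """Compare robust and naive aggregation with f adversarial workers."""
    rng = random.Random(seed)
    n, f, dim = 20, 3, 5          # hypothetical sizes
    radius, sigma = 5.0, 0.1      # hypothetical clipping radius and noise scale
    # Honest workers hold vectors concentrated around the all-ones vector.
    honest = [[1.0 + rng.gauss(0.0, 0.1) for _ in range(dim)]
              for _ in range(n - f)]
    reports = [privatize(v, radius, sigma, rng) for v in honest]
    # Adversarial workers send arbitrary values and skip clipping and noise.
    reports += [[1e6] * dim for _ in range(f)]
    robust = trimmed_mean(reports, f)
    naive = [sum(v[j] for v in reports) / n for j in range(dim)]
    return robust, naive


if __name__ == "__main__":
    robust, naive = demo()
    print("robust:", [round(x, 3) for x in robust])
    print("naive: ", [round(x, 1) for x in naive])
```

In this toy run the naive average is dominated by the adversarial reports, while the trimmed mean stays close to the honest mean. Raising `sigma` (more privacy) or `f` (more adversaries) degrades the robust estimate, which is an informal view of the three-way trade-off the abstract describes.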

