DSCRLGMLMay 1, 2018

Privately Learning High-Dimensional Distributions

arXiv:1805.00216v3170 citations
Originality Highly original
AI Analysis

This work addresses privacy concerns in statistical learning for high-dimensional data, offering efficient algorithms with minimal performance loss, though it is incremental as it builds on prior private learning methods.

The paper tackles the problem of learning high-dimensional distributions (multivariate Gaussians and product distributions over Boolean hypercubes) with differential privacy, achieving sample complexities that nearly match optimal non-private learners, showing privacy comes essentially for free in these cases.

We present novel, computationally efficient, and differentially private algorithms for two fundamental high-dimensional learning problems: learning a multivariate Gaussian and learning a product distribution over the Boolean hypercube in total variation distance. The sample complexity of our algorithms nearly matches the sample complexity of the optimal non-private learners for these tasks in a wide range of parameters, showing that privacy comes essentially for free for these problems. In particular, in contrast to previous approaches, our algorithm for learning Gaussians does not require strong a priori bounds on the range of the parameters. Our algorithms introduce a novel technical approach to reducing the sensitivity of the estimation procedure that we call recursive private preconditioning.

Foundations

The foundational work for this paper's niche, ranked by how specifically the neighbourhood builds on it — not by global fame.

Your Notes