Privately Learning High-Dimensional Distributions
This work addresses privacy concerns in statistical learning for high-dimensional data, offering efficient algorithms with minimal performance loss, though it is incremental as it builds on prior private learning methods.
The paper tackles the problem of learning high-dimensional distributions (multivariate Gaussians and product distributions over Boolean hypercubes) with differential privacy, achieving sample complexities that nearly match optimal non-private learners, showing privacy comes essentially for free in these cases.
We present novel, computationally efficient, and differentially private algorithms for two fundamental high-dimensional learning problems: learning a multivariate Gaussian and learning a product distribution over the Boolean hypercube in total variation distance. The sample complexity of our algorithms nearly matches the sample complexity of the optimal non-private learners for these tasks in a wide range of parameters, showing that privacy comes essentially for free for these problems. In particular, in contrast to previous approaches, our algorithm for learning Gaussians does not require strong a priori bounds on the range of the parameters. Our algorithms introduce a novel technical approach to reducing the sensitivity of the estimation procedure that we call recursive private preconditioning.