ML · CR · LG · ST · ME · Apr 1, 2021

High-Dimensional Differentially-Private EM Algorithm: Methods and Near-Optimal Statistical Guarantees

arXiv:2104.00245v2 · 3 citations
AI Analysis

This work addresses privacy-preserving statistical estimation for high-dimensional data. It is an incremental advance that extends existing methods to new models and supports them with theoretical guarantees.

The authors apply differential privacy to expectation-maximization algorithms in high-dimensional latent variable models and establish near-optimal convergence rates under privacy constraints for three models: Gaussian mixture, mixture of regression, and regression with missing covariates.

In this paper, we develop a general framework to design differentially private expectation-maximization (EM) algorithms in high-dimensional latent variable models, based on the noisy iterative hard-thresholding. We derive the statistical guarantees of the proposed framework and apply it to three specific models: Gaussian mixture, mixture of regression, and regression with missing covariates. In each model, we establish the near-optimal rate of convergence with differential privacy constraints, and show the proposed algorithm is minimax rate optimal up to logarithm factors. The technical tools developed for the high-dimensional setting are then extended to the classic low-dimensional latent variable models, and we propose a near rate-optimal EM algorithm with differential privacy guarantees in this setting. Simulation studies and real data analysis are conducted to support our results.
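The abstract names noisy iterative hard-thresholding as the building block. As a rough illustration only, the sketch below shows one common form of a noisy hard-thresholding step: pick s coordinates by noisy magnitude, then release noisy values on that support. The function names, the `noise_scale` parameter, and the use of Laplace noise are assumptions made for this example; the paper's actual algorithm and its privacy calibration (how the noise scale depends on the privacy budget and sensitivity) are not reproduced here.

```python
import random


def laplace(scale, rng):
    # The difference of two i.i.d. exponentials with mean `scale`
    # is Laplace-distributed with location 0 and that scale.
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)


def noisy_hard_threshold(v, s, noise_scale, rng):
    """Keep s coordinates of v chosen by noisy magnitude; zero the rest.

    Hypothetical helper: `noise_scale` is a free parameter here and is
    NOT calibrated to any (epsilon, delta) privacy guarantee.
    """
    if not 0 < s <= len(v):
        raise ValueError("s must be between 1 and len(v)")
    if noise_scale <= 0:
        raise ValueError("noise_scale must be positive")

    remaining = set(range(len(v)))
    selected = []
    for _ in range(s):
        # Each round draws fresh noise and picks the largest noisy magnitude
        # among the coordinates not yet selected.
        j = max(sorted(remaining),
                key=lambda i: abs(v[i]) + laplace(noise_scale, rng))
        selected.append(j)
        remaining.remove(j)

    out = [0.0] * len(v)
    for j in selected:
        # Release a noisy value on the selected support only.
        out[j] = v[j] + laplace(noise_scale, rng)
    return out


if __name__ == "__main__":
    rng = random.Random(0)
    print(noisy_hard_threshold([5.0, -0.1, 0.2, -7.0, 0.05], 2, 0.5, rng))
```

In an EM-style iteration, a step like this would be applied to the updated parameter vector after each M-step so that the iterate stays s-sparse; that wiring is likewise an assumption of this sketch.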

