Optimal Variance and Covariance Estimation under Differential Privacy in the Add-Remove Model and Beyond
This work addresses a challenging privacy-preserving statistical estimation problem relevant to data analysts, though it is incremental in that it builds on existing differential privacy frameworks.
The paper tackles variance and covariance estimation under differential privacy in the add-remove model, where the dataset size must itself be kept private. It develops efficient mechanisms based on the Bézier mechanism, proves that they are minimax optimal in the high-privacy regime, and shows improved instance-wise utility over alternative mechanisms.
In this paper, we study the problem of estimating the variance and covariance of datasets under differential privacy in the add-remove model. While estimation in the swap model has been extensively studied in the literature, the add-remove model remains less explored and more challenging, as the dataset size must also be kept private. To address this issue, we develop efficient mechanisms for variance and covariance estimation based on the \emph{Bézier mechanism}, a novel moment-release framework that leverages Bernstein bases. We prove that our proposed mechanisms are minimax optimal in the high-privacy regime by establishing new minimax lower bounds. Moreover, beyond worst-case scenarios, we analyze instance-wise utility and show that the Bézier-based estimator consistently achieves better utility than alternative mechanisms. Finally, we demonstrate the effectiveness of the Bézier mechanism beyond variance and covariance estimation, showcasing its applicability to other statistical tasks.
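To make the add-remove setting concrete, the following is a minimal sketch of a naive baseline, not the Bézier mechanism described above. It assumes scalar data bounded in [0, 1] and the population (divide-by-n) variance; both are illustrative assumptions that the abstract does not state. The function names `laplace` and `naive_private_variance` are hypothetical. Because the dataset size is private, the count is noised alongside the first two moment sums using the standard Laplace mechanism, and the variance is then formed by post-processing.

```python
import random


def laplace(scale, rng):
    # The difference of two independent Exp(1) draws is a standard Laplace variate.
    return scale * (rng.expovariate(1.0) - rng.expovariate(1.0))


def naive_private_variance(data, epsilon, rng):
    """Naive plug-in variance estimate for data in [0, 1] under add-remove DP.

    Illustrative baseline only (assumption): this is NOT the Bezier mechanism.
    """
    if epsilon <= 0:
        raise ValueError("epsilon must be positive")
    if any(not 0.0 <= x <= 1.0 for x in data):
        raise ValueError("data must lie in [0, 1]")

    # Adding or removing one record x changes (n, sum x, sum x^2) by (1, x, x^2),
    # so the L1 sensitivity of the triple is at most 3 when x is in [0, 1].
    scale = 3.0 / epsilon
    n_hat = len(data) + laplace(scale, rng)
    s1_hat = sum(data) + laplace(scale, rng)
    s2_hat = sum(x * x for x in data) + laplace(scale, rng)

    # Everything below is post-processing of the noisy triple, so it costs no privacy.
    n_hat = max(n_hat, 1.0)  # guard against a tiny or negative noisy count
    mean_hat = s1_hat / n_hat
    var_hat = s2_hat / n_hat - mean_hat ** 2
    # The population variance of values in [0, 1] always lies in [0, 1/4].
    return min(max(var_hat, 0.0), 0.25)


rng = random.Random(0)
data = [0.2, 0.8] * 500  # true population variance is 0.09
estimate = naive_private_variance(data, epsilon=1.0, rng=rng)
```

In this baseline the noisy count appears in the denominator of both moment ratios, so its error propagates into the final estimate. This is one way to see why keeping the dataset size private makes the add-remove model harder than the swap model, where the size is public.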