LG CR DS May 20, 2025

A Private Approximation of the 2nd-Moment Matrix of Any Subsamplable Input

arXiv:2505.14251v1
h-index: 19
Originality: Incremental advance
AI Analysis

This work addresses the challenge of estimating second moments with differential privacy for data that may include outliers, which is important for privacy-preserving data analysis in fields like statistics and machine learning, though it builds incrementally on existing recursive frameworks.

The paper tackles the problem of differentially private second moment estimation by introducing a new algorithm that achieves strong privacy-utility trade-offs for worst-case inputs under subsamplability assumptions, preserving accuracy up to an arbitrary factor (1±γ) with high probability while abiding by zero-Concentrated Differential Privacy (zCDP).
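As a point of reference for the zCDP guarantee mentioned above, here is a minimal sketch of the standard Gaussian mechanism calibrated for zCDP. It is not the paper's algorithm. It relies only on two well-known facts: adding Gaussian noise with standard deviation Δ/√(2ρ) to a quantity of L2 sensitivity Δ satisfies ρ-zCDP, and zCDP budgets add up under composition. The function names and the even budget split across rounds are illustrative assumptions.

```python
import math
import random


def gaussian_sigma_for_zcdp(l2_sensitivity: float, rho: float) -> float:
    """Noise standard deviation for which the Gaussian mechanism is rho-zCDP."""
    if l2_sensitivity <= 0 or rho <= 0:
        raise ValueError("sensitivity and rho must be positive")
    return l2_sensitivity / math.sqrt(2.0 * rho)


def split_budget(rho_total: float, rounds: int) -> float:
    """Even split of a total zCDP budget (assumption: equal share per round).

    zCDP composes additively, so `rounds` releases at rho_total / rounds
    each spend rho_total overall.
    """
    if rounds <= 0:
        raise ValueError("rounds must be positive")
    return rho_total / rounds


def noisy_release(values, l2_sensitivity, rho, rng):
    """Add independent Gaussian noise to every coordinate of `values`."""
    sigma = gaussian_sigma_for_zcdp(l2_sensitivity, rho)
    return [v + rng.gauss(0.0, sigma) for v in values]


if __name__ == "__main__":
    rng = random.Random(0)
    rho_per_round = split_budget(rho_total=1.0, rounds=4)
    print(noisy_release([0.0, 0.0, 0.0], 1.0, rho_per_round, rng))
```

A recursive framework would spend one such share per level of recursion; how the paper actually allocates its budget is not stated in this summary.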

We study the problem of differentially private second moment estimation and present a new algorithm that achieves strong privacy-utility trade-offs even for worst-case inputs, under subsamplability assumptions on the data. We call an input $(m,\alpha,\beta)$-subsamplable if a random subsample of size $m$ (or larger) preserves, with probability $\geq 1-\beta$, the spectral structure of the original second moment matrix up to a multiplicative factor of $1\pm\alpha$. Building upon subsamplability, we give a recursive algorithmic framework, similar to Kamath et al. (2019), that abides by zero-Concentrated Differential Privacy (zCDP) while preserving, with high probability, the accuracy of the second moment estimation up to an arbitrary factor of $(1\pm\gamma)$. We then show how to apply our algorithm to approximate the second moment matrix of a distribution $\mathcal{D}$, even when a noticeable fraction of the input consists of outliers.
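The subsamplability condition can be probed empirically. The sketch below is non-private and is not the paper's procedure. It reads "preserves the spectral structure up to $1\pm\alpha$" as the requirement that all eigenvalues of $\Sigma^{-1/2}\Sigma_S\Sigma^{-1/2}$ lie in $[1-\alpha, 1+\alpha]$, where $\Sigma$ is the second moment matrix of the full input and $\Sigma_S$ that of the subsample. That reading, the sampling without replacement, the requirement that $\Sigma$ be invertible, and all function names are assumptions made for illustration.

```python
import numpy as np


def second_moment(X: np.ndarray) -> np.ndarray:
    """Empirical second moment matrix (1/n) * sum_i x_i x_i^T."""
    return X.T @ X / X.shape[0]


def spectral_ratio_bounds(sigma_full: np.ndarray, sigma_sub: np.ndarray):
    """Smallest and largest eigenvalue of Sigma^{-1/2} Sigma_S Sigma^{-1/2}.

    Assumes sigma_full is symmetric positive definite.
    """
    evals, evecs = np.linalg.eigh(sigma_full)
    inv_sqrt = evecs @ np.diag(evals ** -0.5) @ evecs.T
    whitened = inv_sqrt @ sigma_sub @ inv_sqrt
    w = np.linalg.eigvalsh(whitened)  # eigenvalues in ascending order
    return float(w[0]), float(w[-1])


def estimate_failure_rate(X, m, alpha, trials=200, seed=0):
    """Fraction of size-m subsamples that violate the (1 +/- alpha) bound.

    This is a Monte Carlo estimate of beta for the given m and alpha.
    """
    rng = np.random.default_rng(seed)
    sigma_full = second_moment(X)
    failures = 0
    for _ in range(trials):
        idx = rng.choice(X.shape[0], size=m, replace=False)
        lo, hi = spectral_ratio_bounds(sigma_full, second_moment(X[idx]))
        if lo < 1 - alpha or hi > 1 + alpha:
            failures += 1
    return failures / trials


if __name__ == "__main__":
    data = np.random.default_rng(1).standard_normal((5000, 4))
    for m in (50, 500, 2500):
        print(m, estimate_failure_rate(data, m, alpha=0.2))
```

On well-behaved data the estimated failure rate should shrink as $m$ grows; an input dominated by a few extreme points would need a much larger $m$ to reach the same $\alpha$ and $\beta$.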

