Construction of Differentially Private Empirical Distributions from a low-order Marginals Set through Solving Linear Equations with l2 Regularization
This work addresses the efficient construction of differentially private data distributions for data analysts. The contribution appears incremental, since it builds on existing differential privacy methods and focuses on computational improvements.
The paper tackles the problem of generating differentially private empirical joint distributions from low-order marginals. It introduces the CIPHER algorithm, which has lower computational storage and memory requirements than full-dimensional histogram (FDH) sanitization. Experiments show that CIPHER outperforms the multiplicative weighting exponential mechanism in preserving original information and offers similar or superior cost-normalized utility to FDH sanitization at the same privacy budget.
We introduce a new algorithm, Construction of dIfferentially Private Empirical Distributions from a low-order marginals set tHrough solving linear Equations with l2 Regularization (CIPHER), that produces differentially private empirical joint distributions from a set of low-order marginals. CIPHER is conceptually simple and requires no more than decomposing joint probabilities via basic probability rules to construct a linear equation set and subsequently solving the equations. Compared to full-dimensional histogram (FDH) sanitization, CIPHER has drastically lower requirements on computational storage and memory, which is practically attractive, especially considering that the high-order signals preserved by FDH sanitization are likely just sample randomness and rarely of interest. Our experiments demonstrate that CIPHER outperforms the multiplicative weighting exponential mechanism in preserving original information and has similar or superior cost-normalized utility to FDH sanitization at the same privacy budget.
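The general idea of recovering a joint distribution from sanitized low-order marginals by solving l2-regularized linear equations can be illustrated with a minimal sketch. This is not the authors' CIPHER construction: it assumes binary attributes, uses only the sum rule (each marginal cell equals the sum of the joint cells that agree with it) to build the equations, and sanitizes the marginal counts with the Laplace mechanism under an even split of the privacy budget. The function names (`marginal_matrix`, `noisy_marginals`, `solve_ridge`), the sensitivity value of 2, the treatment of the sample size as public, and the clip-and-renormalize post-processing are all assumptions made for illustration.

```python
import itertools

import numpy as np


def marginal_matrix(n_vars, subsets):
    """Build A such that A @ p lists every cell of every requested marginal.

    p is the flattened joint distribution over n_vars binary attributes.
    """
    cells = list(itertools.product([0, 1], repeat=n_vars))
    rows = []
    for subset in subsets:
        for values in itertools.product([0, 1], repeat=len(subset)):
            # Sum rule: a marginal cell sums the joint cells that agree with it.
            rows.append([float(all(cell[v] == x for v, x in zip(subset, values)))
                         for cell in cells])
    return np.array(rows)


def noisy_marginals(data, subsets, epsilon, rng):
    """Laplace-sanitized marginal proportions, in the same row order as A.

    Assumptions: the budget is split evenly across marginals, neighbouring
    datasets differ by replacing one record (L1 sensitivity 2 per marginal
    histogram), and the sample size n is public.
    """
    n = data.shape[0]
    eps_each = epsilon / len(subsets)
    noisy = []
    for subset in subsets:
        for values in itertools.product([0, 1], repeat=len(subset)):
            count = np.sum(np.all(data[:, list(subset)] == values, axis=1))
            noisy.append(count + rng.laplace(loc=0.0, scale=2.0 / eps_each))
    return np.array(noisy) / n


def solve_ridge(A, b, lam):
    """Solve min_p ||A p - b||^2 + lam * ||p||^2, then project to a distribution."""
    k = A.shape[1]
    p = np.linalg.solve(A.T @ A + lam * np.eye(k), A.T @ b)
    p = np.clip(p, 0.0, None)  # post-processing: remove negative cells
    total = p.sum()
    return p / total if total > 0 else np.full(k, 1.0 / k)


rng = np.random.default_rng(0)
data = rng.integers(0, 2, size=(1000, 3))   # toy data: 3 binary attributes
subsets = [(0, 1), (1, 2), (0, 2)]          # all two-way marginals
A = marginal_matrix(3, subsets)
b = noisy_marginals(data, subsets, epsilon=1.0, rng=rng)
p_hat = solve_ridge(A, b, lam=1e-3)
print(p_hat.round(3))
```

In this toy setting the two-way marginals cannot determine the three-way interaction, so the unregularized system is rank-deficient; the l2 penalty is what selects a unique solution. Because clipping and renormalizing operate only on the sanitized output, they do not consume additional privacy budget.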