CR | Apr 1

Preserving Target Distributions With Differentially Private Count Mechanisms

arXiv:2604.01468 | 32.2 | h-index: 7

AI Analysis

This work addresses how to preserve target distributions when publishing privacy-sensitive data, a known bottleneck for researchers and practitioners. It is an incremental improvement that contributes a novel method.

The paper tackles statistical biases in differentially private count tables. It introduces a two-stage framework that balances accuracy of distribution against accuracy of counts and runtime, together with a new mechanism, the cyclic Laplace, that outperforms existing general-purpose differentially private histogram mechanisms.

Differentially private mechanisms are increasingly used to publish tables of counts, where each entry represents the number of individuals belonging to a particular category. A distribution of counts summarizes the information in the count column, unlinking counts from categories. This object is useful for answering a class of research questions, but it is subject to statistical biases when counts are privatized with standard mechanisms. This motivates a novel design criterion we term accuracy of distribution. This study formalizes a two-stage framework for privatizing tables of counts that balances accuracy of distribution with two standard criteria: accuracy of counts and runtime. In the first stage, a distribution privatizer generates an estimate of the true distribution of counts. We introduce a new mechanism, called the cyclic Laplace, specifically tailored to distributions of counts, that outperforms existing general-purpose differentially private histogram mechanisms. In the second stage, a constructor algorithm generates a count mechanism, represented as a transition matrix, whose fixed point is the privatized distribution of counts. We develop a mathematical theory that describes such transition matrices in terms of simple building blocks we call epsilon-scales. This theory informs the design of a new constructor algorithm that generates transition matrices with favorable properties more efficiently than standard optimization algorithms. We explore the practicality of our framework with a set of experiments, highlighting situations in which a fixed-point method provides a favorable tradeoff among performance criteria.
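
To make the fixed-point idea concrete, the following is a minimal sketch and not the paper's method. It assumes counts bounded in 0..n and uses the well-known truncated geometric mechanism as a stand-in count mechanism; it implements neither the cyclic Laplace nor the epsilon-scale constructor, whose details the abstract does not give. All function names are hypothetical. The sketch writes the mechanism as a row-stochastic transition matrix, checks the epsilon-DP ratio bound between adjacent counts, and computes the distribution of counts that the matrix leaves unchanged.

```python
import numpy as np

def count_distribution(counts, n):
    """Share of categories holding each count value 0..n (categories unlinked)."""
    counts = np.asarray(counts)
    return np.bincount(counts, minlength=n + 1) / counts.size

def truncated_geometric_matrix(n, eps):
    """M[i, j] = P(release j | true count i): two-sided geometric noise,
    clamped to [0, n]. A stand-in mechanism, assumed for illustration."""
    alpha = np.exp(-eps)
    idx = np.arange(n + 1)
    dist = np.abs(idx[:, None] - idx[None, :])
    M = (1 - alpha) / (1 + alpha) * alpha ** dist
    # Clamping piles the noise tails onto the boundary outputs 0 and n.
    M[:, 0] = alpha ** idx / (1 + alpha)
    M[:, n] = alpha ** (n - idx) / (1 + alpha)
    return M

def satisfies_dp(M, eps, rtol=1e-9):
    """Check M[i, j] <= e^eps * M[i+1, j] (and the reverse) for adjacent counts."""
    bound = np.exp(eps) * (1 + rtol)
    upper, lower = M[:-1], M[1:]
    return bool(np.all(upper <= bound * lower) and np.all(lower <= bound * upper))

def fixed_point(M):
    """Distribution pi with pi @ M == pi, found by least squares on
    (M^T - I) pi = 0 together with sum(pi) = 1."""
    k = M.shape[0]
    A = np.vstack([M.T - np.eye(k), np.ones((1, k))])
    b = np.zeros(k + 1)
    b[-1] = 1.0
    pi, *_ = np.linalg.lstsq(A, b, rcond=None)
    return pi

if __name__ == "__main__":
    n, eps = 10, 1.0
    M = truncated_geometric_matrix(n, eps)
    p = count_distribution([0, 0, 0, 1, 1, 2, 5, 10], n)
    pi = fixed_point(M)
    print("DP bound holds:", satisfies_dp(M, eps))
    print("shift of a skewed distribution:", np.abs(p @ M - p).max())
    print("shift of the fixed point:", np.abs(pi @ M - pi).max())
```

In this sketch the skewed distribution `p` is distorted in expectation by the mechanism, while the fixed point `pi` is not. The framework described above works in the opposite direction: it starts from a privatized target distribution and constructs a transition matrix whose fixed point is that target.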
