TH Feb 11, 2022
Information Design for Differential Privacy
Ian M. Schmutte, Nathan Yoder
Firms and statistical agencies must protect the privacy of the individuals whose data they collect, analyze, and publish. Increasingly, these organizations do so by using publication mechanisms that satisfy differential privacy. We consider the problem of choosing such a mechanism so as to maximize the value of its output to end users. We show that mechanisms that add noise to the statistic of interest (like most of those used in practice) are generally not optimal when the statistic is a sum or average of magnitude data (e.g., income). However, we also show that adding noise is always optimal when the statistic is a count of data entries with a certain characteristic, and the underlying database is drawn from a symmetric distribution (e.g., if individuals' data are i.i.d.). When, in addition, data users have supermodular payoffs, we show that the simple geometric mechanism is always optimal by using a novel comparative static that ranks information structures according to their usefulness in supermodular decision problems.
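For readers unfamiliar with the geometric mechanism named in this abstract, the following is a minimal sketch of its standard textbook form for a count query with sensitivity 1. It is not taken from the paper: the function names, the `epsilon` parameter, and the sampling approach (difference of two one-sided geometric draws) are illustrative assumptions.

```python
import math
import random


def geometric_noise(epsilon, rng=random):
    """Draw two-sided geometric noise Z with P(Z = z) proportional to exp(-epsilon * |z|).

    Assumption: the query is a count with sensitivity 1, so alpha = exp(-epsilon).
    """
    if epsilon <= 0:
        raise ValueError("epsilon must be positive")
    alpha = math.exp(-epsilon)

    def one_sided():
        # Inverse-transform sample of a geometric variable on {0, 1, 2, ...}
        # with P(G >= k) = alpha ** k. u lies in (0, 1], so log(u) is defined.
        u = 1.0 - rng.random()
        return math.floor(math.log(u) / math.log(alpha))

    # The difference of two i.i.d. one-sided geometric draws is two-sided geometric.
    return one_sided() - one_sided()


def geometric_mechanism(true_count, epsilon, rng=random):
    """Release a noisy integer count (hypothetical helper, no truncation applied)."""
    return true_count + geometric_noise(epsilon, rng)
```

A call such as `geometric_mechanism(120, epsilon=0.5)` returns an integer near 120. The result can fall below zero or above the database size, because this sketch does not truncate the output to the feasible range.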
TH Jun 21, 2019
Suboptimal Provision of Privacy and Statistical Accuracy When They are Public Goods
John M. Abowd, Ian M. Schmutte, William Sexton et al.
With vast databases at their disposal, private tech companies can compete with public statistical agencies to provide population statistics. However, private companies face different incentives to provide high-quality statistics and to protect the privacy of the people whose data are used. When both privacy protection and statistical accuracy are public goods, private providers tend to produce at least one suboptimally, but it is not clear which. We model a firm that publishes statistics under a guarantee of differential privacy. We prove that provision by the private firm results in inefficiently low data quality in this framework.
CR Aug 20, 2018
An Economic Analysis of Privacy Protection and Statistical Accuracy as Social Choices
John M. Abowd, Ian M. Schmutte
Statistical agencies face a dual mandate to publish accurate statistics while protecting respondent privacy. Increasing privacy protection requires decreased accuracy. Recognizing this as a resource allocation problem, we propose an economic solution: operate where the marginal cost of increasing privacy equals the marginal benefit. Our model of production, from computer science, assumes data are published using an efficient differentially private algorithm. Optimal choice weighs the demand for accurate statistics against the demand for privacy. Examples from U.S. statistical programs show how our framework can guide decision-making. Further progress requires a better understanding of willingness-to-pay for privacy and statistical accuracy.