CR · LO · PL · Oct 9, 2019

Deciding Differential Privacy for Programs with Finite Inputs and Outputs

arXiv:1910.04137v2 · 27 citations
Originality: Highly original
AI Analysis

This work addresses the need for automated and complete formal verification of differential privacy in programs, which is crucial for ensuring privacy guarantees in data analysis applications.

The authors propose the first decision procedure for checking the differential privacy of a non-trivial class of probabilistic programs. Given a program parametrized by a privacy budget ε, the procedure either proves privacy for all possible values of ε or generates a counterexample, and it applies to both ε-differential privacy and (ε,δ)-differential privacy. They implemented the procedure and used it to prove or disprove privacy bounds for well-known examples such as randomized response, histogram, report noisy max, and sparse vector.

Differential privacy is a de facto standard for statistical computations over databases that contain private data. The strength of differential privacy lies in a rigorous mathematical definition that guarantees individual privacy and yet allows for accurate statistical results. Thanks to its mathematical definition, differential privacy is also a natural target for formal analysis. A broad line of work uses logical methods for proving privacy. However, these methods are not complete, and only partially automated. A recent and complementary line of work uses statistical methods for finding privacy violations. However, the methods only provide statistical guarantees (but no proofs). We propose the first decision procedure for checking the differential privacy of a non-trivial class of probabilistic computations. Our procedure takes as input a program P parametrized by a privacy budget $ε$, and either proves differential privacy for all possible values of $ε$ or generates a counterexample. In addition, our procedure applies both to $ε$-differential privacy and $(ε,δ)$-differential privacy. Technically, the decision procedure is based on a novel and judicious encoding of the semantics of programs in our class into a decidable fragment of the first-order theory of the reals with exponentiation. We implement our procedure and use it for (dis)proving privacy bounds for many well-known examples, including randomized response, histogram, report noisy max and sparse vector.
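To make the property being decided concrete, here is a minimal sketch of a brute-force check of $(ε,δ)$-differential privacy for a mechanism with finite inputs and outputs, applied to binary randomized response. It is not the paper's decision procedure: the paper reasons symbolically about all values of $ε$ at once, whereas this sketch tests one numeric $ε$ at a time using floating-point arithmetic. The function names `randomized_response` and `find_violation`, the tolerance `tol`, and the choice to treat every pair of distinct inputs as adjacent are assumptions made for illustration.

```python
import math
from itertools import permutations


def randomized_response(eps):
    """Binary randomized response as a table {input: {output: probability}}.

    The true bit is reported with probability e^eps / (1 + e^eps)
    and flipped otherwise.
    """
    keep = math.exp(eps) / (1.0 + math.exp(eps))
    return {b: {b: keep, 1 - b: 1.0 - keep} for b in (0, 1)}


def find_violation(dist, eps, delta=0.0, tol=1e-12):
    """Search for a violation of (eps, delta)-differential privacy.

    `dist` maps each input to its output distribution. Assumption: every
    pair of distinct inputs is adjacent. For a finite output space, the
    worst-case output set collects the outputs o with
    P[a -> o] > e^eps * P[b -> o], so it suffices to sum the positive
    excess over all outputs and compare it with delta.

    Returns (a, b, excess) for the first violating pair, or None.
    """
    bound = math.exp(eps)
    for a, b in permutations(dist, 2):
        outputs = set(dist[a]) | set(dist[b])
        excess = sum(
            max(0.0, dist[a].get(o, 0.0) - bound * dist[b].get(o, 0.0))
            for o in outputs
        )
        if excess > delta + tol:  # tol absorbs floating-point rounding
            return (a, b, excess)
    return None


if __name__ == "__main__":
    # Randomized response calibrated to eps = 0.5 meets the 0.5 bound.
    print(find_violation(randomized_response(0.5), 0.5))
    # A mechanism calibrated to eps = 1.0 does not meet the tighter 0.5 bound.
    print(find_violation(randomized_response(1.0), 0.5))
```

A returned tuple plays the role of a counterexample: it names two adjacent inputs and the amount by which the bound is exceeded. Because the check is numeric and per-$ε$, a result of `None` says nothing about other values of $ε$, which is the gap a symbolic decision procedure closes.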
