CRApr 29, 2021
On Linear Time Decidability of Differential Privacy for Programs with Unbounded InputsRohit Chadha, A. Prasad Sistla, Mahesh Viswanathan
We introduce an automata model for describing interesting classes of differential privacy mechanisms/algorithms that include known mechanisms from the literature. These automata can model algorithms whose inputs can be an unbounded sequence of real-valued query answers. We consider the problem of checking whether there exists a constant $d$ such that the algorithm described by these automata are $dε$-differentially private for all positive values of the privacy budget parameter $ε$. We show that this problem can be decided in time linear in the automaton's size by identifying a necessary and sufficient condition on the underlying graph of the automaton. This paper's results are the first decidability results known for algorithms with an unbounded number of query answers taking values from the set of reals.
CRNov 12, 2020
Deciding Accuracy of Differential Privacy SchemesGilles Barthe, Rohit Chadha, Paul Krogmeier et al.
Differential privacy is a mathematical framework for developing statistical computations with provable guarantees of privacy and accuracy. In contrast to the privacy component of differential privacy, which has a clear mathematical and intuitive meaning, the accuracy component of differential privacy does not have a generally accepted definition; accuracy claims of differential privacy algorithms vary from algorithm to algorithm and are not instantiations of a general definition. We identify program discontinuity as a common theme in existing \emph{ad hoc} definitions and introduce an alternative notion of accuracy parametrized by, what we call, {\distance} -- the {\distance} of an input $x$ w.r.t., a deterministic computation $f$ and a distance $d$, is the minimal distance $d(x,y)$ over all $y$ such that $f(y)\neq f(x)$. We show that our notion of accuracy subsumes the definition used in theoretical computer science, and captures known accuracy claims for differential privacy algorithms. In fact, our general notion of accuracy helps us prove better claims in some cases. Next, we study the decidability of accuracy. We first show that accuracy is in general undecidable. Then, we define a non-trivial class of probabilistic computations for which accuracy is decidable (unconditionally, or assuming Schanuel's conjecture). We implement our decision procedure and experimentally evaluate the effectiveness of our approach for generating proofs or counterexamples of accuracy for common algorithms from the literature.
CROct 9, 2019
Deciding Differential Privacy for Programs with Finite Inputs and OutputsGilles Barthe, Rohit Chadha, Vishal Jagannath et al.
Differential privacy is a de facto standard for statistical computations over databases that contain private data. The strength of differential privacy lies in a rigorous mathematical definition that guarantees individual privacy and yet allows for accurate statistical results. Thanks to its mathematical definition, differential privacy is also a natural target for formal analysis. A broad line of work uses logical methods for proving privacy. However, these methods are not complete, and only partially automated. A recent and complementary line of work uses statistical methods for finding privacy violations. However, the methods only provide statistical guarantees (but no proofs). We propose the first decision procedure for checking the differential privacy of a non-trivial class of probabilistic computations. Our procedure takes as input a program P parametrized by a privacy budget $ε$, and either proves differential privacy for all possible values of $ε$ or generates a counterexample. In addition, our procedure applies both to $ε$-differential privacy and $(ε,δ)$-differential privacy. Technically, the decision procedure is based on a novel and judicious encoding of the semantics of programs in our class into a decidable fragment of the first-order theory of the reals with exponentiation. We implement our procedure and use it for (dis)proving privacy bounds for many well-known examples, including randomized response, histogram, report noisy max and sparse vector.