CRDBITFeb 14, 2014

A Minimax Distortion View of Differentially Private Query Release

arXiv:1402.3384v23 citations
Originality Highly original
AI Analysis

This work addresses the challenge of providing accurate and private query answers without pre-specifying query sets, which is incremental but offers specific gains for database privacy applications.

The paper tackles the problem of differentially private query release by proposing a query-set independent mechanism that encodes stochastic structure into synthetic databases and uses companion estimators to provide accurate answers for all queries in a general class, achieving a minimax distortion of O(1/n) with squared-error measure and showing experimental improvements over the state-of-the-art MWEM algorithm on real datasets.

We consider the problem of differentially private query release through a synthetic database approach. Departing from the existing approaches that require the query set to be specified in advance, we advocate to devise query-set independent mechanisms, with an ambitious goal of providing accurate answers, while meeting the privacy constraints, for all queries in a general query class. Specifically, a differentially private mechanism is constructed to "encode" rich stochastic structure into the synthetic database, and "customized" companion estimators are then derived to provide accurate answers by making use of all available information, including the mechanism (which is public information) and the query functions. Accordingly, the distortion under the best of this kind of mechanisms at the worst-case query in a general query class, so called the minimax distortion, provides a fundamental characterization of differentially private query release. For the general class of statistical queries, we prove that with the squared-error distortion measure, the minimax distortion is $O(1/n)$ by deriving asymptotically tight upper and lower bounds in the regime that the database size $n$ goes to infinity. The upper bound is achievable by a mechanism $\mathcal{E}$ and its corresponding companion estimators, which points directly to the feasibility of the proposed approach in large databases. We further evaluate the mechanism $\mathcal{E}$ and the companion estimators through experiments on real datasets from Netflix and Facebook. Experimental results show improvement over the state-of-art MWEM algorithm and verify the scaling behavior $O(1/n)$ of the minimax distortion.

Foundations

The foundational work for this paper's niche, ranked by how specifically the neighbourhood builds on it — not by global fame.

Your Notes