CRMay 10, 2019

Practical Differentially Private Top-$k$ Selection with Pay-what-you-get Composition

arXiv:1905.04273v299 citations
Originality Highly original
AI Analysis

This work addresses the challenge of making differential privacy practical for real-world applications where data domains are unknown, offering a solution for scenarios like frequent itemset mining without prior domain specification.

The paper tackles the problem of selecting top-k items from a large, unknown data domain under user-level differential privacy, developing algorithms that only require access to the true top elements without domain knowledge and achieve approximate privacy with a pay-what-you-get composition bound.

We study the problem of top-$k$ selection over a large domain universe subject to user-level differential privacy. Typically, the exponential mechanism or report noisy max are the algorithms used to solve this problem. However, these algorithms require querying the database for the count of each domain element. We focus on the setting where the data domain is unknown, which is different than the setting of frequent itemsets where an apriori type algorithm can help prune the space of domain elements to query. We design algorithms that ensures (approximate) $(ε,δ>0)$-differential privacy and only needs access to the true top-$\bar{k}$ elements from the data for any chosen $\bar{k} \geq k$. This is a highly desirable feature for making differential privacy practical, since the algorithms require no knowledge of the domain. We consider both the setting where a user's data can modify an arbitrary number of counts by at most 1, i.e. unrestricted sensitivity, and the setting where a user's data can modify at most some small, fixed number of counts by at most 1, i.e. restricted sensitivity. Additionally, we provide a pay-what-you-get privacy composition bound for our algorithms. That is, our algorithms might return fewer than $k$ elements when the top-$k$ elements are queried, but the overall privacy budget only decreases by the size of the outcome set.

Code Implementations1 repo
Foundations

The foundational work for this paper's niche, ranked by how specifically the neighbourhood builds on it — not by global fame.

Your Notes