Boris Škorić

CR
4 papers
19 citations
Novelty 49%
AI Score 22

4 Papers

CR · Aug 30, 2020
Data Sanitisation Protocols for the Privacy Funnel with Differential Privacy Guarantees

Milan Lopuhaä-Zwakenberg, Haochen Tong, Boris Škorić

In the Open Data approach, governments and other public organisations want to share their datasets with the public, for accountability and to support participation. Data must be opened in such a way that individual privacy is safeguarded. The Privacy Funnel is a mathematical approach that produces a sanitised database that does not leak private data beyond a chosen threshold. The downsides to this approach are that it does not give worst-case privacy guarantees, and that finding optimal sanitisation protocols can be computationally prohibitive. We tackle these problems by using differential privacy metrics, and by considering local protocols which operate on one entry at a time. We show that under both the Local Differential Privacy and Local Information Privacy leakage metrics, one can efficiently obtain optimal protocols. Furthermore, Local Information Privacy is both more closely aligned to the privacy requirements of the Privacy Funnel scenario, and more efficiently computable. We also consider the scenario where each user has multiple attributes, for which we define Side-channel Resistant Local Information Privacy, and we give efficient methods to find protocols satisfying this criterion while still offering good utility. Finally, we introduce Conditional Reporting, an explicit LIP protocol that can be used when the optimal protocol is infeasible to compute. Experiments on real-world and synthetic data confirm the validity of these methods.

CR · Nov 24, 2019
Improving Frequency Estimation under Local Differential Privacy

Milan Lopuhaä-Zwakenberg, Zitao Li, Boris Škorić et al.

Local Differential Privacy protocols are stochastic protocols used in data aggregation when individual users do not trust the data aggregator with their private data. In such protocols there is a fundamental tradeoff between user privacy and aggregator utility. In the setting of frequency estimation, established bounds on this tradeoff are either nonquantitative, or far from what is known to be attainable. In this paper, we use information-theoretical methods to significantly improve established bounds. We also show that the new bounds are attainable for binary inputs. Furthermore, our methods lead to improved frequency estimators, which we experimentally show to outperform state-of-the-art methods.
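As a rough illustration of the frequency-estimation setting for binary inputs, the sketch below uses classical randomized response, a standard ε-LDP protocol. It is not the improved estimator from the paper. The function names `randomized_response` and `estimate_frequency` are hypothetical, chosen only for this example.

```python
import math
import random


def randomized_response(bit, epsilon, rng):
    """Report the true bit with probability e^eps / (e^eps + 1), otherwise flip it.

    This is the classical binary randomized response mechanism, which
    satisfies eps-LDP. It is used here as an assumption-free baseline,
    not as the protocol studied in the paper.
    """
    p_truth = math.exp(epsilon) / (math.exp(epsilon) + 1.0)
    return bit if rng.random() < p_truth else 1 - bit


def estimate_frequency(reports, epsilon):
    """Unbiased estimate of the fraction of users whose true bit is 1."""
    p = math.exp(epsilon) / (math.exp(epsilon) + 1.0)
    observed = sum(reports) / len(reports)
    # E[observed] = f * p + (1 - f) * (1 - p); invert this affine map for f.
    return (observed - (1.0 - p)) / (2.0 * p - 1.0)


if __name__ == "__main__":
    rng = random.Random(0)
    epsilon = 1.0
    true_bits = [1] * 30000 + [0] * 70000  # true frequency of 1s is 0.3
    reports = [randomized_response(b, epsilon, rng) for b in true_bits]
    print(estimate_frequency(reports, epsilon))
```

A smaller `epsilon` moves `p_truth` towards 1/2, which strengthens privacy but inflates the variance of the estimate. This is the privacy-utility tradeoff the abstract refers to.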

CR · Oct 17, 2019
Information-theoretic metrics for Local Differential Privacy protocols

Milan Lopuhaä-Zwakenberg, Boris Škorić, Ninghui Li

Local Differential Privacy (LDP) protocols allow an aggregator to obtain population statistics about sensitive data of a userbase, while protecting the privacy of the individual users. To understand the tradeoff between aggregator utility and user privacy, we introduce new information-theoretic metrics for utility and privacy. Contrary to other LDP metrics, these metrics highlight the fact that the users and the aggregator are interested in fundamentally different domains of information. We show how our metrics relate to ε-LDP, the de facto standard privacy metric, giving an information-theoretic interpretation to the latter. Furthermore, we use our metrics to quantitatively study the privacy-utility tradeoff for a number of popular protocols.
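For reference, the standard ε-LDP condition requires P(y | x) ≤ e^ε · P(y | x') for every output y and every pair of inputs x, x'. The sketch below computes the smallest such ε for a protocol given as a finite channel matrix. It illustrates only this standard definition, not the information-theoretic metrics introduced in the paper, and the function name `ldp_epsilon` is hypothetical.

```python
import math


def ldp_epsilon(channel):
    """Return the smallest eps for which the channel is eps-LDP.

    channel[x][y] is assumed to hold P(output y | input x), with each row
    summing to 1. The result is math.inf if some output is possible for
    one input but impossible for another.
    """
    n_outputs = len(channel[0])
    eps = 0.0
    for y in range(n_outputs):
        column = [row[y] for row in channel]
        hi, lo = max(column), min(column)
        if hi == 0.0:
            continue  # this output is never produced, so it imposes no constraint
        if lo == 0.0:
            return math.inf  # the likelihood ratio is unbounded
        eps = max(eps, math.log(hi / lo))
    return eps


if __name__ == "__main__":
    # Binary randomized response that tells the truth with probability 0.75.
    print(ldp_epsilon([[0.75, 0.25], [0.25, 0.75]]))
```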

IT · Jul 15, 2019
Single-Component Privacy Guarantees in Helper Data Systems and Sparse Coding with Ambiguation

Behrooz Razeghi, Taras Stanko, Boris Škorić et al.

We investigate the privacy of two approaches to (biometric) template protection: Helper Data Systems and Sparse Ternary Coding with Ambiguization. In particular, we focus on a privacy property that is often overlooked, namely how much leakage exists about one specific binary property of one component of the feature vector. Examples of such a property are the sign of the component or an indicator that it exceeds a threshold. We provide evidence that both approaches are able to protect such sensitive binary variables, and discuss how system parameters need to be set.