CRDSAug 17, 2018

Cardinality Estimators do not Preserve Privacy

arXiv:1808.05879v354 citations
Originality Incremental advance
AI Analysis

This reveals a fundamental incompatibility between strong aggregation properties and privacy for cardinality estimators, impacting users in data analysis and privacy-sensitive applications.

The paper tackled the problem of whether cardinality estimators like HyperLogLog leak private information in privacy-sensitive contexts, showing that if they satisfy a privacy definition weaker than differential privacy, they cannot achieve reasonable accuracy, with significant average privacy loss even for large datasets.

Cardinality estimators like HyperLogLog are sketching algorithms that estimate the number of distinct elements in a large multiset. Their use in privacy-sensitive contexts raises the question of whether they leak private information. In particular, can they provide any privacy guarantees while preserving their strong aggregation properties? We formulate an abstract notion of cardinality estimators, that captures this aggregation requirement: one can merge sketches without losing precision. We propose an attacker model and a corresponding privacy definition, strictly weaker than differential privacy: we assume that the attacker has no prior knowledge of the data. We then show that if a cardinality estimator satisfies this definition, then it cannot have a reasonable level of accuracy. We prove similar results for weaker versions of our definition, and analyze the privacy of existing algorithms, showing that their average privacy loss is significant, even for multisets with large cardinalities. We conclude that strong aggregation requirements are incompatible with any reasonable definition of privacy, and that cardinality estimators should be considered as sensitive as raw data. We also propose risk mitigation strategies for their real-world applications.

Foundations

The foundational work for this paper's niche, ranked by how specifically the neighbourhood builds on it — not by global fame.

Your Notes