CRMar 14

Missing Mass for Differentially Private Domain Discovery

arXiv:2603.1401632.8h-index: 5
AI Analysis

This work addresses privacy-preserving data analysis for applications like recommendation systems, but it is incremental as it builds on existing mechanisms and applies them to new variants.

The paper tackles the problem of differentially private domain discovery, where users have subsets of items from an unknown domain, and aims to output informative subsets. It shows that the Weighted Gaussian Mechanism (WGM) achieves near-optimal missing mass guarantees on Zipfian data and distribution-free guarantees, and when used as a precursor, improves utility for private top-k and k-hitting set problems, with experiments showing competitive or superior performance.

We study several problems in differentially private domain discovery, where each user holds a subset of items from a shared but unknown domain, and the goal is to output an informative subset of items. For set union, we show that the simple baseline Weighted Gaussian Mechanism (WGM) has a near-optimal $\ell_1$ missing mass guarantee on Zipfian data as well as a distribution-free $\ell_\infty$ missing mass guarantee. We then apply the WGM as a domain-discovery precursor for existing known-domain algorithms for private top-$k$ and $k$-hitting set and obtain new utility guarantees for their unknown domain variants. Finally, experiments demonstrate that all of our WGM-based methods are competitive with or outperform existing baselines for all three problems.

Foundations

The foundational work for this paper's niche, ranked by how specifically the neighbourhood builds on it — not by global fame.

Your Notes