CR · LG · Sep 2, 2025

Managing Correlations in Data and Privacy Demand

arXiv:2509.02856v1 · 2 citations · h-index: 28 · CCS
Originality: Highly original
AI Analysis

This work addresses a limitation of privacy-preserving systems: in real-world applications, user data and privacy preferences are often correlated, and frameworks that assume they are independent do not account for this.

The authors study correlations between user data and privacy demands in differential privacy. They propose the Add-remove Heterogeneous Differential Privacy (AHDP) framework, which is robust to such correlations and requires no prior knowledge of them, and they apply its mechanisms to tasks such as mean estimation and linear regression.

Previous works in the differential privacy literature that allow users to choose their privacy levels typically operate under the heterogeneous differential privacy (HDP) framework, with the simplifying assumption that user data and privacy levels are not correlated. Firstly, we demonstrate that the standard HDP framework falls short when user data and privacy demands are allowed to be correlated. Secondly, to address this shortcoming, we propose an alternate framework, Add-remove Heterogeneous Differential Privacy (AHDP), that jointly accounts for user data and privacy preference. We show that AHDP is robust to possible correlations between data and privacy. Thirdly, we formalize the guarantees of the proposed AHDP framework through an operational hypothesis testing perspective. The hypothesis testing setup may be of independent interest in analyzing other privacy frameworks as well. Fourthly, we show that there exist non-trivial AHDP mechanisms that, notably, do not require prior knowledge of the data-privacy correlations. We propose some such mechanisms and apply them to core statistical tasks such as mean estimation, frequency estimation, and linear regression. The proposed mechanisms are simple to implement, with minimal assumptions and modeling requirements, making them attractive for real-world use. Finally, we empirically evaluate the proposed AHDP mechanisms, highlighting their trade-offs using LLM-generated synthetic datasets, which we release for future research.
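To make the role of data-privacy correlation concrete, here is a minimal Python sketch of one way it can affect mean estimation. It is an illustration under stated assumptions, not the paper's AHDP mechanism and not necessarily the failure mode the paper analyzes. The estimator `privacy_weighted_mean`, the helper `make_population`, the two privacy levels (0.1 and 1.0), and the rule that users with larger values demand stronger privacy are all hypothetical choices made for this example. The estimator weights each user in proportion to their privacy parameter and treats those parameters as fixed and public, which is the kind of simplification the abstract attributes to the HDP setting.

```python
import random


def privacy_weighted_mean(values, epsilons, rng=None):
    """Hypothetical HDP-style mean estimate for values assumed to lie in [0, 1].

    Each user is weighted in proportion to their epsilon, so users who
    demand stronger privacy (smaller epsilon) influence the result less.
    If rng is given, Laplace noise is added; otherwise the noise-free
    weighted mean is returned so the bias can be inspected directly.
    """
    total = sum(epsilons)
    weights = [e / total for e in epsilons]
    estimate = sum(w * x for w, x in zip(weights, values))
    if rng is not None:
        # Replacing user i's value moves the estimate by at most w_i, so a
        # Laplace scale of max_i(w_i / eps_i) bounds user i's loss by eps_i,
        # assuming the weights themselves are treated as public.
        scale = max(w / e for w, e in zip(weights, epsilons))
        # The difference of two Exp(1) draws is a standard Laplace draw.
        estimate += scale * (rng.expovariate(1.0) - rng.expovariate(1.0))
    return estimate


def make_population(n, correlated, rng):
    """Draw n users with values uniform on [0, 1] and epsilon in {0.1, 1.0}."""
    values, epsilons = [], []
    for _ in range(n):
        x = rng.random()
        if correlated:
            # Assumed correlation: larger values demand stronger privacy.
            eps = 0.1 if x > 0.5 else 1.0
        else:
            eps = rng.choice([0.1, 1.0])
        values.append(x)
        epsilons.append(eps)
    return values, epsilons


if __name__ == "__main__":
    rng = random.Random(0)
    for correlated in (False, True):
        values, epsilons = make_population(20000, correlated, rng)
        true_mean = sum(values) / len(values)
        estimate = privacy_weighted_mean(values, epsilons, rng)
        print(f"correlated={correlated}: true={true_mean:.3f} estimate={estimate:.3f}")
```

In this toy setup the privacy-weighted estimate tracks the true mean when privacy levels are assigned independently of the data, and is pulled toward the users who demand less privacy when the two are correlated. More noise or more samples does not remove that gap, since it comes from the weighting rather than from the added noise.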
