Local Information Privacy and Its Application to Privacy-Preserving Data Aggregation
This work addresses privacy protection in data aggregation without trusted third parties, offering incremental improvements over existing methods like LDP by leveraging prior knowledge.
The paper tackles the problem of privacy-preserving data aggregation by introducing local information privacy (LIP), which incorporates context-awareness to model adversary knowledge, and shows that LIP-based mechanisms achieve better utility-privacy tradeoffs than local differential privacy (LDP), with simulations confirming higher utility, especially with skewed prior distributions.
In this paper, we study local information privacy (LIP), and design LIP based mechanisms for statistical aggregation while protecting users' privacy without relying on a trusted third party. The notion of context-awareness is incorporated in LIP, which can be viewed as explicit modeling of the adversary's background knowledge. It enables the design of privacy-preserving mechanisms leveraging the prior distribution, which can potentially achieve a better utility-privacy tradeoff than context-free notions such as Local Differential Privacy (LDP). We present an optimization framework to minimize the mean square error in the data aggregation while protecting the privacy of each individual user's input data or a correlated latent variable while satisfying LIP constraints. Then, we study two different types of applications: (weighted) summation and histogram estimation and derive the optimal context-aware data perturbation parameters for each case, based on randomized response type of mechanism. We further compare the utility-privacy tradeoff between LIP and LDP and theoretically explain why the incorporation of prior knowledge enlarges feasible regions of the perturbation parameters, which thereby leads to higher utility. We also extend the LIP-based privacy mechanisms to the more general case when exact prior knowledge is not available. Finally, we validate our analysis by simulations using both synthetic and real-world data. Results show that our LIP-based privacy mechanism provides better utility-privacy tradeoffs than LDP, and the advantage of LIP is even more significant when the prior distribution is more skewed.