CRLGMLMay 22, 2023

Differential Privacy with Random Projections and Sign Random Projections

arXiv:2306.01751v211 citations
AI Analysis

This work addresses privacy preservation in data analysis for fields like machine learning and data mining, offering incremental improvements over existing DP techniques.

The paper tackles the challenge of improving differential privacy (DP) in machine learning applications by introducing algorithms based on random projections and sign random projections, with DP-SignOPORP outperforming existing methods under standard DP settings and iDP-SignRP achieving high utility under individual DP at small epsilon values.

In this paper, we develop a series of differential privacy (DP) algorithms from a family of random projections (RP) for general applications in machine learning, data mining, and information retrieval. Among the presented algorithms, iDP-SignRP is remarkably effective under the setting of ``individual differential privacy'' (iDP), based on sign random projections (SignRP). Also, DP-SignOPORP considerably improves existing algorithms in the literature under the standard DP setting, using ``one permutation + one random projection'' (OPORP), where OPORP is a variant of the celebrated count-sketch method with fixed-length binning and normalization. Without taking signs, among the DP-RP family, DP-OPORP achieves the best performance. Our key idea for improving DP-RP is to take only the signs, i.e., $sign(x_j) = sign\left(\sum_{i=1}^p u_i w_{ij}\right)$, of the projected data. The intuition is that the signs often remain unchanged when the original data ($u$) exhibit small changes (according to the ``neighbor'' definition in DP). In other words, the aggregation and quantization operations themselves provide good privacy protections. We develop a technique called ``smooth flipping probability'' that incorporates this intuitive privacy benefit of SignRPs and improves the standard DP bit flipping strategy. Based on this technique, we propose DP-SignOPORP which satisfies strict DP and outperforms other DP variants based on SignRP (and RP), especially when $ε$ is not very large (e.g., $ε= 5\sim10$). Moreover, if an application scenario accepts individual DP, then we immediately obtain an algorithm named iDP-SignRP which achieves excellent utilities even at small~$ε$ (e.g., $ε<0.5$).

Foundations

The foundational work for this paper's niche, ranked by how specifically the neighbourhood builds on it — not by global fame.

Your Notes