CR AIJul 29, 2024

Unleash the Power of Ellipsis: Accuracy-enhanced Sparse Vector Technique with Exponential Noise

Yuhan Liu, Sheng Wang, Yixuan Liu, Feifei Li, Hong Chen

arXiv:2407.20068v12.31 citationsh-index: 4

Originality Incremental advance

AI Analysis

This work addresses accuracy limitations in differential privacy mechanisms for adaptive data analysis, offering incremental improvements to a fundamental tool.

The paper tackles the problem of low query accuracy in the Sparse Vector Technique (SVT) for differential privacy, which arises from conservative privacy analysis leading to excessive noise injection. By providing a new privacy analysis that considers SVT's less informative nature, they identify exponential noise as optimal and develop correction methods, achieving improvements up to 50% in metrics like precision and recall.

The Sparse Vector Technique (SVT) is one of the most fundamental tools in differential privacy (DP). It works as a backbone for adaptive data analysis by answering a sequence of queries on a given dataset, and gleaning useful information in a privacy-preserving manner. Unlike the typical private query releases that directly publicize the noisy query results, SVT is less informative -- it keeps the noisy query results to itself and only reveals a binary bit for each query, indicating whether the query result surpasses a predefined threshold. To provide a rigorous DP guarantee for SVT, prior works in the literature adopt a conservative privacy analysis by assuming the direct disclosure of noisy query results as in typical private query releases. This approach, however, hinders SVT from achieving higher query accuracy due to an overestimation of the privacy risks, which further leads to an excessive noise injection using the Laplacian or Gaussian noise for perturbation. Motivated by this, we provide a new privacy analysis for SVT by considering its less informative nature. Our analysis results not only broaden the range of applicable noise types for perturbation in SVT, but also identify the exponential noise as optimal among all evaluated noises (which, however, is usually deemed non-applicable in prior works). The main challenge in applying exponential noise to SVT is mitigating the sub-optimal performance due to the bias introduced by noise distributions. To address this, we develop a utility-oriented optimal threshold correction method and an appending strategy, which enhances the performance of SVT by increasing the precision and recall, respectively. The effectiveness of our proposed methods is substantiated both theoretically and empirically, demonstrating significant improvements up to $50\%$ across evaluated metrics.

View on arXiv PDF

Similar