CR · Apr 2

Empirical Evaluation of Structured Synthetic Data Privacy Metrics: Novel experimental framework

arXiv:2512.16284 · 6.11 citations · h-index: 9
Predicted impact top 63% in CR · last 90 days
Originality: Synthesis-oriented
AI Analysis

For practitioners and researchers in synthetic data privacy, this framework provides a systematic way to benchmark privacy metrics, but the contribution is incremental as it primarily evaluates existing methods.

The paper proposes a framework for empirically evaluating synthetic data privacy metrics by deliberately inserting privacy risks. It applies the framework to existing methods that assume a no-box threat model, using public datasets, and finds that current metrics often fail to detect the inserted risks.

Synthetic data generation is gaining traction as a privacy enhancing technology (PET). When properly generated, synthetic data preserve the analytic utility of real data while avoiding the retention of information that would allow the identification of specific individuals. However, the concept of data privacy remains elusive, making it challenging for practitioners to evaluate and benchmark the degree of privacy protection offered by synthetic data. In this paper, we propose a framework to empirically assess the efficacy of tabular synthetic data privacy quantification methods through controlled, deliberate risk insertion. To demonstrate this framework, we survey existing approaches to synthetic data privacy quantification and the related legal theory. We then apply the framework to the main privacy quantification methods with no-box threat models on publicly available datasets.
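The idea of controlled risk insertion can be illustrated with a minimal sketch. It rests on assumptions not taken from the paper: the risk inserted here is verbatim copying of real records into the synthetic table, and the metric is distance to closest record (DCR), chosen only as a familiar example of a metric that needs no access to the generator. The data are random stand-ins, and the helper names `dcr` and `insert_risk` are hypothetical.

```python
import numpy as np


def dcr(synthetic, real):
    """Distance to closest record: for each synthetic row, the minimum
    Euclidean distance to any real row."""
    diffs = synthetic[:, None, :] - real[None, :, :]
    return np.sqrt((diffs ** 2).sum(axis=2)).min(axis=1)


def insert_risk(synthetic, real, n_leak, rng):
    """Hypothetical risk insertion: overwrite n_leak synthetic rows with
    verbatim copies of randomly chosen real rows."""
    leaked = synthetic.copy()
    syn_idx = rng.choice(len(synthetic), size=n_leak, replace=False)
    real_idx = rng.choice(len(real), size=n_leak, replace=False)
    leaked[syn_idx] = real[real_idx]
    return leaked


rng = np.random.default_rng(0)
real = rng.normal(size=(200, 5))       # stand-in for a real tabular dataset
synthetic = rng.normal(size=(200, 5))  # stand-in for a generator's output

leaked = insert_risk(synthetic, real, n_leak=20, rng=rng)

# Share of synthetic rows that exactly match some real row (DCR == 0),
# before and after the deliberate insertion.
baseline_share = float((dcr(synthetic, real) == 0).mean())
leaked_share = float((dcr(leaked, real) == 0).mean())

print(f"exact-match share without inserted risk: {baseline_share:.2f}")
print(f"exact-match share with inserted risk:    {leaked_share:.2f}")
```

A metric that is useful in this sense should respond to the inserted risk. Here the exact-match share moves from zero to the inserted fraction, which is the easy case. Subtler insertions, such as near-copies or leaked outliers, are where a metric is more likely to miss the risk.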
