LG · Nov 28, 2022

On the Utility Recovery Incapability of Neural Net-based Differential Private Tabular Training Data Synthesizer under Privacy Deregulation

arXiv:2211.15809v1 · 9 citations · h-index: 9
Originality: Incremental advance
AI Analysis

This work addresses a gap in auditing the privacy-utility tradeoff of generative models used for synthetic data, revealing an unexpected side effect of privacy deregulation that could affect the deployment of these models.

The study investigates the privacy-utility tradeoff of neural-net-based differentially private synthesizers for tabular training data, specifically DP-CTGAN and PATE-CTGAN. It finds that privacy deregulation does not always lead to utility recovery, which is a practical limitation of both synthesizers.

Devising procedures for auditing the privacy-utility tradeoff of generative models is an important yet unresolved problem in practice. Existing work concentrates on the side effect of privacy constraints, namely utility degradation under the train-on-synthetic, test-on-real paradigm of synthetic data training. We push the understanding of the privacy-utility tradeoff to the next level by observing the side effect of privacy deregulation on the utility of synthetic training data. Surprisingly, we discover the Utility Recovery Incapability of DP-CTGAN and PATE-CTGAN under privacy deregulation, which raises concerns about their practical application. The main message is that Privacy Deregulation does NOT always imply Utility Recovery.
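The train-on-synthetic, test-on-real audit mentioned in the abstract can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration and not the paper's setup: `toy_synthesizer` is a hypothetical noise-injecting stand-in (it is neither DP-CTGAN nor PATE-CTGAN and carries no formal privacy guarantee), the data are two Gaussian blobs, the downstream model is a nearest-centroid classifier, and privacy deregulation is assumed to mean a larger privacy budget `epsilon`. The sketch only shows the shape of the procedure: sweep `epsilon`, train on synthetic data, and score on held-out real data.

```python
import random
from statistics import fmean


def make_real_data(n_per_class, rng):
    """Toy 'real' table: class 0 around (0, 0), class 1 around (3, 3)."""
    rows = []
    for label, center in ((0, 0.0), (1, 3.0)):
        for _ in range(n_per_class):
            features = (rng.gauss(center, 1.0), rng.gauss(center, 1.0))
            rows.append((features, label))
    return rows


def class_means(rows):
    """Per-class mean feature vector, keyed by label."""
    means = {}
    for label in sorted({y for _, y in rows}):
        points = [x for x, y in rows if y == label]
        means[label] = tuple(fmean(p[d] for p in points) for d in range(2))
    return means


def laplace(scale, rng):
    """Laplace(0, scale) sample as the difference of two exponentials."""
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)


def toy_synthesizer(real_rows, epsilon, n_per_class, rng):
    """Hypothetical stand-in: perturb class means with noise of scale
    1/epsilon, then sample synthetic rows around the noisy means.
    Not a real DP mechanism and not the synthesizers from the paper."""
    synthetic = []
    for label, mean in class_means(real_rows).items():
        noisy_mean = tuple(m + laplace(1.0 / epsilon, rng) for m in mean)
        for _ in range(n_per_class):
            features = tuple(rng.gauss(m, 1.0) for m in noisy_mean)
            synthetic.append((features, label))
    return synthetic


def predict(centroids, x):
    """Nearest-centroid prediction by squared Euclidean distance."""
    return min(
        centroids,
        key=lambda label: sum((a - b) ** 2 for a, b in zip(x, centroids[label])),
    )


def tstr_accuracy(synthetic_rows, real_test_rows):
    """Train on synthetic rows, test on real rows."""
    centroids = class_means(synthetic_rows)
    hits = sum(predict(centroids, x) == y for x, y in real_test_rows)
    return hits / len(real_test_rows)


def sweep(epsilons, seed=0):
    """TSTR accuracy for each privacy budget in `epsilons`."""
    rng = random.Random(seed)
    real_train = make_real_data(200, rng)
    real_test = make_real_data(200, rng)
    results = {}
    for eps in epsilons:
        synthetic = toy_synthesizer(real_train, eps, 200, rng)
        results[eps] = tstr_accuracy(synthetic, real_test)
    return results


if __name__ == "__main__":
    for eps, acc in sweep([0.1, 1.0, 10.0, 100.0]).items():
        print(f"epsilon={eps:g}  TSTR accuracy={acc:.3f}")
```

Whether accuracy rises as `epsilon` grows is exactly what such a sweep is meant to check. The abstract reports that, for DP-CTGAN and PATE-CTGAN, this recovery does not always occur; the toy stand-in here says nothing about those models.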

