On the Utility Recovery Incapability of Neural Net-based Differential Private Tabular Training Data Synthesizer under Privacy Deregulation
This work addresses a gap in auditing the privacy-utility tradeoff of generative models used for synthetic data, revealing an unexpected side effect that could affect the deployment of such models.
The study investigates the privacy-utility tradeoff of neural net-based differentially private synthesizers for tabular training data, specifically DP-CTGAN and PATE-CTGAN, and finds that privacy deregulation does not always lead to utility recovery, which is a practical limitation.
Devising procedures for auditing the privacy-utility tradeoff of generative models is an important yet unresolved problem in practice. Existing work concentrates on the side effect of privacy constraints, namely the utility degradation observed under the train-on-synthetic, test-on-real paradigm of synthetic data training. We take this understanding of the privacy-utility tradeoff a step further by observing the side effect of privacy deregulation on the utility of synthetic training data. Surprisingly, we discover the Utility Recovery Incapability of DP-CTGAN and PATE-CTGAN under privacy deregulation, which raises concerns about their practical application. The main message is: Privacy Deregulation does NOT always imply Utility Recovery.
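
To make the evaluation paradigm concrete, the following is a minimal sketch of a train-on-synthetic, test-on-real loop swept over several privacy budgets. It rests on assumptions that are not taken from this work: privacy deregulation is read as increasing the budget epsilon, `toy_synthesizer` is a hypothetical noise-adding stand-in (it is neither DP-CTGAN nor PATE-CTGAN and carries no formal privacy guarantee), the data are two synthetic Gaussian classes, and the downstream model is a nearest-centroid classifier. The sketch only illustrates the shape of such an audit and does not reproduce the reported findings.

```python
import random


def make_data(n, rng):
    """Two Gaussian classes in 2-D; a stand-in for a real tabular dataset."""
    rows = []
    for _ in range(n):
        label = rng.randint(0, 1)
        features = [rng.gauss(3.0 * label, 1.0) for _ in range(2)]
        rows.append((features, label))
    return rows


def laplace(rng, scale):
    """Laplace(0, scale) sample: the difference of two i.i.d. Exp(1) draws, scaled."""
    return scale * (rng.expovariate(1.0) - rng.expovariate(1.0))


def toy_synthesizer(rows, epsilon, rng):
    """Hypothetical stand-in for a DP synthesizer (assumption, not DP-CTGAN).

    Perturbs each feature with Laplace noise of scale 1/epsilon, so a larger
    epsilon (a looser budget) means less perturbation.
    """
    scale = 1.0 / epsilon
    return [([x + laplace(rng, scale) for x in feats], label)
            for feats, label in rows]


def fit_centroids(rows):
    """Nearest-centroid 'training': one mean vector per class."""
    centroids = {}
    for c in (0, 1):
        members = [feats for feats, label in rows if label == c]
        centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return centroids


def predict(centroids, feats):
    """Return the class whose centroid is closest in squared Euclidean distance."""
    def sq_dist(c):
        return sum((a - b) ** 2 for a, b in zip(feats, centroids[c]))
    return min(centroids, key=sq_dist)


def tstr_accuracy(train_rows, real_test_rows):
    """Fit on train_rows (synthetic under TSTR), score on held-out real rows."""
    centroids = fit_centroids(train_rows)
    hits = sum(predict(centroids, f) == y for f, y in real_test_rows)
    return hits / len(real_test_rows)


rng = random.Random(0)
real_train = make_data(2000, rng)
real_test = make_data(1000, rng)

# Reference point: train on real, test on real.
baseline = tstr_accuracy(real_train, real_test)

# Sweep from a tight budget to a loose one and record TSTR utility.
results = {}
for eps in (0.1, 1.0, 10.0):
    synthetic = toy_synthesizer(real_train, eps, rng)
    results[eps] = tstr_accuracy(synthetic, real_test)

print(f"real-data baseline accuracy={baseline:.3f}")
for eps, acc in sorted(results.items()):
    print(f"epsilon={eps}: TSTR accuracy={acc:.3f}")
```

In an actual audit, `toy_synthesizer` would be replaced by a trained synthesizer at each budget, and utility recovery would be judged by whether the TSTR score approaches the real-data baseline as the budget is loosened.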