Connect the dots: Dataset Condensation, Differential Privacy, and Adversarial Uncertainty
This work addresses a theoretical gap in dataset condensation for machine learning researchers, but it appears incremental as it builds on prior connections between dataset condensation and differential privacy.
This paper seeks to explain the mechanism behind dataset condensation by linking it to differential privacy and adversarial uncertainty. It proposes adversarial uncertainty as the optimal method for selecting the noise level needed to obtain high-fidelity synthetic data with privacy guarantees.
Our work seeks to understand the mechanism underpinning dataset condensation (DC) by drawing connections with $(\epsilon, \delta)$-differential privacy, where the optimal noise level, $\epsilon$, is chosen by adversarial uncertainty \cite{Grining2017}. This connection lets us address how the dataset condensation procedure works internally. Previous work \cite{dong2022} proved the link between DC and $(\epsilon, \delta)$-differential privacy. However, existing ablation studies of DC leave it unclear how to obtain a lower-bound estimate of $\epsilon$ that suffices for creating high-fidelity synthetic data. We suggest that adversarial uncertainty is the most appropriate method for achieving an optimal noise level, $\epsilon$. Treating noise estimation as part of the internal dynamics of dataset condensation, we adopt a satisfactory scheme that guarantees high-fidelity data while providing privacy.