CR CV LG · Jul 29, 2022

Content-Aware Differential Privacy with Conditional Invertible Neural Networks

Malte Tölle, Ullrich Köthe, Florian André, Benjamin Meder, Sandy Engelhardt

arXiv:2207.14625v1 · 8.75 citations · h-index: 58 · Has Code

Originality: Incremental advance

AI Analysis

This work addresses privacy protection in image datasets, particularly in sensitive domains such as medical imaging, by enabling differentially private image modification that leaves features important for downstream tasks intact. The advance is incremental, as it builds on existing invertible neural network methods.

The paper tackles the challenge of applying differential privacy to images by proposing content-aware differential privacy (CADP), which uses conditional invertible neural networks to add noise in the latent space. The resulting modified images preserve details needed for downstream tasks while protecting privacy. Experiments cover public benchmark datasets and medical datasets.

Differential privacy (DP) has arisen as the gold standard in protecting an individual's privacy in datasets by adding calibrated noise to each data sample. While the application to categorical data is straightforward, its usability in the context of images has been limited. Contrary to categorical data, the meaning of an image is inherent in the spatial correlation of neighboring pixels, making the simple application of noise infeasible. Invertible Neural Networks (INN) have shown excellent generative performance while still providing the ability to quantify the exact likelihood. Their principle is based on transforming a complicated distribution into a simple one, e.g., an image into a spherical Gaussian. We hypothesize that adding noise to the latent space of an INN can enable differentially private image modification. Manipulation of the latent space leads to a modified image while preserving important details. Further, by conditioning the INN on meta-data provided with the dataset, we aim at leaving dimensions important for downstream tasks like classification untouched while altering other parts that potentially contain identifying information. We term our method content-aware differential privacy (CADP). We conduct experiments on publicly available benchmarking datasets as well as dedicated medical ones. In addition, we show the generalizability of our method to categorical data. The source code is publicly available at https://github.com/Cardio-AI/CADP.
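The core idea in the abstract (map a sample to a latent vector with an invertible, conditioned transform, add noise there, and map back) can be illustrated with a minimal sketch. This is not the authors' implementation: the single affine coupling layer, the fixed hand-written `_scale_shift` function standing in for a trained subnetwork, the scalar condition `cond`, and the Gaussian noise scale `sigma` are all assumptions chosen for illustration. The noise here is not calibrated to any privacy budget, so the sketch demonstrates the mechanics only, not a differential privacy guarantee.

```python
import math
import random


def _scale_shift(x1, cond):
    """Hypothetical stand-in for a trained subnetwork.

    Maps the untouched half of the input plus a scalar condition
    to per-element log-scales s and shifts t.
    """
    s = [math.tanh(0.5 * v + 0.1 * cond) for v in x1]
    t = [0.3 * v - 0.2 * cond for v in x1]
    return s, t


def forward(x, cond):
    """One conditional affine coupling layer: data vector x -> latent z.

    Assumes len(x) is even so both halves have the same length.
    """
    half = len(x) // 2
    x1, x2 = x[:half], x[half:]
    s, t = _scale_shift(x1, cond)
    z2 = [b * math.exp(si) + ti for b, si, ti in zip(x2, s, t)]
    return x1 + z2  # first half passes through unchanged


def inverse(z, cond):
    """Exact inverse of forward() for the same condition."""
    half = len(z) // 2
    z1, z2 = z[:half], z[half:]
    s, t = _scale_shift(z1, cond)
    x2 = [(b - ti) * math.exp(-si) for b, si, ti in zip(z2, s, t)]
    return z1 + x2


def perturb(x, cond, sigma, rng):
    """Encode x, add Gaussian noise in latent space, decode with the same condition."""
    z = forward(x, cond)
    z_noisy = [v + rng.gauss(0.0, sigma) for v in z]
    return inverse(z_noisy, cond)


if __name__ == "__main__":
    rng = random.Random(0)
    x = [0.2, -0.5, 1.0, 0.7]   # toy "image" as a flat vector
    cond = 1.0                  # toy condition, e.g. an encoded class label
    print("original :", x)
    print("perturbed:", perturb(x, cond, sigma=0.5, rng=rng))
```

Because the same condition is passed to both `forward` and `inverse`, the decoded sample is generated under the original label while the latent code itself is perturbed. A real conditional INN would stack many such layers with learned subnetworks, and the noise would need to be calibrated to the desired privacy level.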

