CV · Apr 8, 2025

On the Importance of Conditioning for Privacy-Preserving Data Augmentation

arXiv:2504.05849v1 · h-index: 8 · 2025 IEEE/CVF International Conference on Computer Vision Workshops (ICCVW)
Originality: Synthesis-oriented
AI Analysis

This work addresses privacy risks in AI-based data augmentation and reveals a critical vulnerability in existing methods. Its contribution is incremental in that it exposes flaws in prior approaches.

The paper tackles the problem of using conditioned latent diffusion models for privacy-preserving data augmentation, showing that such models are not suitable for anonymization because they can be exploited by contrastive learning and black-box attacks to identify individuals.

Latent diffusion models can be used as a powerful augmentation method to artificially extend datasets for enhanced training. To the human eye, these augmented images look very different from the originals. Previous work has suggested using this data augmentation technique for data anonymization. However, we show that latent diffusion models that are conditioned on features like depth maps or edges to guide the diffusion process are not suitable as a privacy-preserving method. We use a contrastive learning approach to train a model that can correctly identify people out of a pool of candidates. Moreover, we demonstrate that anonymization using conditioned diffusion models is susceptible to black-box attacks. We attribute the success of the described methods to the conditioning of the latent diffusion model in the anonymization process. The diffusion model is instructed to produce similar edges for the anonymized images. Hence, a model can learn to recognize these patterns for identification.
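The identification step described in the abstract can be illustrated with a small sketch. It is not the authors' implementation: the encoder, loss, and hyperparameters used in the paper are not given here. The sketch assumes an InfoNCE-style contrastive objective, and it uses random vectors as stand-ins for the embeddings of original images, with slightly perturbed copies standing in for embeddings of their anonymized versions (a crude proxy for structure that survives conditioning). All names (`info_nce_loss`, `identify`, `temperature`) are hypothetical.

```python
import math
import random


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def info_nce_loss(anon, orig, temperature=0.1):
    """InfoNCE-style loss: anon[i] should score highest against orig[i].

    Assumed objective for illustration; the paper's exact loss may differ.
    """
    total = 0.0
    for i, a in enumerate(anon):
        logits = [cosine(a, o) / temperature for o in orig]
        m = max(logits)  # subtract the max for a stable log-sum-exp
        log_denom = m + math.log(sum(math.exp(l - m) for l in logits))
        total += log_denom - logits[i]
    return total / len(anon)


def identify(anon, candidates):
    """For each anonymized embedding, return the index of the closest candidate."""
    return [
        max(range(len(candidates)), key=lambda j: cosine(a, candidates[j]))
        for a in anon
    ]


# Toy data (assumption): 8 identities, 16-dimensional embeddings.
random.seed(0)
orig = [[random.gauss(0, 1) for _ in range(16)] for _ in range(8)]
# "Anonymized" embeddings keep most of the original structure plus small noise.
anon = [[x + 0.1 * random.gauss(0, 1) for x in row] for row in orig]

predicted = identify(anon, orig)
top1_accuracy = sum(p == i for i, p in enumerate(predicted)) / len(orig)
loss_matched = info_nce_loss(anon, orig)
loss_shuffled = info_nce_loss(anon, orig[::-1])  # deliberately wrong pairing
```

In this toy setting the correct pairing yields a much lower loss than the shuffled pairing, and nearest-neighbour matching recovers the identity of each anonymized sample. Whether a real encoder achieves this on diffusion-anonymized images depends on how much identity-specific structure the conditioning preserves, which is the question the paper investigates.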

