Decomposing Private Image Generation via Coarse-to-Fine Wavelet Modeling
This work gives researchers and practitioners a method for generating higher-quality images from sensitive datasets under differential privacy, addressing the trade-off between privacy and utility in image generation.
The paper addresses the challenge of generating high-quality images from sensitive datasets while maintaining strong privacy guarantees. It proposes a two-stage spectral differential privacy framework that decomposes image generation into privacy-preserving low-frequency modeling and public high-frequency upsampling, which improves image quality and style capture on the MS-COCO and MM-CelebA-HQ datasets compared to existing DP methods.
Generative models trained on sensitive image datasets risk memorizing and reproducing individual training examples, making strong privacy guarantees essential. While differential privacy (DP) provides a principled framework for such guarantees, standard DP finetuning (e.g., with DP-SGD) often severely degrades image quality, particularly in high-frequency textures, because noise is added indiscriminately across all model parameters. In this work, we propose a spectral DP framework based on the hypothesis that the most privacy-sensitive portions of an image are often the low-frequency components in wavelet space (e.g., facial features and object shapes), while the high-frequency components are largely generic and public. Based on this hypothesis, we introduce a two-stage framework for DP image generation with coarse image intermediaries: (1) DP finetune an autoregressive spectral image tokenizer model on the low-resolution wavelet coefficients of the sensitive images, and (2) perform high-resolution upsampling using a publicly pretrained super-resolution model. By restricting the privacy budget to the global structures of the image in the first stage and relying on the post-processing property of DP for detail refinement in the second, we achieve promising trade-offs between privacy and utility. Experiments on the MS-COCO and MM-CelebA-HQ datasets show that our method generates images with improved quality and style capture relative to other leading DP image frameworks.
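To make the coarse-to-fine split concrete, the following is a minimal sketch of how an image separates into a low-frequency band and high-frequency detail bands. It is not the paper's implementation: the abstract does not name the wavelet family, the number of decomposition levels, or the super-resolution model, so the Haar wavelet and the function names haar_dwt2, haar_idwt2, coarse_view, and placeholder_upsample are all assumptions made for illustration. No DP training is performed here; the sketch only shows which coefficients stage (1) would model and where a public upsampler would take over in stage (2).

```python
import numpy as np

def haar_dwt2(img):
    """One level of an orthonormal 2-D Haar transform (assumed wavelet).

    Expects an (H, W) array with even H and W. Returns the low-frequency
    band and a tuple of three high-frequency detail bands.
    """
    a, b = img[0::2, 0::2], img[0::2, 1::2]
    c, d = img[1::2, 0::2], img[1::2, 1::2]
    low = (a + b + c + d) / 2.0        # coarse structure (scaled block sum)
    det_cols = (a - b + c - d) / 2.0   # differences across columns
    det_rows = (a + b - c - d) / 2.0   # differences across rows
    det_diag = (a - b - c + d) / 2.0   # diagonal differences
    return low, (det_cols, det_rows, det_diag)

def haar_idwt2(low, details):
    """Inverse of haar_dwt2."""
    det_cols, det_rows, det_diag = details
    h, w = low.shape
    out = np.empty((2 * h, 2 * w), dtype=float)
    out[0::2, 0::2] = (low + det_cols + det_rows + det_diag) / 2.0
    out[0::2, 1::2] = (low - det_cols + det_rows - det_diag) / 2.0
    out[1::2, 0::2] = (low + det_cols - det_rows - det_diag) / 2.0
    out[1::2, 1::2] = (low - det_cols - det_rows + det_diag) / 2.0
    return out

def coarse_view(img, levels=2):
    """Keep only the low-frequency band after `levels` decompositions.

    In the two-stage framing, this is the kind of coarse intermediary
    that the privately trained model would be responsible for.
    """
    low = np.asarray(img, dtype=float)
    for _ in range(levels):
        low, _ = haar_dwt2(low)
    return low

def placeholder_upsample(low, levels=2):
    """Stand-in for a public super-resolution model (hypothetical).

    Inverts the transform with all detail bands set to zero, so no
    sensitive high-frequency information is used. A real system would
    synthesize plausible detail here instead.
    """
    out = low
    for _ in range(levels):
        zeros = np.zeros_like(out)
        out = haar_idwt2(out, (zeros, zeros, zeros))
    return out

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    image = rng.random((8, 8))              # toy grayscale "image"
    low, details = haar_dwt2(image)
    print(np.allclose(haar_idwt2(low, details), image))
    coarse = coarse_view(image, levels=2)
    print(coarse.shape, placeholder_upsample(coarse, levels=2).shape)
```

Because the upsampling step reads only the coarse output and never touches the sensitive images, any function placed there is post-processing: if the first stage satisfies (ε, δ)-DP, the final image satisfies the same guarantee at no additional privacy cost.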