CR · AI · Apr 2, 2025

From Easy to Hard: Building a Shortcut for Differentially Private Image Synthesis

Kecen Li, Chen Gong, Xiaochen Li, Yuzhong Zhao, Xinwen Hou, Tianhao Wang

arXiv:2504.01395v2 · 8 citations · h-index: 5 · S&P

Originality: Incremental advance

AI Analysis

This work addresses the privacy leakage concerns of organizations that share synthetic images by improving the performance of DP image synthesis. The advance is incremental, since it builds on existing DP-SGD methods.

The paper tackles the poor performance of differentially private image synthesis with a two-stage framework inspired by curriculum learning. Diffusion models first learn simple features from aggregated 'central images', which incur minimal privacy cost, before moving on to the harder stage. Averaged over four datasets, the resulting synthetic images show 33.1% better fidelity and 2.1% better utility than the state-of-the-art method.

Differentially private (DP) image synthesis aims to generate synthetic images from a sensitive dataset, alleviating the privacy leakage concerns of organizations that share and use synthetic images. Although previous methods have made significant progress, especially in training diffusion models on sensitive images with DP Stochastic Gradient Descent (DP-SGD), their performance remains unsatisfactory. In this work, inspired by curriculum learning, we propose a two-stage DP image synthesis framework in which diffusion models learn to generate DP synthetic images from easy to hard. Unlike existing methods that directly use DP-SGD to train diffusion models, we add an easy stage at the beginning, in which diffusion models learn simple features of the sensitive images. To facilitate this easy stage, we propose to use 'central images', which are simply aggregations of random samples of the sensitive dataset. Intuitively, although these central images do not show details, they capture useful characteristics shared by all images and incur only minimal privacy cost, thus helping early-phase model training. Experiments show that, averaged over four image datasets, the fidelity and utility metrics of our synthetic images are 33.1% and 2.1% better, respectively, than those of the state-of-the-art method.
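To make the idea of central images concrete, here is a minimal sketch of one way such aggregates could be built. It assumes the aggregation is a per-pixel mean over a random subset and that Gaussian noise is added to the mean. The abstract specifies neither choice, and the function name `central_images`, its parameters, and the noise level are all hypothetical. In particular, `noise_std` is a placeholder and is not calibrated to any privacy budget.

```python
import numpy as np


def central_images(images, num_central, subset_size, noise_std, rng):
    """Build `num_central` noisy mean images (illustrative only).

    images      : array of shape (n, H, W, C), pixel values in [0, 1]
    subset_size : number of images averaged into each central image
    noise_std   : std of the added Gaussian noise (assumed mechanism,
                  NOT calibrated to a specific (epsilon, delta))
    rng         : a numpy.random.Generator
    """
    n = images.shape[0]
    out = np.empty((num_central,) + images.shape[1:], dtype=np.float64)
    for i in range(num_central):
        # Draw a random subset of the sensitive dataset without replacement.
        idx = rng.choice(n, size=subset_size, replace=False)
        # Averaging washes out per-image detail and keeps coarse structure.
        mean = images[idx].mean(axis=0)
        # Perturb the aggregate; each image contributes only 1/subset_size
        # to the mean, so a single record has limited influence on it.
        noisy = mean + rng.normal(0.0, noise_std, size=mean.shape)
        # Keep the result in the valid pixel range.
        out[i] = np.clip(noisy, 0.0, 1.0)
    return out


rng = np.random.default_rng(0)
sensitive = rng.random((100, 8, 8, 3))  # stand-in for a sensitive dataset
centers = central_images(sensitive, num_central=5, subset_size=20,
                         noise_std=0.05, rng=rng)
print(centers.shape)  # (5, 8, 8, 3)
```

In this sketch a larger `subset_size` gives smoother central images that depend less on any one record, which matches the intuition that such images show shared characteristics rather than details.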
