Differentially Private Adaptation of Diffusion Models via Noisy Aggregated Embeddings
This work addresses privacy risks in adapting AI models to sensitive domains such as personal artwork, though the contribution is incremental because it builds on existing Textual Inversion methods.
The paper tackles the problem of personalizing diffusion models on small, sensitive datasets while preserving privacy. It proposes DPAgg-TI, which outperforms DP-SGD fine-tuning under the same privacy budget and achieves results closely matching the non-private baseline on style adaptation tasks.
Personalizing large-scale diffusion models poses serious privacy risks, especially when adapting to small, sensitive datasets. A common approach is to fine-tune the model using differentially private stochastic gradient descent (DP-SGD), but this suffers from severe utility degradation due to the high noise needed for privacy, particularly in the small-data regime. We propose an alternative that leverages Textual Inversion (TI), which learns an embedding vector for an image or set of images, to enable adaptation under differential privacy (DP) constraints. Our approach, Differentially Private Aggregation via Textual Inversion (DPAgg-TI), adds calibrated noise to the aggregation of per-image embeddings to ensure formal DP guarantees while preserving high output fidelity. We show that DPAgg-TI outperforms DP-SGD fine-tuning in both utility and robustness under the same privacy budget, achieving results closely matching the non-private baseline on style adaptation tasks using private artwork from a single artist and Paris 2024 Olympic pictograms. In contrast, DP-SGD fails to generate meaningful outputs in this setting.
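The idea of adding calibrated noise to an aggregate of per-image embeddings can be illustrated with a minimal sketch. This is not the paper's implementation: the abstract does not specify the aggregation rule or the noise calibration, so the sketch assumes an L2-clipped mean, replace-one adjacency, and the classical Gaussian mechanism (valid for 0 < epsilon < 1). The function names, the clipping bound `max_norm`, and the 768-dimensional random inputs are all hypothetical; in practice each row would be an embedding learned by Textual Inversion on one private image.

```python
import numpy as np


def clip_to_norm(vectors, max_norm):
    """Scale each row so its L2 norm is at most max_norm."""
    norms = np.linalg.norm(vectors, axis=1, keepdims=True)
    scale = np.minimum(1.0, max_norm / np.maximum(norms, 1e-12))
    return vectors * scale


def gaussian_sigma(sensitivity, epsilon, delta):
    """Noise scale of the classical Gaussian mechanism (assumes 0 < epsilon < 1)."""
    return sensitivity * np.sqrt(2.0 * np.log(1.25 / delta)) / epsilon


def noisy_mean_embedding(embeddings, max_norm, epsilon, delta, rng):
    """Return a noisy mean of per-image embeddings and the noise scale used.

    Assumption: replacing one image changes the clipped mean by at most
    2 * max_norm / n in L2 norm, which is the sensitivity used here.
    """
    n, d = embeddings.shape
    clipped = clip_to_norm(embeddings, max_norm)
    sensitivity = 2.0 * max_norm / n
    sigma = gaussian_sigma(sensitivity, epsilon, delta)
    noise = rng.normal(0.0, sigma, size=d)
    return clipped.mean(axis=0) + noise, sigma


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    # Placeholder data standing in for 50 per-image TI embeddings.
    per_image = rng.normal(size=(50, 768))
    private_embedding, sigma = noisy_mean_embedding(
        per_image, max_norm=1.0, epsilon=0.5, delta=1e-5, rng=rng
    )
    print(private_embedding.shape, sigma)
```

Under these assumptions the noise scale shrinks as 1/n, and the noise is added once to a single low-dimensional vector rather than to every gradient step, which gives some intuition for why aggregation can tolerate small datasets better than DP-SGD.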