CV, IV
Jul 3, 2023

Investigating Data Memorization in 3D Latent Diffusion Models for Medical Image Synthesis

Salman Ul Hassan Dar, Arman Ghanaat, Jannik Kahmann, Isabelle Ayx, Theano Papavassiliu, Stefan O. Schoenberg, Sandy Engelhardt

arXiv:2307.01148v2
16.436 citations
h-index: 34

Originality: Synthesis-oriented

AI Analysis

This work addresses privacy risks in medical data sharing, but its contribution is incremental: it extends known memorization issues in generative models to a new domain.

The study investigated data memorization in 3D latent diffusion models for medical image synthesis, finding that these models memorize training data, highlighting a need for mitigation strategies.

Generative latent diffusion models have been established as state-of-the-art in data generation. One promising application is generation of realistic synthetic medical imaging data for open data sharing without compromising patient privacy. Despite the promise, the capacity of such models to memorize sensitive patient training data and synthesize samples showing high resemblance to training data samples is relatively unexplored. Here, we assess the memorization capacity of 3D latent diffusion models on photon-counting coronary computed tomography angiography and knee magnetic resonance imaging datasets. To detect potential memorization of training samples, we utilize self-supervised models based on contrastive learning. Our results suggest that such latent diffusion models indeed memorize training data, and there is a dire need for devising strategies to mitigate memorization.
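The detection idea in the abstract (compare synthetic and training samples in a learned embedding space) can be illustrated with a minimal sketch. It is not the authors' procedure: it assumes each volume has already been mapped to an embedding vector by some contrastively trained encoder, and the function names, the use of cosine similarity, the toy vectors, and the 0.95 threshold are all hypothetical choices made for illustration.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length embedding vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # treat a zero vector as dissimilar to everything
    return dot / (norm_a * norm_b)


def nearest_training_match(synthetic_emb, training_embs):
    """Return (index, similarity) of the most similar training embedding."""
    best_index, best_sim = -1, -1.0
    for i, train_emb in enumerate(training_embs):
        sim = cosine_similarity(synthetic_emb, train_emb)
        if sim > best_sim:
            best_index, best_sim = i, sim
    return best_index, best_sim


def flag_memorized(synthetic_embs, training_embs, threshold=0.95):
    """List (synthetic_idx, training_idx, similarity) for likely copies.

    The threshold is an assumed value; in practice it would have to be
    calibrated, for example against similarities of held-out real samples.
    """
    flagged = []
    for s_idx, s_emb in enumerate(synthetic_embs):
        t_idx, sim = nearest_training_match(s_emb, training_embs)
        if sim >= threshold:
            flagged.append((s_idx, t_idx, sim))
    return flagged


# Toy embeddings standing in for encoder outputs (hypothetical data).
training = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]]
synthetic = [[0.99, 0.05, 0.0], [0.5, 0.5, 0.7]]
flags = flag_memorized(synthetic, training)
```

In this toy case only the first synthetic vector is flagged, because it lies almost on top of the first training vector. The second synthetic vector is not close to any training vector and passes.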
