Guided Synthesis of Labeled Brain MRI Data Using Latent Diffusion Models for Segmentation of Enlarged Ventricles
This work addresses data scarcity and privacy issues in medical imaging for clinicians and researchers, but it is incremental as it builds on existing latent diffusion models and focuses on a specific domain.
This study tackled the problem of improving ventricular segmentation in brain MRI images by using synthetic data generated with latent diffusion models to address data scarcity and privacy concerns. The results showed that models trained on synthetic and augmented data achieved lower mean absolute errors (e.g., 7.52 ± 4.81 mL for synthetic) and higher Dice scores (e.g., 0.892 ± 0.05 for augmented) compared to models trained on real data and outperformed the state-of-the-art SynthSeg model.
Deep learning models in medical contexts face challenges like data scarcity, inhomogeneity, and privacy concerns. This study focuses on improving ventricular segmentation in brain MRI images using synthetic data. We employed two latent diffusion models (LDMs): a mask generator trained using 10,000 masks, and a corresponding SPADE image generator optimized using 6,881 scans to create an MRI conditioned on a 3D brain mask. Conditioning the mask generator on ventricular volume in combination with classifier-free guidance enabled the control of the ventricular volume distribution of the generated synthetic images. Next, the performance of the synthetic data was tested using three nnU-Net segmentation models trained on a real, augmented and entirely synthetic data, respectively. The resulting models were tested on a completely independent hold-out dataset of patients with enlarged ventricles, with manual delineation of the ventricles used as ground truth. The model trained on real data showed a mean absolute error (MAE) of 9.09 \pm 12.18 mL in predicted ventricular volume, while the models trained on synthetic and augmented data showed MAEs of 7.52 \pm 4.81 mL and 6.23 \pm 4.33 mL, respectively. Both the synthetic and augmented model also outperformed the state-of-the-art model SynthSeg, which due to limited performance in cases of large ventricular volumes, showed an MAE of 7.73 \pm 12.12 mL with a factor of 3 higher standard deviation. The model trained on augmented data showed the highest Dice score of 0.892 \pm 0.05, slightly outperforming SynthSeg and on par with the model trained on real data. The synthetic model performed similar to SynthSeg. In summary, we provide evidence that guided synthesis of labeled brain MRI data using LDMs improves the segmentation of enlarged ventricles and outperforms existing state-of-the-art segmentation models.