MediSyn: A Generalist Text-Guided Latent Diffusion Model For Diverse Medical Image Synthesis
This work addresses data scarcity in medical imaging for researchers and developers, though the contribution is incremental: it extends existing generative methods to a broader domain.
The authors tackle limited medical data availability by introducing MediSyn, a generalist text-guided latent diffusion model that generates synthetic images across 6 medical specialties and 10 image types. They show that it matches or surpasses specialist models and that its synthetic images improve classifier performance in data-limited settings.
Deep learning algorithms require extensive data to achieve robust performance. However, data availability is often restricted in the medical domain due to patient privacy concerns. Synthetic data presents a possible solution to these challenges. Recently, image generative models have found increasing use in medical applications, but they are often designed for a single medical specialty and imaging modality, which limits their broader utility. To address this, we introduce MediSyn: a text-guided latent diffusion model capable of generating synthetic images from 6 medical specialties and 10 image types. Through extensive experimentation, we first demonstrate that MediSyn quantitatively matches or surpasses the performance of specialist models. Second, we show that our synthetic images are realistic and exhibit strong alignment with their corresponding text prompts, as validated by a team of expert physicians. Third, we provide empirical evidence that our synthetic images are visually distinct from their corresponding real patient images. Finally, we demonstrate that in data-limited settings, classifiers trained solely on synthetic data, or on real data supplemented with synthetic data, can outperform those trained solely on real data. Our findings highlight the immense potential of generalist image generative models to accelerate algorithmic research and development in medicine.
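The data-limited comparison described above (real only, synthetic only, and real supplemented with synthetic) can be illustrated with a minimal sketch. It is not the authors' pipeline: the function name `build_training_sets`, the `real_fraction` parameter, the random subsampling used to simulate scarcity, and the placeholder file names are all assumptions made for illustration.

```python
import random


def build_training_sets(real, synthetic, real_fraction, seed=0):
    """Assemble three hypothetical training regimes for a data-limited study.

    `real` and `synthetic` are lists of (image_path, label) pairs.
    `real_fraction` is the share of real samples assumed to be available.
    """
    if not 0.0 < real_fraction <= 1.0:
        raise ValueError("real_fraction must be in (0, 1]")

    rng = random.Random(seed)
    # Simulate data scarcity by keeping only a random subset of the real samples.
    n_real = max(1, int(len(real) * real_fraction))
    limited_real = rng.sample(real, n_real)

    return {
        "real_only": limited_real,
        "synthetic_only": list(synthetic),
        "real_plus_synthetic": limited_real + list(synthetic),
    }


# Placeholder data: file names and labels are illustrative, not from the paper.
real = [(f"real_{i}.png", i % 2) for i in range(100)]
synthetic = [(f"syn_{i}.png", i % 2) for i in range(300)]

sets = build_training_sets(real, synthetic, real_fraction=0.1)
for name, samples in sets.items():
    print(name, len(samples))
```

Each of the three lists would then be used to train a separate classifier, with all classifiers evaluated on the same held-out set of real images so that the regimes are directly comparable.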