Evaluating Utility of Memory Efficient Medical Image Generation: A Study on Lung Nodule Segmentation
This work addresses data limitations faced by medical AI developers, though the contribution is incremental: it adapts existing diffusion methods to a specific domain.
The authors tackle the scarcity of medical imaging data by developing a memory-efficient patch-wise diffusion model that generates synthetic CT scans with lung nodules. Segmentation models trained only on the synthetic data achieved Dice scores comparable to real-world benchmarks, and augmenting real data with synthetic images significantly improved segmentation performance.
The scarcity of publicly available medical imaging data limits the development of effective AI models. This work proposes a memory-efficient patch-wise denoising diffusion probabilistic model (DDPM) for generating synthetic medical images, focusing on CT scans with lung nodules. Our approach generates high-utility synthetic images with nodule segmentation while efficiently managing memory constraints, enabling the creation of training datasets. We evaluate the method in two scenarios: training a segmentation model exclusively on synthetic data, and augmenting real-world training data with synthetic images. In the first case, models trained solely on synthetic data achieve Dice scores comparable to those trained on real-world data benchmarks. In the second case, augmenting real-world data with synthetic images significantly improves segmentation performance. The generated images demonstrate their potential to enhance medical image datasets in scenarios with limited real-world data.
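Both evaluation scenarios are reported in terms of the Dice score, which the abstract does not define. As a minimal sketch, assuming the standard definition for binary masks, Dice = 2|P ∩ T| / (|P| + |T|), the metric could be computed as follows. The function name `dice_score`, the flattened-mask input, and the convention of returning 1.0 when both masks are empty are illustrative assumptions, not details taken from the paper.

```python
def dice_score(pred, target):
    """Dice coefficient between two flattened binary masks.

    `pred` and `target` are equal-length sequences of 0/1 (or bool) voxel
    labels. This is a hypothetical helper for illustration only.
    """
    pred = [bool(p) for p in pred]
    target = [bool(t) for t in target]
    if len(pred) != len(target):
        raise ValueError("masks must have the same number of voxels")

    # Voxels labelled as nodule in both the prediction and the ground truth.
    intersection = sum(p and t for p, t in zip(pred, target))
    total = sum(pred) + sum(target)

    # Assumed convention: two empty masks count as a perfect match.
    if total == 0:
        return 1.0
    return 2.0 * intersection / total


# Toy example: one of two predicted voxels overlaps the single true voxel.
score = dice_score([1, 1, 0, 0], [1, 0, 0, 0])
print(round(score, 4))
```

A score of 1.0 means the predicted and reference nodule masks coincide exactly, and 0.0 means they do not overlap at all.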