Semantic Map Guided Synthesis of Wireless Capsule Endoscopy Images using Diffusion Models
The work addresses data scarcity for medical imaging researchers, but its contribution is incremental: it applies an existing generative method to a specific domain.
The paper tackles the shortage of training data for wireless capsule endoscopy (WCE) image analysis by proposing a diffusion model that generates diverse WCE images conditioned on semantic maps. The generated images are assessed as realistic through visual inspection and visual Turing tests.
Wireless capsule endoscopy (WCE) is a non-invasive method for visualizing the gastrointestinal (GI) tract and is crucial for diagnosing GI tract diseases. However, interpreting WCE results can be time-consuming and tiring. Existing studies have employed deep neural networks (DNNs) for automatic GI tract lesion detection, but acquiring sufficient training examples remains a challenge, particularly because of privacy concerns. Public WCE databases are limited in both diversity and quantity. To address this, we propose a novel approach that leverages generative models, specifically the diffusion model (DM), to generate diverse WCE images. Our model incorporates semantic maps produced by a visualization scale (VS) engine, which enhances the controllability and diversity of the generated images. We evaluate our approach using visual inspection and visual Turing tests, demonstrating its effectiveness in generating realistic and diverse WCE images.
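The abstract does not say how the semantic map enters the diffusion model. As a minimal sketch of one common conditioning scheme, assumed here for illustration only and not taken from the paper, the label map can be one-hot encoded and concatenated channel-wise with the noised image before it is passed to a denoising network. The function names, the four-class label map, the 64x64 resolution, and the linear noise schedule are all hypothetical choices.

```python
import numpy as np

def one_hot_map(labels, num_classes):
    """Convert an (H, W) integer label map into a (num_classes, H, W) one-hot array."""
    classes = np.arange(num_classes)[:, None, None]
    return (classes == labels[None, :, :]).astype(np.float32)

def forward_noise(x0, t, alpha_bar, rng):
    """DDPM-style forward process: x_t = sqrt(a_bar_t) * x0 + sqrt(1 - a_bar_t) * eps."""
    eps = rng.standard_normal(x0.shape).astype(np.float32)
    xt = np.sqrt(alpha_bar[t]) * x0 + np.sqrt(1.0 - alpha_bar[t]) * eps
    return xt.astype(np.float32), eps

def conditioned_input(xt, semantic):
    """Stack the noised image and the semantic map along the channel axis."""
    return np.concatenate([xt, semantic], axis=0)

rng = np.random.default_rng(0)

# Hypothetical linear beta schedule with T steps (an assumption, not from the paper).
T = 1000
betas = np.linspace(1e-4, 0.02, T)
alpha_bar = np.cumprod(1.0 - betas)

# Stand-ins for a real WCE frame and its semantic map.
image = rng.uniform(-1.0, 1.0, size=(3, 64, 64)).astype(np.float32)
labels = rng.integers(0, 4, size=(64, 64))
semantic = one_hot_map(labels, num_classes=4)

xt, eps = forward_noise(image, t=500, alpha_bar=alpha_bar, rng=rng)
net_in = conditioned_input(xt, semantic)  # what a denoiser would receive
```

In this sketch a denoiser would be trained to predict `eps` from `net_in` and the timestep, so editing the label map at sampling time changes the layout of the generated image. Other conditioning mechanisms exist, and the paper may use a different one.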