IV | CV | Feb 10, 2025

Conditional diffusion model with spatial attention and latent embedding for medical image segmentation

Behzad Hejrati, Soumyanil Banerjee, Carri Glide-Hurst, Ming Dong

arXiv:2502.06997v2 | 15.26 citations | h-index: 32 | Has Code | MICCAI

Originality: Incremental advance

AI Analysis

This work addresses medical image segmentation, a critical task for healthcare diagnostics, with incremental improvements in accuracy and speed.

The authors address medical image segmentation by proposing a conditional diffusion model with spatial attention and latent embedding (cDAL), which achieves higher Dice scores and mIoU than state-of-the-art methods on three public datasets.
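For readers unfamiliar with the two reported metrics, the following is a minimal sketch of how Dice and mean IoU are commonly computed for binary segmentation masks. The function names (`dice_score`, `iou_score`, `mean_iou`) and the smoothing term `eps` are illustrative assumptions, not taken from the cDAL code, and the paper may average mIoU over classes rather than over images as done here.

```python
import numpy as np

def dice_score(pred, target, eps=1e-7):
    """Dice = 2|A ∩ B| / (|A| + |B|) for two binary masks."""
    pred = np.asarray(pred, dtype=bool)
    target = np.asarray(target, dtype=bool)
    intersection = np.logical_and(pred, target).sum()
    # eps (an assumed smoothing term) avoids 0/0 when both masks are empty
    return (2.0 * intersection + eps) / (pred.sum() + target.sum() + eps)

def iou_score(pred, target, eps=1e-7):
    """IoU = |A ∩ B| / |A ∪ B| for two binary masks."""
    pred = np.asarray(pred, dtype=bool)
    target = np.asarray(target, dtype=bool)
    intersection = np.logical_and(pred, target).sum()
    union = np.logical_or(pred, target).sum()
    return (intersection + eps) / (union + eps)

def mean_iou(preds, targets):
    """Mean IoU over paired masks (here: averaged over images)."""
    return float(np.mean([iou_score(p, t) for p, t in zip(preds, targets)]))
```

Both metrics range from 0 (no overlap) to 1 (perfect overlap). For the same pair of masks, Dice is never lower than IoU.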

Diffusion models have been used extensively for high-quality image and video generation tasks. In this paper, we propose a novel conditional diffusion model with spatial attention and latent embedding (cDAL) for medical image segmentation. In cDAL, a convolutional neural network (CNN) based discriminator is used at every time-step of the diffusion process to distinguish between the generated labels and the real ones. A spatial attention map is computed from the features learned by the discriminator to help cDAL generate more accurate segmentations of discriminative regions in an input image. Additionally, we incorporate a random latent embedding into each layer of our model to significantly reduce the number of training and sampling time-steps, making it much faster than other diffusion models for image segmentation. We applied cDAL to three publicly available medical image segmentation datasets (MoNuSeg, Chest X-ray and Hippocampus) and observed significant qualitative and quantitative improvements, with higher Dice scores and mIoU than the state-of-the-art algorithms. The source code is publicly available at https://github.com/Hejrati/cDAL/.
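The abstract does not specify how the spatial attention map is derived from the discriminator features. As a rough illustration only, the sketch below uses a generic activation-based attention: the channel-wise mean of absolute activations, min-max normalized to [0, 1]. The functions `spatial_attention_map` and `apply_attention`, and the residual-style reweighting, are hypothetical and should not be read as the authors' formulation; the linked repository is the reference for that.

```python
import numpy as np

def spatial_attention_map(features):
    """Collapse (C, H, W) feature maps into an (H, W) map in [0, 1].

    Assumption: attention is the channel-wise mean of absolute
    activations, min-max normalized. This is a generic choice,
    not the formulation used in cDAL.
    """
    features = np.asarray(features, dtype=np.float64)
    energy = np.mean(np.abs(features), axis=0)
    lo, hi = energy.min(), energy.max()
    if hi - lo < 1e-12:
        # Flat response: treat every location as equally important
        return np.ones_like(energy)
    return (energy - lo) / (hi - lo)

def apply_attention(x, attention):
    """Reweight (C, H, W) features so high-attention regions are amplified.

    The residual form x * (1 + attention) is an assumed design choice
    that leaves low-attention regions unchanged rather than zeroing them.
    """
    x = np.asarray(x, dtype=np.float64)
    return x * (1.0 + attention[None, :, :])
```

In a setup like the one the abstract describes, `features` would come from an intermediate layer of the discriminator and the reweighted output would feed back into the segmentation generator.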
