IV | CV | Feb 29, 2024

WDM: 3D Wavelet Diffusion Models for High-Resolution Medical Image Synthesis

arXiv:2402.19043v2 | 66 citations | h-index: 11 | DGM4MICCAI@MICCAI
AI Analysis

This work addresses high-resolution 3D medical image synthesis, offering researchers and clinicians a scalable approach that outperforms existing methods, though its contribution is incremental in that it applies wavelets to diffusion models.

The paper tackles the challenge of generating high-resolution 3D medical images by proposing WDM, a wavelet-based diffusion model that scales to resolutions of 128x128x128 and 256x256x256 and achieves state-of-the-art FID and MS-SSIM scores on the BraTS and LIDC-IDRI datasets.

Due to the three-dimensional nature of CT or MR scans, generative modeling of medical images is a particularly challenging task. Existing approaches mostly apply patch-wise, slice-wise, or cascaded generation techniques to fit the high-dimensional data into limited GPU memory. However, these approaches may introduce artifacts and can restrict the model's applicability for certain downstream tasks. This work presents WDM, a wavelet-based medical image synthesis framework that applies a diffusion model to wavelet-decomposed images. The presented approach is a simple yet effective way of scaling 3D diffusion models to high resolutions and can be trained on a single 40 GB GPU. Experimental results on BraTS and LIDC-IDRI unconditional image generation at a resolution of $128 \times 128 \times 128$ demonstrate state-of-the-art image fidelity (FID) and sample diversity (MS-SSIM) scores compared to recent GANs, Diffusion Models, and Latent Diffusion Models. Our proposed method is the only one capable of generating high-quality images at a resolution of $256 \times 256 \times 256$, outperforming all compared methods.
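To make the idea of "wavelet-decomposed images" concrete, the following is a minimal sketch of a single-level 3D wavelet transform and its inverse. It assumes a Haar wavelet, which the abstract does not specify, and the function names `haar_dwt3d` and `haar_idwt3d` are hypothetical; this is not the authors' implementation. It only illustrates why such a decomposition helps: a volume of shape (D, H, W) becomes 8 subbands of half the spatial size, so a model operating on the coefficients sees, for example, 8 channels of $64 \times 64 \times 64$ instead of one $128 \times 128 \times 128$ volume, and the original volume can be recovered exactly.

```python
import numpy as np


def haar_dwt3d(x):
    """Single-level 3D Haar transform: (D, H, W) -> (8, D/2, H/2, W/2).

    Assumes every dimension is even. Illustrative only.
    """
    bands = [x]
    for axis in range(3):
        split = []
        for b in bands:
            even = np.take(b, np.arange(0, b.shape[axis], 2), axis=axis)
            odd = np.take(b, np.arange(1, b.shape[axis], 2), axis=axis)
            split.append((even + odd) / np.sqrt(2))  # low-pass along this axis
            split.append((even - odd) / np.sqrt(2))  # high-pass along this axis
        bands = split
    return np.stack(bands)


def haar_idwt3d(coeffs):
    """Inverse of haar_dwt3d: (8, d, h, w) -> (2d, 2h, 2w)."""
    bands = list(coeffs)
    for axis in (2, 1, 0):  # undo the axes in reverse order
        merged = []
        for lo, hi in zip(bands[0::2], bands[1::2]):
            shape = list(lo.shape)
            shape[axis] *= 2
            out = np.empty(shape, dtype=lo.dtype)
            even_idx = [slice(None)] * 3
            odd_idx = [slice(None)] * 3
            even_idx[axis] = slice(0, None, 2)
            odd_idx[axis] = slice(1, None, 2)
            out[tuple(even_idx)] = (lo + hi) / np.sqrt(2)
            out[tuple(odd_idx)] = (lo - hi) / np.sqrt(2)
            merged.append(out)
        bands = merged
    return bands[0]


# A small random volume stands in for a CT or MR scan.
volume = np.random.default_rng(0).standard_normal((32, 32, 32))
coeffs = haar_dwt3d(volume)
print(coeffs.shape)  # (8, 16, 16, 16)
print(np.allclose(haar_idwt3d(coeffs), volume))  # True
```

In a pipeline of this kind, a generative model would be trained on the stacked coefficients and its samples passed through the inverse transform to obtain full-resolution volumes; how WDM arranges this in detail is described in the paper, not in this sketch.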

Code Implementations: 1 repo