WFM: 3D Wavelet Flow Matching for Ultrafast Multi-Modal MRI Synthesis
This work addresses the computational bottleneck that limits clinical deployment of diffusion-based multi-modal MRI synthesis, offering a practical speed-quality trade-off.
WFM is a wavelet flow matching method for multi-modal MRI synthesis that learns a direct flow from an informed prior (the mean of the conditioning modalities in wavelet space) to the target, enabling accurate synthesis in 1-2 steps. On BraTS 2024, it achieves 26.8 dB PSNR and 0.94 SSIM while running 250-1000x faster than diffusion models (0.16-0.64s vs. 160s per volume).
Diffusion models have achieved remarkable quality in multi-modal MRI synthesis, but their computational cost (hundreds of sampling steps and separate models per modality) limits clinical deployment. We observe that this inefficiency stems from an unnecessary starting point: diffusion begins from pure noise, discarding the structural information already present in available MRI sequences. We propose WFM (Wavelet Flow Matching), which instead learns a direct flow from an informed prior, the mean of conditioning modalities in wavelet space, to the target distribution. Because the source and target share underlying anatomy and differ primarily in contrast, this formulation enables accurate synthesis in just 1-2 integration steps. A single 82M-parameter model with class conditioning synthesizes all four BraTS modalities (T1, T1c, T2, FLAIR), replacing four separate diffusion models totaling 326M parameters. On BraTS 2024, WFM achieves 26.8 dB PSNR and 0.94 SSIM, within 1-2 dB of diffusion baselines, while running 250-1000x faster (0.16-0.64s vs. 160s per volume). This speed-quality trade-off makes real-time MRI synthesis practical for clinical workflows. Code is available at https://github.com/yalcintur/WFM.
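The sampling procedure described above can be illustrated with a minimal NumPy sketch. It is not the released implementation, and several details are assumptions made for illustration only: a one-level 3D Haar wavelet (the abstract does not name the wavelet), a fixed-step Euler integrator, and a hypothetical `velocity_fn(x, t, target_class)` standing in for the trained 82M-parameter network. The demo replaces the network with an oracle velocity for a straight-line path, which is a common flow matching choice but is not confirmed by the text.

```python
import numpy as np

def haar_split(x, axis):
    """Split one axis into Haar low-pass and high-pass halves (even length assumed)."""
    even = np.take(x, np.arange(0, x.shape[axis], 2), axis=axis)
    odd = np.take(x, np.arange(1, x.shape[axis], 2), axis=axis)
    return (even + odd) / np.sqrt(2), (even - odd) / np.sqrt(2)

def haar_merge(low, high, axis):
    """Inverse of haar_split: interleave the reconstructed even and odd samples."""
    shape = list(low.shape)
    shape[axis] *= 2
    out = np.empty(shape, dtype=low.dtype)
    even_idx = [slice(None)] * low.ndim
    odd_idx = [slice(None)] * low.ndim
    even_idx[axis] = slice(0, None, 2)
    odd_idx[axis] = slice(1, None, 2)
    out[tuple(even_idx)] = (low + high) / np.sqrt(2)
    out[tuple(odd_idx)] = (low - high) / np.sqrt(2)
    return out

def haar_dwt3d(vol):
    """One-level 3D Haar transform: (D, H, W) -> (8, D/2, H/2, W/2)."""
    bands = [vol]
    for axis in range(3):
        bands = [part for b in bands for part in haar_split(b, axis)]
    return np.stack(bands)

def haar_idwt3d(coeffs):
    """Inverse of haar_dwt3d: (8, D/2, H/2, W/2) -> (D, H, W)."""
    bands = list(coeffs)
    for axis in (2, 1, 0):
        bands = [haar_merge(bands[i], bands[i + 1], axis)
                 for i in range(0, len(bands), 2)]
    return bands[0]

def informed_prior(cond_vols):
    """Mean of the conditioning modalities' wavelet coefficients."""
    return np.mean([haar_dwt3d(v) for v in cond_vols], axis=0)

def sample(velocity_fn, cond_vols, target_class, n_steps=2):
    """Euler-integrate from the informed prior (t=0) to the target (t=1)."""
    x = informed_prior(cond_vols)
    dt = 1.0 / n_steps
    for k in range(n_steps):
        x = x + dt * velocity_fn(x, k * dt, target_class)
    return haar_idwt3d(x)

# Toy demo with random volumes standing in for three available MRI sequences.
rng = np.random.default_rng(0)
cond = [rng.standard_normal((8, 8, 8)) for _ in range(3)]
target = rng.standard_normal((8, 8, 8))

# Oracle velocity for a straight-line path x_t = (1 - t) * x0 + t * x1 (assumption).
x0, x1 = informed_prior(cond), haar_dwt3d(target)
oracle = lambda x, t, c: x1 - x0

recon = sample(oracle, cond, target_class=0, n_steps=1)
print("max abs error:", np.abs(recon - target).max())
```

With a constant oracle velocity, a single Euler step already reaches the target, which illustrates why a prior close to the target can make 1-2 integration steps sufficient. A learned network would only approximate this velocity, so the error would depend on how well the flow is fit.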