IV · CV · LG · Mar 16, 2024

ContourDiff: Unpaired Image-to-Image Translation with Structural Consistency for Medical Imaging

arXiv:2403.10786v2 · 4 citations · h-index: 13 · Machine Learning for Biomedical Imaging
Originality: Incremental advance
AI Analysis

The paper addresses structural consistency in medical image translation, which downstream clinical and machine learning applications depend on. It is a domain-specific advance.

The paper tackles the problem of preserving anatomical structures in unpaired image-to-image translation for medical imaging, such as CT-to-MRI. It proposes ContourDiff, which uses contour representations to constrain a diffusion model at sampling time. The authors report that it outperforms other unpaired translation methods by a significant margin on almost all metrics, covering both downstream segmentation performance and image quality (foreground FID and KID).

Preserving object structure through image-to-image translation is crucial, particularly in applications such as medical imaging (e.g., CT-to-MRI translation), where downstream clinical and machine learning applications will often rely on such preservation. However, typical image-to-image translation algorithms prioritize perceptual quality with respect to output domain features over the preservation of anatomical structures. To address these challenges, we first introduce a novel metric to quantify the structural bias between domains which must be considered for proper translation. We then propose ContourDiff, a novel image-to-image translation algorithm that leverages domain-invariant anatomical contour representations of images to preserve the anatomical structures during translation. These contour representations are simple to extract from images, yet form precise spatial constraints on their anatomical content. ContourDiff applies an input image contour representation as a constraint at every sampling step of a diffusion model trained in the output domain, ensuring anatomical content preservation for the output image. We evaluate our method on challenging lumbar spine and hip-and-thigh CT-to-MRI translation tasks, via (1) the performance of segmentation models trained on translated images applied to real MRIs, and (2) the foreground FID and KID of translated images with respect to real MRIs. Our method outperforms other unpaired image translation methods by a significant margin across almost all metrics and scenarios. Moreover, it achieves this without the need to access any input domain information during training.
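The sampling idea in the abstract can be illustrated with a minimal sketch. Everything here is an assumption made for illustration and not the authors' implementation: the contour is taken from a simple gradient-magnitude threshold (a stand-in for whatever edge extractor the paper uses), the contour is passed to the denoiser as an extra input channel at every step, the noise schedule is a generic DDPM-style one, and `extract_contour`, `sample_with_contour`, and `dummy_denoiser` are hypothetical names. The denoiser is an untrained placeholder, so the output is not a meaningful MRI.

```python
import numpy as np

def extract_contour(image, threshold=0.1):
    """Binary contour map from gradient magnitude (assumed stand-in edge extractor)."""
    gy, gx = np.gradient(image.astype(np.float64))
    magnitude = np.hypot(gx, gy)
    return (magnitude > threshold).astype(np.float64)

def sample_with_contour(denoiser, contour, num_steps=10, seed=0):
    """Toy DDPM-style ancestral sampling conditioned on a contour at every step."""
    rng = np.random.default_rng(seed)
    betas = np.linspace(1e-4, 0.02, num_steps)   # assumed linear noise schedule
    alphas = 1.0 - betas
    alpha_bars = np.cumprod(alphas)

    x = rng.standard_normal(contour.shape)       # start from pure noise
    for t in reversed(range(num_steps)):
        # The contour of the INPUT image is supplied at every sampling step
        # (here by stacking it as a second channel; an assumption).
        model_input = np.stack([x, contour], axis=0)
        eps_hat = denoiser(model_input, t)
        mean = (x - betas[t] / np.sqrt(1.0 - alpha_bars[t]) * eps_hat) / np.sqrt(alphas[t])
        noise = rng.standard_normal(x.shape) if t > 0 else 0.0
        x = mean + np.sqrt(betas[t]) * noise
    return x

def dummy_denoiser(model_input, t):
    """Placeholder for a network trained on output-domain (e.g. MRI) images."""
    noisy, contour_channel = model_input
    return noisy - contour_channel               # arbitrary, untrained behaviour

# Usage: a synthetic "CT" slice containing one bright square.
ct_slice = np.zeros((16, 16))
ct_slice[4:12, 4:12] = 1.0
contour = extract_contour(ct_slice)
translated = sample_with_contour(dummy_denoiser, contour, num_steps=5)
```

Because only the contour of the input image enters the sampler, a denoiser used this way could be trained on output-domain images and their contours alone, which is consistent with the abstract's claim that no input-domain information is needed during training.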

