IV, CV, LG
Apr 10, 2024

Rethinking Perceptual Metrics for Medical Image Translation

arXiv:2404.07318v1
7 citations
h-index: 13
Originality: Synthesis-oriented
AI Analysis

This work addresses the problem of unreliable evaluation metrics for medical image translation. The contribution is incremental: it highlights issues with existing metrics and suggests a potential alternative.

The paper investigates evaluation metrics for medical image translation. It finds that perceptual metrics such as FID do not correlate well with segmentation metrics, because they extend poorly to the anatomical constraints of medical images, but that the pixel-level SWD metric may be useful for subtle intra-modality translation.

Modern medical image translation methods use generative models for tasks such as the conversion of CT images to MRI. Evaluating these methods typically relies on some chosen downstream task in the target domain, such as segmentation. On the other hand, task-agnostic metrics are attractive, such as the network feature-based perceptual metrics (e.g., FID) that are common to image translation in general computer vision. In this paper, we investigate evaluation metrics for medical image translation on two medical image translation tasks (GE breast MRI to Siemens breast MRI and lumbar spine MRI to CT), tested on various state-of-the-art translation methods. We show that perceptual metrics do not generally correlate with segmentation metrics due to them extending poorly to the anatomical constraints of this sub-field, with FID being especially inconsistent. However, we find that the lesser-used pixel-level SWD metric may be useful for subtle intra-modality translation. Our results demonstrate the need for further research into helpful metrics for medical image translation.
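To make the pixel-level SWD idea concrete, the following is a minimal sketch of a sliced Wasserstein distance between two sets of flattened images or patches. It is an illustration under stated assumptions, not the paper's implementation: the function name `sliced_wasserstein_distance`, the parameters `n_projections` and `seed`, and the choice to operate directly on raw flattened pixel vectors (with no multi-scale pyramid or patch sampling) are all assumptions made here for clarity.

```python
import numpy as np


def sliced_wasserstein_distance(x, y, n_projections=128, seed=0):
    """Approximate sliced Wasserstein-1 distance between two sample sets.

    x, y: arrays of shape (n_samples, n_features), e.g. flattened images
    or patches. This sketch assumes both sets have the same shape.
    """
    x = np.asarray(x, dtype=np.float64)
    y = np.asarray(y, dtype=np.float64)
    if x.ndim != 2 or x.shape != y.shape:
        raise ValueError("expected two 2-D arrays of identical shape")

    rng = np.random.default_rng(seed)

    # Draw random directions and normalise them to unit length.
    directions = rng.normal(size=(x.shape[1], n_projections))
    directions /= np.linalg.norm(directions, axis=0, keepdims=True)

    # Project both sets onto every direction, then sort each projection.
    # For equal sample counts, the 1-D Wasserstein-1 distance is the mean
    # absolute difference between the sorted projected values.
    x_proj = np.sort(x @ directions, axis=0)
    y_proj = np.sort(y @ directions, axis=0)

    # Average over samples and over projection directions.
    return float(np.mean(np.abs(x_proj - y_proj)))


if __name__ == "__main__":
    rng = np.random.default_rng(42)
    # Hypothetical stand-ins for target-domain and translated images.
    target = rng.random((64, 16 * 16))
    translated = target + 0.1 * rng.standard_normal(target.shape)
    print(sliced_wasserstein_distance(target, translated))
```

A batch of 2-D images of shape `(n, h, w)` can be passed as `images.reshape(len(images), -1)`. Because the distance is computed on pixel intensities rather than on features from a pretrained network, it is sensitive to small intensity-distribution shifts, which is one plausible reading of why such a metric could suit subtle intra-modality translation.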

