Deep Class-Specific Affinity-Guided Convolutional Network for Multimodal Unpaired Image Segmentation
This work provides an incremental improvement for medical image segmentation, specifically for scenarios with unpaired multimodal inputs, which is relevant for clinicians and researchers in medical imaging.
This paper addresses the challenge of multimodal medical image segmentation when input modalities are spatially unaligned. The authors propose a deep class-specific affinity-guided convolutional network that utilizes class-specific affinity matrices to encode hierarchical feature reasoning, enabling training with unpaired multimodal inputs. Their method outperforms state-of-the-art methods on two public multimodal benchmark datasets.
Multimodal medical image segmentation plays an essential role in clinical diagnosis. It remains challenging because the input modalities are often not well aligned spatially. Existing learning-based methods mainly share trainable layers across modalities and minimize visual feature discrepancies. While the problem is often formulated as joint supervised feature learning, multi-scale features and class-specific representations have not yet been explored. In this paper, we propose an affinity-guided fully convolutional network for multimodal image segmentation. To learn effective representations, we design class-specific affinity matrices that encode the knowledge of hierarchical feature reasoning, and combine them with shared convolutional layers to ensure cross-modality generalization. Our affinity matrix does not depend on the spatial alignment of the visual features and thus allows us to train with unpaired multimodal inputs. We extensively evaluate our method on two public multimodal benchmark datasets, where it outperforms state-of-the-art methods.
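To illustrate why an affinity matrix can be independent of spatial alignment, the following minimal sketch computes one channel-by-channel affinity matrix per class by weighting each spatial position with its class probability and summing over positions. This is an assumed formulation for illustration only: the abstract does not give the authors' exact definition, and the function names, the flattened list layout, and the mean-squared discrepancy are all hypothetical.

```python
def class_specific_affinity(features, class_probs, eps=1e-8):
    """Illustrative class-weighted channel affinity (an assumption, not the paper's definition).

    features:    C lists, each holding N feature values (one per flattened spatial position)
    class_probs: K lists, each holding N class probabilities for the same positions
    returns:     K matrices of size C x C
    """
    n_channels = len(features)
    affinities = []
    for probs in class_probs:
        # Total probability mass of this class; eps avoids division by zero.
        mass = sum(probs) + eps
        # A_k[i][j] = sum_n p_k[n] * f_i[n] * f_j[n] / mass
        # The spatial index n is summed out, so the result has no spatial layout.
        matrix = [
            [
                sum(p * fi * fj for p, fi, fj in zip(probs, features[i], features[j])) / mass
                for j in range(n_channels)
            ]
            for i in range(n_channels)
        ]
        affinities.append(matrix)
    return affinities


def affinity_discrepancy(a, b):
    """Mean squared difference between two sets of affinity matrices (hypothetical measure)."""
    diffs = [
        (x - y) ** 2
        for mat_a, mat_b in zip(a, b)
        for row_a, row_b in zip(mat_a, mat_b)
        for x, y in zip(row_a, row_b)
    ]
    return sum(diffs) / len(diffs)
```

Because each matrix is a sum over positions, it has shape C x C regardless of image size or anatomy layout, so matrices computed from two modalities could in principle be compared without any pixel-wise correspondence between the images. Applying the same reordering of spatial positions to both the features and the class probabilities leaves the result unchanged.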