CV | LG | May 21, 2024

Self-Supervised Modality-Agnostic Pre-Training of Swin Transformers

arXiv:2405.12781v1 | 2 citations | h-index: 31 | Has Code | ISBI
Originality: Incremental advance
AI Analysis

This work addresses domain shift issues in medical imaging for improved model generalizability, though it is incremental as it builds on existing Swin Transformer and multi-modal fusion techniques.

The paper tackles domain shift in unsupervised pre-training for medical imaging by augmenting the Swin Transformer to learn from multiple modalities (CT and MRI). Compared to single-modality models, this costs a modest 1-2% in performance on in-distribution tasks but yields up to 27% improvement on out-of-distribution modalities.

Unsupervised pre-training has emerged as a transformative paradigm, displaying remarkable advancements in various domains. However, its susceptibility to domain shift, where the pre-training data distribution differs from the fine-tuning distribution, poses a significant obstacle. To address this, we augment the Swin Transformer to learn from different medical imaging modalities, enhancing downstream performance. Our model, dubbed SwinFUSE (Swin Multi-Modal Fusion for UnSupervised Enhancement), offers three key advantages: (i) it learns from both Computed Tomography (CT) and Magnetic Resonance Imaging (MRI) during pre-training, resulting in complementary feature representations; (ii) it includes a domain-invariance module (DIM) that effectively highlights salient input regions, enhancing adaptability; (iii) it exhibits remarkable generalizability, surpassing the confines of the tasks it was initially pre-trained on. Our experiments on two publicly available 3D segmentation datasets show a modest 1-2% performance trade-off compared to single-modality models, yet significant out-performance of up to 27% on an out-of-distribution modality. This substantial improvement underscores our proposed approach's practical relevance and real-world applicability. Code is available at: https://github.com/devalab/SwinFUSE
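As a rough illustration of point (i), the sketch below shows one simple way to draw pre-training batches from a combined CT and MRI pool so that the model sees both modalities in every epoch. It is not taken from the SwinFUSE repository: the function name `mixed_modality_batches`, the volume identifiers, and the uniform shuffling strategy are all assumptions made for illustration, and the abstract does not say how the authors actually schedule the two modalities.

```python
import random
from typing import Iterator, List, Sequence, Tuple


def mixed_modality_batches(
    ct_volumes: Sequence[str],
    mri_volumes: Sequence[str],
    batch_size: int,
    seed: int = 0,
) -> Iterator[List[Tuple[str, str]]]:
    """Yield shuffled batches of (modality, volume_id) pairs from both pools.

    Hypothetical helper: each volume is tagged with its modality, the two
    pools are merged and shuffled, and the result is cut into batches.
    """
    rng = random.Random(seed)
    pool = [("CT", v) for v in ct_volumes] + [("MRI", v) for v in mri_volumes]
    rng.shuffle(pool)
    for start in range(0, len(pool), batch_size):
        yield pool[start:start + batch_size]


# Placeholder identifiers standing in for real 3D volumes.
ct_ids = [f"ct_{i:03d}" for i in range(6)]
mri_ids = [f"mri_{i:03d}" for i in range(4)]
batches = list(mixed_modality_batches(ct_ids, mri_ids, batch_size=4))
```

In a real pipeline the identifiers would be replaced by loaded volumes, and the sampling ratio between modalities would likely need tuning when one pool is much larger than the other.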

Code Implementations: 1 repo
