ModeTv2: GPU-accelerated Motion Decomposition Transformer for Pairwise Optimization in Medical Image Registration
This work addresses usability and precision challenges in medical image registration for disease diagnosis and interventions, representing an incremental improvement over existing deep learning methods.
The paper tackles the problem of slow and imprecise deformable image registration in medical imaging by introducing ModeTv2, a GPU-accelerated pyramid network with an enhanced motion decomposition Transformer. The network achieves superior pairwise optimization with improved computational efficiency and deformation realism across three public brain MRI datasets and one abdominal CT dataset.
Deformable image registration plays a crucial role in medical imaging, aiding disease diagnosis and image-guided interventions. Traditional iterative methods are slow, while deep learning (DL) methods are faster but face usability and precision challenges. This study introduces a pyramid network with the enhanced motion decomposition Transformer (ModeTv2) operator, showcasing superior pairwise optimization (PO) capability akin to that of traditional methods. We re-implement the ModeT operator with CUDA extensions to enhance its computational efficiency. We further propose the RegHead module, which refines deformation fields, improves the realism of the deformation, and reduces the number of parameters. By adopting PO, the proposed network balances accuracy, efficiency, and generalizability. Extensive experiments on three public brain MRI datasets and one abdominal CT dataset demonstrate the network's suitability for PO, providing a DL model with enhanced usability and interpretability. The code is publicly available at https://github.com/ZAX130/ModeTv2.
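To make the idea of pairwise optimization concrete, the following is a minimal sketch, not the ModeTv2 implementation. It assumes a toy 1-D setting in which a displacement field is refined for one specific fixed/moving pair by minimizing a mean-squared-error similarity term plus a first-order smoothness penalty. The names `warp`, `pair_loss`, `numerical_gradient`, `pairwise_optimize`, and `bump`, the loss terms, and all hyperparameters are illustrative assumptions. In the paper, PO is applied through the network rather than to a raw displacement field.

```python
import math


def warp(moving, disp):
    """Sample `moving` at positions i + disp[i] using linear interpolation."""
    n = len(moving)
    warped = []
    for i, d in enumerate(disp):
        x = min(max(i + d, 0.0), n - 1.0)  # clamp to the signal bounds
        lo = min(int(x), n - 2)            # left neighbour index
        t = x - lo                         # fractional offset in [0, 1]
        warped.append((1.0 - t) * moving[lo] + t * moving[lo + 1])
    return warped


def pair_loss(fixed, moving, disp, lam):
    """MSE similarity plus a smoothness penalty (both are assumed choices)."""
    warped = warp(moving, disp)
    similarity = sum((w - f) ** 2 for w, f in zip(warped, fixed)) / len(fixed)
    smoothness = sum((b - a) ** 2 for a, b in zip(disp, disp[1:])) / (len(disp) - 1)
    return similarity + lam * smoothness


def numerical_gradient(fixed, moving, disp, lam, eps=1e-3):
    """Central finite differences; a real system would use autograd instead."""
    grad = []
    for i in range(len(disp)):
        plus = disp[:i] + [disp[i] + eps] + disp[i + 1:]
        minus = disp[:i] + [disp[i] - eps] + disp[i + 1:]
        diff = pair_loss(fixed, moving, plus, lam) - pair_loss(fixed, moving, minus, lam)
        grad.append(diff / (2 * eps))
    return grad


def pairwise_optimize(fixed, moving, steps=100, lr=10.0, lam=0.1):
    """Refine a displacement field for this one pair only."""
    disp = [0.0] * len(fixed)
    best = pair_loss(fixed, moving, disp, lam)
    for _ in range(steps):
        grad = numerical_gradient(fixed, moving, disp, lam)
        candidate = [d - lr * g for d, g in zip(disp, grad)]
        cand_loss = pair_loss(fixed, moving, candidate, lam)
        if cand_loss < best:
            disp, best = candidate, cand_loss  # accept only improving steps
        else:
            lr *= 0.5                          # otherwise shrink the step size
    return disp, best


def bump(center, n=32, sigma=3.0):
    """Hypothetical toy signal: a Gaussian bump centred at `center`."""
    return [math.exp(-((i - center) ** 2) / (2 * sigma ** 2)) for i in range(n)]


fixed = bump(16.0)
moving = bump(18.0)  # same bump, shifted by two samples
initial_loss = pair_loss(fixed, moving, [0.0] * len(fixed), lam=0.1)
disp, final_loss = pairwise_optimize(fixed, moving)
print(f"loss before: {initial_loss:.5f}, after: {final_loss:.5f}")
```

Because a step is accepted only when it lowers the loss, the objective for this pair never increases. This mirrors how traditional iterative registration behaves, at the cost of extra computation per pair.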