IV, CV | Jun 24, 2024

Multi-Aperture Fusion of Transformer-Convolutional Network (MFTC-Net) for 3D Medical Image Segmentation and Visualization

Siyavash Shabani, Muhammad Sohaib, Sahar A. Mohammed, Bahram Parvin

arXiv:2406.17080v1 | 6 citations | Has Code

Originality: Incremental advance

AI Analysis

This work addresses segmentation accuracy and efficiency for medical imaging applications, representing an incremental improvement over existing methods.

The study tackles 3D medical image segmentation by introducing MFTC-Net, which fuses the outputs of Swin Transformers and convolutional blocks through multi-aperture fusion. It reports a Dice score of 89.73 and an HD95 of 7.31 on the Synapse multi-organ dataset, at a reduced complexity of approximately 40 million parameters.

Vision Transformers have shown superior performance to traditional convolution-based frameworks in many vision applications, including the segmentation of 3D medical images. To further advance this area, this study introduces the Multi-Aperture Fusion of Transformer-Convolutional Network (MFTC-Net), which integrates the outputs of Swin Transformers and their corresponding convolutional blocks using 3D fusion blocks. The Multi-Aperture design incorporates each image patch at its original resolution together with its pyramid representation to better preserve minute details. The proposed architecture achieves scores of 89.73 and 7.31 for Dice and HD95, respectively, on the Synapse multi-organ dataset, an improvement over published results. The improved performance also comes with the added benefit of reduced complexity, at approximately 40 million parameters. Our code is available at https://github.com/Siyavashshabani/MFTC-Net
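For readers unfamiliar with the Dice metric quoted in the abstract, the following is a minimal NumPy sketch of a per-label Dice overlap averaged over organ labels. It is not taken from the MFTC-Net repository: the function names `dice_score` and `mean_dice`, the toy volumes, and the choice to skip labels absent from both volumes are assumptions made for illustration. It also assumes the reported 89.73 is a percentage, so the values here would be multiplied by 100 to compare.

```python
import numpy as np


def dice_score(pred, target, label):
    """Dice overlap for one label between two integer label volumes."""
    p = pred == label
    t = target == label
    denom = p.sum() + t.sum()
    if denom == 0:
        # Label absent from both volumes: undefined, reported as NaN (assumption).
        return float("nan")
    return float(2.0 * np.logical_and(p, t).sum() / denom)


def mean_dice(pred, target, labels):
    """Average the per-label Dice scores, ignoring undefined (NaN) labels."""
    scores = [dice_score(pred, target, label) for label in labels]
    return float(np.nanmean(scores))


# Toy 4x4x4 volumes with two hypothetical organ labels (1 and 2).
target = np.zeros((4, 4, 4), dtype=np.int64)
target[:2] = 1
target[2:] = 2

pred = np.zeros((4, 4, 4), dtype=np.int64)
pred[:1] = 1   # under-segments label 1
pred[1:] = 2   # over-segments label 2

print(dice_score(pred, target, 1))
print(dice_score(pred, target, 2))
print(mean_dice(pred, target, [1, 2]))
```

In the toy case, label 1 is under-segmented and label 2 over-segmented, so both scores fall below 1. HD95, the second metric in the abstract, measures boundary distance rather than overlap and is not covered by this sketch.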
