CV AI

Feb 16, 2021

TransFuse: Fusing Transformers and CNNs for Medical Image Segmentation

arXiv:2102.08005v2

36.41386 citations

Has Code

Originality: Highly original

AI Analysis

TransFuse addresses the need for more efficient and accurate segmentation in clinical applications. It is a novel hybrid approach rather than an incremental improvement.

The paper tackles the problem of medical image segmentation by proposing TransFuse, a parallel architecture that combines Transformers and CNNs to efficiently capture global dependencies and low-level details, achieving state-of-the-art results on multiple datasets with significant parameter reduction and speed improvement.

Medical image segmentation, the prerequisite of numerous clinical needs, has advanced significantly with recent progress in convolutional neural networks (CNNs). However, CNNs show general limitations in modeling explicit long-range relations, and existing remedies, which rely on deep encoders with aggressive downsampling, lead to redundantly deep networks and loss of localized details. Hence, the segmentation task awaits a better solution that models global context more efficiently while maintaining a strong grasp of low-level details. In this paper, we propose a novel parallel-in-branch architecture, TransFuse, to address this challenge. TransFuse combines Transformers and CNNs in parallel, so that both global dependencies and low-level spatial details can be captured efficiently with a much shallower network. In addition, a novel fusion technique, the BiFusion module, is introduced to efficiently fuse the multi-level features from both branches. Extensive experiments demonstrate that TransFuse achieves state-of-the-art results on both 2D and 3D medical image sets, including polyp, skin lesion, hip, and prostate segmentation, with a significant reduction in parameters and faster inference.
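The parallel-branch idea can be illustrated with a toy sketch in plain Python. This is not the TransFuse implementation and not the BiFusion module: the 1D signal, the function names (`local_branch`, `global_branch`, `fuse`), the smoothing kernel, and the concatenate-plus-product fusion rule are all assumptions chosen only to show how a convolution-like branch sees neighbours while an attention-like branch sees every position, and how the two outputs might then be combined per position.

```python
import math


def local_branch(x, kernel=(0.25, 0.5, 0.25)):
    """CNN-like branch (assumed toy): 1D convolution with zero padding.

    Each output depends only on a small neighbourhood of the input.
    """
    pad = len(kernel) // 2
    padded = [0.0] * pad + list(x) + [0.0] * pad
    return [
        sum(k * padded[i + j] for j, k in enumerate(kernel))
        for i in range(len(x))
    ]


def global_branch(x):
    """Transformer-like branch (assumed toy): single-head self-attention
    over scalar tokens, with queries, keys, and values all equal to x.

    Each output is a softmax-weighted average over every position.
    """
    out = []
    for q in x:
        scores = [q * k for k in x]
        m = max(scores)  # subtract the max for numerical stability
        weights = [math.exp(s - m) for s in scores]
        total = sum(weights)
        out.append(sum(w * v for w, v in zip(weights, x)) / total)
    return out


def fuse(local_feats, global_feats):
    """Hypothetical fusion rule: keep both features plus their product."""
    return [(l, g, l * g) for l, g in zip(local_feats, global_feats)]


def parallel_forward(x):
    """Run both branches on the same input, then fuse position by position."""
    return fuse(local_branch(x), global_branch(x))


signal = [0.0, 0.0, 1.0, 0.0, 0.0]
far_change = [0.0, 0.0, 1.0, 0.0, 5.0]  # only the last position differs

fused = parallel_forward(signal)
```

Comparing `local_branch(signal)[0]` with `local_branch(far_change)[0]` gives the same value, whereas `global_branch` at position 0 changes, because the attention-like branch mixes in the distant position. That contrast is the motivation the abstract gives for running both kinds of branch side by side.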
