CV · AI · Oct 28, 2021

Dispensed Transformer Network for Unsupervised Domain Adaptation

Yunxiang Li, Jingxiong Li, Ruilong Dan, Shuai Wang, Kai Jin, Guodong Zeng, Jun Wang, Xiangji Pan, Qianni Zhang, Huiyu Zhou, Qun Jin, Li Wang

arXiv:2110.14944v14.75 citations · h-index: 87

Originality: Incremental advance

AI Analysis

This work addresses the challenge of unsupervised domain adaptation for medical image segmentation, which is crucial for reducing annotation costs and improving cross-site or cross-modality performance, though it appears incremental as it builds on existing UDA and Transformer methods.

The paper tackles the problem of costly data annotation and performance degradation in cross-domain medical image segmentation by introducing the Dispensed Transformer Network (DTNet), which achieves state-of-the-art performance on a large fluorescein angiography retinal dataset and a cross-modality challenge dataset.

Accurate segmentation is a crucial step in medical image analysis, and supervised machine learning has proven effective for segmenting organs and lesions. However, annotating data to provide ground-truth labels for supervised training is costly, and the high variance of data from different domains tends to severely degrade performance on cross-site or cross-modality datasets. To mitigate this problem, this paper introduces a novel unsupervised domain adaptation (UDA) method named the dispensed Transformer network (DTNet). DTNet contains three modules. First, a dispensed residual Transformer block realizes global attention through a dispensed interleaving operation and addresses the excessive computational cost and GPU memory usage of the Transformer. Second, a multi-scale consistency regularization is proposed to alleviate the loss of detail in the low-resolution output for better feature alignment. Finally, a feature ranking discriminator automatically assigns different weights to domain-gap features to reduce the feature distribution distance, and with it the performance shift between the two domains. The proposed method is evaluated on a large fluorescein angiography (FA) retinal nonperfusion (RNP) cross-site dataset with 676 images and a widely used cross-modality dataset from the MM-WHS challenge. Extensive results demonstrate that our proposed network achieves the best performance in comparison with several state-of-the-art techniques.
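
The abstract does not spell out the dispensed interleaving operation. As a rough illustration only, the sketch below assumes it resembles a strided (interleaved) partition of token positions, so that each group spans the whole sequence while attention is computed only within a group. The function names `interleave_groups` and `attention_pairs`, the 1D layout, and the stride are hypothetical and are not taken from the paper.

```python
# Illustrative sketch only: an ASSUMED strided partition, not the paper's exact operation.

def interleave_groups(n, stride):
    """Split positions 0..n-1 into `stride` groups; group g holds g, g+stride, g+2*stride, ..."""
    return [list(range(g, n, stride)) for g in range(stride)]


def attention_pairs(groups):
    """Count query-key pairs when attention is restricted to positions within each group."""
    return sum(len(group) ** 2 for group in groups)


n, stride = 16, 4
groups = interleave_groups(n, stride)

full_cost = n * n                       # dense attention over all positions
grouped_cost = attention_pairs(groups)  # attention within interleaved groups only

print(groups[0])                # positions in the first group, spread across the sequence
print(full_cost, grouped_cost)  # pair counts for dense vs. grouped attention
```

Under this assumption, every group still touches positions from the start to the end of the sequence, which is one way long-range context could be retained while the number of query-key pairs drops by a factor of the stride.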
