CV · Jun 30, 2023

Why does my medical AI look at pictures of birds? Exploring the efficacy of transfer learning across domain boundaries

Frederic Jonske, Moon Kim, Enrico Nasca, Janis Evers, Johannes Haubold, René Hosch, Felix Nensa, Michael Kamp, Constantin Seibold, Jan Egger, Jens Kleesiek

arXiv:2306.17555v2 · 2.85 citations · h-index: 38 · Has Code

Originality: Incremental advance

AI Analysis

This paper addresses suboptimal transfer learning for medical imaging practitioners. The advance is incremental because it builds on existing self-supervised pretraining methods.

The paper asks whether pretraining on domain-specific data outperforms ImageNet-pretrained models in medical AI. It finds that intra-domain transfer with RadNet-12M yields a 0.44% to 2.07% performance increase over cross-domain transfer, depending on the experiment.

It is an open secret that ImageNet is treated as the panacea of pretraining. Particularly in medical machine learning, models not trained from scratch are often finetuned based on ImageNet-pretrained models. We posit that pretraining on data from the domain of the downstream task should almost always be preferred instead. We leverage RadNet-12M, a dataset containing more than 12 million computed tomography (CT) image slices, to explore the efficacy of self-supervised pretraining on medical and natural images. Our experiments cover intra- and cross-domain transfer scenarios, varying data scales, finetuning vs. linear evaluation, and feature space analysis. We observe that intra-domain transfer compares favorably to cross-domain transfer, achieving comparable or improved performance (0.44% - 2.07% performance increase using RadNet pretraining, depending on the experiment) and demonstrate the existence of a domain boundary-related generalization gap and domain-specific learned features.

