LGIVMLAug 16, 2019

Knowledge distillation for semi-supervised domain adaptation

arXiv:1908.07355v10.0033 citations
AI Analysis50

This addresses domain adaptation in medical imaging for researchers and practitioners, offering a more generally applicable method without dataset-specific tuning, though it appears incremental as it builds on existing knowledge distillation techniques.

The paper tackles the problem of deep neural networks overfitting to limited annotated data and performing poorly on unseen domains, proposing knowledge distillation for semi-supervised domain adaptation to improve segmentation of white matter hyperintensities in MRI scans, achieving significantly higher dice scores compared to baseline and adversarial methods.

In the absence of sufficient data variation (e.g., scanner and protocol variability) in annotated data, deep neural networks (DNNs) tend to overfit during training. As a result, their performance is significantly lower on data from unseen sources compared to the performance on data from the same source as the training data. Semi-supervised domain adaptation methods can alleviate this problem by tuning networks to new target domains without the need for annotated data from these domains. Adversarial domain adaptation (ADA) methods are a popular choice that aim to train networks in such a way that the features generated are domain agnostic. However, these methods require careful dataset-specific selection of hyperparameters such as the complexity of the discriminator in order to achieve a reasonable performance. We propose to use knowledge distillation (KD) -- an efficient way of transferring knowledge between different DNNs -- for semi-supervised domain adaption of DNNs. It does not require dataset-specific hyperparameter tuning, making it generally applicable. The proposed method is compared to ADA for segmentation of white matter hyperintensities (WMH) in magnetic resonance imaging (MRI) scans generated by scanners that are not a part of the training set. Compared with both the baseline DNN (trained on source domain only and without any adaption to target domain) and with using ADA for semi-supervised domain adaptation, the proposed method achieves significantly higher WMH dice scores.

Foundations

The foundational work for this paper's niche, ranked by how specifically the neighbourhood builds on it — not by global fame.

Your Notes