A Closer Look at Domain Shift for Deep Learning in Histopathology
This work addresses generalization challenges for deep learning in medical imaging across different centers and scanners, but it is incremental, as it focuses on analysis and a new evaluation measure rather than a breakthrough solution.
The study investigated domain shift in histopathology by analyzing how data augmentation and normalization affect convolutional neural networks for tumor classification, finding that training data preparation heavily influences learning and that latent representations are sensitive to distribution changes.
Domain shift is a significant problem in histopathology. There can be large differences in data characteristics of whole-slide images between medical centers and scanners, making generalization of deep learning to unseen data difficult. To gain a better understanding of the problem, we present a study on convolutional neural networks trained for tumor classification of H&E stained whole-slide images. We analyze how augmentation and normalization strategies affect performance and learned representations, and what features a trained model responds to. Most centrally, we present a novel measure for evaluating the distance between domains in the context of the learned representation of a particular model. This measure can reveal how sensitive a model is to domain variations, and can be used to detect new data that a model will have problems generalizing to. The results show how learning is heavily influenced by the preparation of training data, and that the latent representation used for classification is sensitive to changes in data distribution, especially when training without augmentation or normalization.
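
The abstract does not state how the distance between domains is computed, so the following is only an illustrative sketch of the general idea and not the measure proposed in the study. It assumes that each image has already been reduced to a vector of per-feature activations from some layer of a trained model (for example, the spatial mean of each filter's response). It then compares the two domains feature by feature with an approximate one-dimensional Wasserstein distance. The function names, the choice of Wasserstein distance, and the quantile approximation are all assumptions made for illustration.

```python
import numpy as np


def wasserstein_1d(a, b, n_quantiles=101):
    """Approximate 1-D Wasserstein-1 distance between two empirical samples.

    Assumption: the distance is approximated by averaging the absolute
    difference between matching quantiles of the two samples.
    """
    q = np.linspace(0.0, 1.0, n_quantiles)
    return float(np.mean(np.abs(np.quantile(a, q) - np.quantile(b, q))))


def representation_distance(feats_source, feats_target):
    """Mean per-feature distance between two sets of activations.

    Both inputs have shape (n_images, n_features). The arrays are
    hypothetical stand-ins for activations extracted from one layer of a
    trained classifier; the number of images may differ between domains.
    """
    feats_source = np.asarray(feats_source, dtype=float)
    feats_target = np.asarray(feats_target, dtype=float)
    if feats_source.shape[1] != feats_target.shape[1]:
        raise ValueError("both domains must have the same number of features")
    per_feature = [
        wasserstein_1d(feats_source[:, k], feats_target[:, k])
        for k in range(feats_source.shape[1])
    ]
    return float(np.mean(per_feature))
```

Under these assumptions, a larger value would suggest that the model's internal representation of the new data differs more from that of the training data. For instance, `representation_distance(train_feats, new_center_feats)` could be compared against the value obtained for held-out data from the training center, where `train_feats` and `new_center_feats` are hypothetical activation arrays.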