Deep Transfer Learning for Infectious Disease Case Detection Using Electronic Medical Records
This work addresses the challenge of cross-regional infectious disease detection using EMRs, but it is incremental as it applies existing transfer learning methods to this domain.
The study tackles the distribution shift that arises when electronic medical records, or models learned from them, are shared across regions for infectious disease detection. It finds that transfer learning is useful in two situations: when the source and target are similar and the target training data is insufficient, and when the target training data is unlabeled. In the first situation, model-based methods perform comparably to data-based ones.
During an infectious disease pandemic, it is critical to share electronic medical records or models (learned from these records) across regions. Applying one region's data or model to another region often introduces distribution shift that violates the assumptions of traditional machine learning techniques. Transfer learning can be a solution. To explore the potential of deep transfer learning algorithms, we applied two data-based algorithms (domain adversarial neural networks and maximum classifier discrepancy) and model-based transfer learning algorithms to infectious disease detection tasks. We further studied well-defined synthetic scenarios in which the data distribution differences between two regions are known. Our experiments show that, in the context of infectious disease classification, transfer learning may be useful when (1) the source and target are similar and the target training data is insufficient and (2) the target training data does not have labels. Model-based transfer learning works well in the first situation, where its performance closely matched that of the data-based transfer learning models. Still, further investigation of domain shift in real-world research data is needed to account for the drop in performance.
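To make the synthetic-scenario setup concrete, the following is a minimal sketch, not the study's implementation. It assumes a toy two-feature dataset in which the target region differs from the source by a known shift in the feature means, and it uses logistic regression in place of a deep network. Fine-tuning a source-trained model on a small labeled target set stands in for model-based transfer learning. The names `make_region` and `train_logreg`, the shift value, and the sample sizes are all hypothetical choices for illustration.

```python
import numpy as np

def make_region(n, shift, rng):
    """Draw n synthetic records with two features and a binary label.
    `shift` moves the feature means: the known difference between regions (assumed)."""
    y = rng.integers(0, 2, size=n)
    # Positive cases sit 2 units away from negatives on both features.
    x = rng.normal(loc=0.0, scale=1.0, size=(n, 2)) + 2.0 * y[:, None] + shift
    return x, y

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def train_logreg(x, y, w=None, b=0.0, lr=0.1, epochs=200):
    """Gradient descent on the logistic loss.
    Passing w and b warm-starts training from an existing model."""
    w = np.zeros(x.shape[1]) if w is None else w.copy()
    for _ in range(epochs):
        residual = sigmoid(x @ w + b) - y
        w -= lr * (x.T @ residual) / len(y)
        b -= lr * residual.mean()
    return w, b

def accuracy(x, y, w, b):
    return float(((sigmoid(x @ w + b) >= 0.5) == y).mean())

rng = np.random.default_rng(0)
x_src, y_src = make_region(2000, shift=0.0, rng=rng)            # plentiful source data
x_tgt_train, y_tgt_train = make_region(20, shift=0.5, rng=rng)  # scarce target data
x_tgt_test, y_tgt_test = make_region(2000, shift=0.5, rng=rng)

# Source-only model, applied to the target region unchanged.
w_src, b_src = train_logreg(x_src, y_src)
# Model-based transfer (sketch): start from the source weights, fine-tune briefly on target.
w_ft, b_ft = train_logreg(x_tgt_train, y_tgt_train, w=w_src, b=b_src, epochs=20)
# Target-only baseline trained from scratch on the small target set.
w_tgt, b_tgt = train_logreg(x_tgt_train, y_tgt_train)

results = {
    "source_only": accuracy(x_tgt_test, y_tgt_test, w_src, b_src),
    "fine_tuned": accuracy(x_tgt_test, y_tgt_test, w_ft, b_ft),
    "target_only": accuracy(x_tgt_test, y_tgt_test, w_tgt, b_tgt),
}
print(results)
```

Varying `shift` and the size of the target training set in this sketch gives a rough way to probe how similarity between regions and target data scarcity interact. The data-based methods named above (domain adversarial neural networks and maximum classifier discrepancy) require adversarial training and are not represented here.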