Transfer Learning in ECG Diagnosis: Is It Effective?
This work addresses data scarcity in ECG diagnosis and gives medical researchers evidence-based guidelines on when to use transfer learning. Its contribution is incremental: it validates existing assumptions rather than introducing new methods.
The study systematically evaluated transfer learning for multi-label ECG classification. It found that fine-tuning is preferable for small datasets, that training from scratch can match its performance on large datasets at the cost of longer training, and that transfer learning works better with convolutional than with recurrent neural networks.
The adoption of deep learning in ECG diagnosis is often hindered by the scarcity of large, well-labeled datasets in real-world scenarios, leading to the use of transfer learning to leverage features learned from larger datasets. Yet the prevailing assumption that transfer learning consistently outperforms training from scratch has never been systematically validated. In this work, we conduct the first extensive empirical study on the effectiveness of transfer learning in multi-label ECG classification, comparing fine-tuning performance with that of training from scratch across a variety of ECG datasets and deep neural networks. We confirm that fine-tuning is the preferable choice for small downstream datasets; however, when the dataset is sufficiently large, training from scratch can achieve comparable performance, albeit requiring a longer training time to catch up. Furthermore, we find that transfer learning exhibits better compatibility with convolutional neural networks than with recurrent neural networks, the two most prevalent architectures for time-series ECG applications. Our results underscore the importance of transfer learning in ECG diagnosis, yet depending on the amount of available data, researchers may opt not to use it, considering the non-negligible cost associated with pre-training.
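To make the comparison concrete, the following is a minimal sketch of the two training regimes the abstract contrasts, written in PyTorch. It is not the study's implementation: the toy network `SmallECGNet`, the helper names `build_downstream_model` and `train_step`, the layer sizes, the 12-lead input shape, and the random data are all illustrative assumptions. The only difference between the two regimes here is the initialization of the convolutional backbone.

```python
import torch
from torch import nn


class SmallECGNet(nn.Module):
    """Toy 1D CNN (hypothetical): a convolutional backbone plus a multi-label head."""

    def __init__(self, num_leads: int, num_labels: int):
        super().__init__()
        self.backbone = nn.Sequential(
            nn.Conv1d(num_leads, 16, kernel_size=7, padding=3),
            nn.ReLU(),
            nn.Conv1d(16, 32, kernel_size=7, padding=3),
            nn.ReLU(),
            nn.AdaptiveAvgPool1d(1),  # average over time -> (batch, 32, 1)
            nn.Flatten(),             # -> (batch, 32)
        )
        self.head = nn.Linear(32, num_labels)

    def forward(self, x):
        # Raw logits, one per label; the sigmoid is applied inside the loss.
        return self.head(self.backbone(x))


def build_downstream_model(num_leads, num_labels, upstream=None):
    """Return a downstream model, optionally initialized from an upstream one."""
    model = SmallECGNet(num_leads, num_labels)
    if upstream is not None:
        # Fine-tuning: copy the upstream backbone weights. The head stays
        # freshly initialized because the downstream label set may differ.
        model.backbone.load_state_dict(upstream.backbone.state_dict())
    return model


def train_step(model, optimizer, signals, labels):
    """One optimization step with a multi-label (per-label binary) loss."""
    model.train()
    optimizer.zero_grad()
    logits = model(signals)
    loss = nn.functional.binary_cross_entropy_with_logits(logits, labels)
    loss.backward()
    optimizer.step()
    return loss.item()


torch.manual_seed(0)
# Stand-in for a model pre-trained on a large upstream dataset (assumption).
upstream = SmallECGNet(num_leads=12, num_labels=20)

finetuned = build_downstream_model(12, 5, upstream=upstream)  # transfer learning
scratch = build_downstream_model(12, 5)                       # random init

# Random placeholder batch: 8 recordings, 12 leads, 1000 samples each.
signals = torch.randn(8, 12, 1000)
labels = torch.randint(0, 2, (8, 5)).float()

for model in (finetuned, scratch):
    optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)
    train_step(model, optimizer, signals, labels)
```

In this sketch all parameters of the fine-tuned model remain trainable; freezing the backbone would be a different design choice. A real comparison would train both models on the same downstream split and track a multi-label metric over epochs, which is where the longer catch-up time of training from scratch would show up.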