Reducing Intraspecies and Interspecies Covariate Shift in Traumatic Brain Injury EEG of Humans and Mice Using Transfer Euclidean Alignment
This work addresses the challenge of deploying machine learning models in real-world biomedical applications with scarce and variable data, though it is incremental as it applies a transfer learning technique to a specific domain.
The paper tackled the problem of high variability in EEG data across subjects and species for traumatic brain injury classification, introducing Transfer Euclidean Alignment to improve model robustness, resulting in average accuracy increases of 14.42% for intraspecies and 5.53% for interspecies datasets.
While analytics of sleep electroencephalography (EEG) holds certain advantages over other methods in clinical applications, high variability across subjects poses a significant challenge when it comes to deploying machine learning models for classification tasks in the real world. In such instances, machine learning models that exhibit exceptional performance on a specific dataset may not necessarily demonstrate similar proficiency when applied to a distinct dataset for the same task. The scarcity of high-quality biomedical data further compounds this challenge, making it difficult to evaluate the model's generality comprehensively. In this paper, we introduce Transfer Euclidean Alignment - a transfer learning technique to tackle the problem of the dearth of human biomedical data for training deep learning models. We tested the robustness of this transfer learning technique on various rule-based classical machine learning models as well as the EEGNet-based deep learning model by evaluating on different datasets, including human and mouse data in a binary classification task of detecting individuals with versus without traumatic brain injury (TBI). By demonstrating notable improvements with an average increase of 14.42% for intraspecies datasets and 5.53% for interspecies datasets, our findings underscore the importance of the use of transfer learning to improve the performance of machine learning and deep learning models when using diverse datasets for training.