Ensemble learning using individual neonatal data for seizure detection
This work addresses the challenge of data sharing in medical institutions for neonatal seizure detection, offering an incremental improvement over existing methods by simulating a realistic scenario with disjoint data sets.
The paper tackles the problem of detecting neonatal seizures from EEG data when data cannot be shared between institutions. It proposes an ensemble method that trains local detectors on disjoint data sets and aggregates their predictions. The ensemble achieved accuracy comparable to a single detector trained on all data. The weighted mean aggregation scheme performed best overall; the Dawid-Skene method outperformed it only marginally, and only under certain conditions.
Sharing medical data between institutions is difficult in practice due to data protection laws and official procedures within institutions. Therefore, most existing algorithms are trained on relatively small electroencephalogram (EEG) data sets, which is likely to be detrimental to prediction accuracy. In this work, we simulate a case in which the data cannot be shared by splitting a publicly available data set into disjoint sets representing the data held by individual institutions. We propose to train a (local) detector in each institution and to aggregate the individual predictions into one final prediction. Four aggregation schemes are compared, namely the majority vote, the mean, the weighted mean and the Dawid-Skene method. The method was validated on an independent data set using only a subset of EEG channels. The ensemble reaches accuracy comparable to that of a single detector trained on all the data when a sufficient amount of data is available in each institution. The weighted mean aggregation scheme showed the best performance; it was only marginally outperformed by the Dawid-Skene method when the local detectors approach the performance of a single detector trained on all available data.
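To make the aggregation step concrete, the following is a minimal sketch of three of the four schemes (majority vote, mean, weighted mean) applied to a single EEG segment. It is an illustration under stated assumptions, not the implementation used in this work: the function names, the 0.5 decision threshold, and the way the weights are supplied are all hypothetical, since the abstract does not specify them. The Dawid-Skene method is omitted because it requires an iterative estimation of per-detector error rates.

```python
from typing import Sequence


def majority_vote(probs: Sequence[float], threshold: float = 0.5) -> int:
    """Each local detector casts a binary vote; a strict majority decides."""
    votes = sum(1 for p in probs if p >= threshold)
    return 1 if 2 * votes > len(probs) else 0


def mean_aggregate(probs: Sequence[float], threshold: float = 0.5) -> int:
    """Average the detectors' seizure probabilities, then threshold."""
    return 1 if sum(probs) / len(probs) >= threshold else 0


def weighted_mean_aggregate(
    probs: Sequence[float],
    weights: Sequence[float],
    threshold: float = 0.5,
) -> int:
    """Weighted average of probabilities, then threshold.

    The weights are an assumption here (for example, each institution's
    share of the training data); the source does not state how they are set.
    """
    if len(probs) != len(weights):
        raise ValueError("probs and weights must have the same length")
    score = sum(w * p for w, p in zip(weights, probs)) / sum(weights)
    return 1 if score >= threshold else 0
```

For example, with hypothetical local outputs of 0.9, 0.4 and 0.45, the majority vote returns 0 (only one detector exceeds the threshold), while the mean returns 1 (the average is about 0.58). With weights 1, 3 and 3, the weighted mean returns 0, because the two less confident detectors dominate. This shows how the schemes can disagree on the same segment.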