SCORE-IT: A Machine Learning-based Tool for Automatic Standardization of EEG Reports
This work addresses the lack of standardization in EEG reports, which hinders the development of large-scale EEG-based machine learning models for neurological care. It represents an incremental improvement in automated metadata extraction.
The researchers tackled the problem of automatically extracting metadata from unstructured EEG reports by developing a machine learning-based system that identifies seizure type, normality, and epilepsy diagnosis, achieving F1 scores of 0.92, 0.82, and 0.97, respectively, on the TUH EEG corpus.
Machine learning (ML)-based analysis of electroencephalograms (EEGs) is playing an important role in advancing neurological care. However, the difficulty of automatically extracting useful metadata from clinical records hinders the development of large-scale EEG-based ML models. EEG reports, which are the primary source of metadata for EEG studies, suffer from a lack of standardization. Here we propose a machine learning-based system that automatically extracts components of the SCORE specification from unstructured, natural-language EEG reports. Specifically, our system identifies (1) the type of seizure observed in the recording, per physician impression; (2) whether the recording session was normal or abnormal, per physician impression; and (3) whether the patient was diagnosed with epilepsy. We evaluated our system on the publicly available TUH EEG corpus and report F1 scores of 0.92, 0.82, and 0.97 for the three tasks, respectively.
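For readers unfamiliar with the metric, the following is a minimal sketch of how an F1 score can be computed for a labelling task such as normal versus abnormal. It is an illustration only and is not the authors' evaluation code: the function names `f1_per_class` and `macro_f1` and the toy labels are assumptions made here, and the abstract does not state which averaging scheme (binary, micro, macro, or weighted) underlies the reported scores.

```python
from collections import Counter


def f1_per_class(y_true, y_pred):
    """Return {label: F1}, treating each label one-vs-rest.

    Hypothetical helper for illustration; not taken from the paper.
    """
    labels = set(y_true) | set(y_pred)
    tp, fp, fn = Counter(), Counter(), Counter()
    for truth, pred in zip(y_true, y_pred):
        if truth == pred:
            tp[truth] += 1   # correct prediction for this label
        else:
            fp[pred] += 1    # predicted label was wrong
            fn[truth] += 1   # true label was missed
    scores = {}
    for label in labels:
        # F1 = 2*TP / (2*TP + FP + FN), the harmonic mean of precision and recall
        denom = 2 * tp[label] + fp[label] + fn[label]
        scores[label] = 2 * tp[label] / denom if denom else 0.0
    return scores


def macro_f1(y_true, y_pred):
    """Unweighted mean of the per-class F1 scores (one possible averaging)."""
    scores = f1_per_class(y_true, y_pred)
    return sum(scores.values()) / len(scores) if scores else 0.0


# Toy labels, invented for this example (not data from the TUH EEG corpus).
y_true = ["normal", "abnormal", "abnormal", "normal"]
y_pred = ["normal", "abnormal", "normal", "normal"]
per_class = f1_per_class(y_true, y_pred)
overall = macro_f1(y_true, y_pred)
```

In this toy case the "normal" class has two true positives and one false positive, giving F1 = 4/5 = 0.8, while the "abnormal" class has one true positive and one false negative, giving F1 = 2/3.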