QM LG SP | Nov 14, 2023

Understanding learning from EEG data: Combining machine learning and feature engineering based on hidden Markov models and mixed models

Gabriel Rodrigues Palma, Conor Thornberry, Seán Commins, Rafael de Andrade Moral

arXiv:2311.08113v1 | 2.37 citations | h-index: 4 | Has Code

Originality: Synthesis-oriented

AI Analysis

This work addresses EEG data interpretation challenges for cognitive neuroscience researchers, but is incremental in combining existing methods.

The paper tackles the problem of classifying participants as learners or non-learners in a spatial navigation task from EEG data, finding that, after standardisation, only deep neural networks exceeded an AUC of 80% using theta EEG data alone.

Theta oscillations (4-8 Hz), and frontal theta oscillations in particular, are thought to play an important role in spatial navigation, learning, and memory. Electroencephalography (EEG) datasets are highly complex, which makes changes in the neural signal related to behaviour difficult to interpret. However, multiple analytical methods are available for examining complex data structures, especially machine-learning-based techniques. These methods have shown high classification performance, and combining them with feature engineering enhances their capability. This paper proposes using hidden Markov models and linear mixed-effects models to extract features from EEG data. Based on the engineered features obtained from frontal theta EEG data during a spatial navigation task in two key trials (first and last) and between two conditions (learner and non-learner), we analysed the performance of six machine learning methods (Polynomial Support Vector Machines, Non-linear Support Vector Machines, Random Forests, K-Nearest Neighbours, Ridge, and Deep Neural Networks) on classifying learner and non-learner participants. We also analysed how different standardisation methods used to pre-process the EEG data contribute to classification performance. We compared the classification performance for each trial with that obtained from data gathered from the same subjects that included solely coordinate-based features, such as idle time and average speed. We found that more of the machine learning methods classified well when using the coordinate-based data. However, only deep neural networks achieved an area under the ROC curve higher than 80% using the theta EEG data alone. Our findings suggest that standardising the theta EEG data and using deep neural networks enhances the classification of learner and non-learner subjects in a spatial learning task.
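Two quantities in the abstract can be made concrete with a small sketch: standardising a signal before classification, and scoring a classifier by the area under the ROC curve (the 80% threshold corresponds to an AUC of 0.8). The code below is an illustration only, not the authors' pipeline. The function names `zscore` and `roc_auc` and the toy labels and scores are hypothetical, and z-scoring is assumed here as one possible standardisation method, since the abstract does not say which methods were compared.

```python
from statistics import mean, pstdev


def zscore(signal):
    """Standardise a signal to zero mean and unit (population) variance."""
    mu, sigma = mean(signal), pstdev(signal)
    if sigma == 0:
        # A constant signal carries no variation; map it to zeros.
        return [0.0 for _ in signal]
    return [(x - mu) / sigma for x in signal]


def roc_auc(labels, scores):
    """Area under the ROC curve via pairwise comparison.

    Equals the probability that a randomly chosen positive (label 1)
    receives a higher score than a randomly chosen negative (label 0),
    with ties counted as one half.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one example of each class")
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))


# Hypothetical toy data: 1 = learner, 0 = non-learner, with made-up
# classifier scores (not results from the paper).
labels = [1, 1, 1, 0, 0, 0]
scores = [0.9, 0.7, 0.4, 0.6, 0.3, 0.2]

standardised = zscore([4.0, 5.5, 7.0, 6.0])  # made-up theta power values
auc = roc_auc(labels, scores)
print(f"AUC = {auc:.3f}")
```

In the toy data, eight of the nine learner/non-learner pairs are ranked correctly, so the AUC is 8/9, which would clear the 0.8 threshold mentioned above. The pairwise form is quadratic in the number of participants, which is fine for small cohorts; a rank-based implementation would be preferable for large datasets.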
