LG · AI · May 30

Dive into Waves: Morlet Spectral Transformer for Cross-Subject Emotion Decoding from EEG

arXiv:2606.00884 · 37.7

Predicted impact: top 65% in LG · last 90 days

Originality: Incremental advance

AI Analysis

For brain-computer interface researchers, this work provides a cost-effective and interpretable alternative to large-scale pretraining for cross-subject emotion recognition.

The paper tackles cross-subject emotion decoding from EEG, a challenging problem due to weak, noisy, and subject-variable spectral signals. The proposed Morlet Spectral Transformer (MST) outperforms both large pretrained EEG foundation models and frequency-based methods across all SEED-family datasets without pretraining.

We study cross-subject emotion recognition from EEG, a practically important yet challenging problem in brain-computer interfaces. Unlike tasks with clear waveform signatures, emotion-related EEG signals are primarily encoded in spectral power and are weak, noisy, and highly variable across subjects. Existing approaches rely either on large pretrained EEG foundation models, which require massive data yet still struggle with cross-subject variability, or on frequency-domain encoders, which better reflect spectral structure but suffer from mismatched representations, drift-dominated tokenization, and a lack of band-specific spatial modeling. In this article, we propose the Morlet Spectral Transformer (MST), built around three key components and integrated with a spatiotemporal Transformer backbone. First, Morlet wavelet tokenization provides a time-frequency representation that matches the multi-scale structure of brain rhythms, and extends classical differential entropy features to a form suitable for Transformers. Second, long-context baseline removal acts as a simple temporal normalization that removes subject-specific drift and redundancy across nearby windows. Third, frequency-specific spatial projection learns a separate channel mixer for each frequency band, capturing interpretable band-specific patterns and reducing cross-channel mixing. We show that, even without pretraining, MST consistently outperforms both large pretrained EEG foundation models and frequency-based methods across all SEED-family datasets. These results suggest that careful representation design can yield an accurate, cost-effective, and interpretable alternative to large-scale pretraining.
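The three components named in the abstract can be illustrated with a small NumPy sketch. This is not the authors' implementation: the abstract gives no hyperparameters or exact formulas, so the function names, the number of wavelet cycles, the window length, the use of a plain mean as the baseline, and the random projection weights are all assumptions chosen for illustration. Log band power is used because, for a Gaussian signal, differential entropy is an affine function of log variance.

```python
import numpy as np

def morlet_log_power(x, fs, freqs, n_cycles=5.0):
    """Log power per frequency via complex Morlet convolution.
    x: (channels, samples). Returns (freqs, channels, samples).
    n_cycles is an assumed setting, not taken from the paper."""
    out = np.empty((len(freqs),) + x.shape)
    for i, f in enumerate(freqs):
        sigma_t = n_cycles / (2.0 * np.pi * f)       # width of the Gaussian envelope (s)
        half = int(4 * sigma_t * fs)                 # truncate at +/- 4 sigma
        t = np.arange(-half, half + 1) / fs
        wavelet = np.exp(2j * np.pi * f * t) * np.exp(-t**2 / (2 * sigma_t**2))
        wavelet /= np.sqrt(np.sum(np.abs(wavelet) ** 2))  # unit energy
        for c in range(x.shape[0]):
            resp = np.convolve(x[c], wavelet, mode="same")
            out[i, c] = np.log(np.abs(resp) ** 2 + 1e-12)
    return out

def tokenize(logp, win):
    """Average log power in non-overlapping windows.
    Returns tokens of shape (windows, freqs, channels)."""
    F, C, T = logp.shape
    n = T // win
    return logp[:, :, : n * win].reshape(F, C, n, win).mean(axis=3).transpose(2, 0, 1)

def remove_baseline(tokens):
    """Assumed form of baseline removal: subtract the mean over the whole context."""
    return tokens - tokens.mean(axis=0, keepdims=True)

def band_spatial_projection(tokens, W):
    """Apply a separate channel mixer per frequency band.
    tokens: (windows, freqs, channels); W: (freqs, channels, dims)."""
    return np.einsum("nfc,fcd->nfd", tokens, W)

# Synthetic demo: 4 channels, 8 s at 128 Hz, a 10 Hz rhythm on channel 0 only.
fs = 128
t = np.arange(8 * fs) / fs
rng = np.random.default_rng(0)
x = 0.1 * rng.standard_normal((4, t.size))
x[0] += np.sin(2 * np.pi * 10 * t)

freqs = np.array([4.0, 10.0, 20.0, 30.0])
logp = morlet_log_power(x, fs, freqs)
tokens = tokenize(logp, win=fs)                  # one token per second
centered = remove_baseline(tokens)
W = 0.1 * rng.standard_normal((len(freqs), 4, 2))  # would be learned in practice
proj = band_spatial_projection(centered, W)
```

In this sketch the 10 Hz tokens for channel 0 carry far more power than those at other frequencies or channels, and after baseline removal each (frequency, channel) feature has zero mean across the context. In a trained model the per-band weights `W` would be learned jointly with the Transformer backbone rather than drawn at random.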
