Temporal Context and Architecture: A Benchmark for Naturalistic EEG Decoding
This work addresses EEG decoding for brain-computer interfaces and provides benchmarks for model selection based on efficiency and robustness. It is incremental, however, in that it compares existing architectures on a single dataset.
The study investigates how model architecture and temporal context affect naturalistic EEG decoding. Accuracy improves with longer context: at 64 s, S5 achieves 98.7% accuracy with roughly 20x fewer parameters than the CNN. The robustness evaluations reveal a trade-off between efficiency and robustness across architectures.
We study how model architecture and temporal context interact in naturalistic EEG decoding. Using the HBN movie-watching dataset, we benchmark five architectures on a 4-class task across segment lengths from 8 s to 128 s: CNN, LSTM, a stabilized Transformer (EEGXF), S4, and S5. Accuracy improves with longer context: at 64 s, S5 reaches 98.7% ± 0.6 and CNN 98.3% ± 0.3, while S5 uses ~20x fewer parameters than CNN. To probe real-world robustness, we evaluate zero-shot cross-frequency shifts, cross-task out-of-distribution (OOD) inputs, and leave-one-subject-out generalization. S5 achieves stronger cross-subject accuracy but makes over-confident errors on OOD tasks; EEGXF is more conservative and stable under frequency shifts, though less calibrated in-distribution. These results reveal a practical efficiency-robustness trade-off: S5 is the better choice for parameter-efficient peak accuracy, and EEGXF is the better choice when robustness and conservative uncertainty are critical.
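The two data-handling steps named in the abstract, cutting recordings into fixed-length segments and holding out one subject at a time, can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' pipeline: the function names `segment` and `leave_one_subject_out`, the 100 Hz sampling rate, the non-overlapping windows, and the intermediate segment lengths (16 s, 32 s) are all hypothetical choices made for the example.

```python
import numpy as np


def segment(recording, sfreq, seconds):
    """Split a (channels, samples) array into non-overlapping windows.

    Returns an array of shape (n_windows, channels, window_samples).
    Trailing samples that do not fill a whole window are dropped.
    """
    win = int(round(seconds * sfreq))
    n_windows = recording.shape[1] // win
    trimmed = recording[:, : n_windows * win]
    # (channels, n_windows, win) -> (n_windows, channels, win)
    return trimmed.reshape(recording.shape[0], n_windows, win).transpose(1, 0, 2)


def leave_one_subject_out(subject_ids):
    """Yield (held_out, train_idx, test_idx), holding out one subject per fold."""
    subject_ids = np.asarray(subject_ids)
    for held_out in np.unique(subject_ids):
        test_mask = subject_ids == held_out
        yield held_out, np.flatnonzero(~test_mask), np.flatnonzero(test_mask)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    sfreq = 100  # assumed sampling rate in Hz, for illustration only
    recording = rng.standard_normal((4, 300 * sfreq))  # synthetic: 4 channels, 300 s

    # Assumed grid of segment lengths; the abstract only gives the 8 s to 128 s range.
    for seconds in (8, 16, 32, 64, 128):
        print(seconds, segment(recording, sfreq, seconds).shape)

    # One subject label per segment; each fold tests on a single unseen subject.
    subjects = ["s1", "s1", "s2", "s3", "s3"]
    for held_out, train_idx, test_idx in leave_one_subject_out(subjects):
        print(held_out, train_idx, test_idx)
```

Longer segments give each example more temporal context but yield fewer examples per recording, which is one reason the segment length is worth treating as an explicit experimental variable.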