LG SP ML Jul 24, 2023

Concept-based explainability for an EEG transformer model

Anders Gjølbye, William Lehn-Schiøler, Áshildur Jónsdóttir, Bergdís Arnardóttir, Lars Kai Hansen

arXiv:2307.12745v29.89 citations h-index: 59 Has Code

Originality: Synthesis-oriented

AI Analysis

This work addresses explainability for EEG models. The contribution is incremental, since it adapts an existing method to a new domain.

The authors tackled the problem of explaining deep learning models for EEG data by applying Concept Activation Vectors (CAVs) to a transformer model, showing that both externally labeled and anatomically defined concepts provide insights into learned representations.

Deep learning models are complex due to their size, structure, and inherent randomness in training procedures. Additional complexity arises from the selection of datasets and inductive biases. Addressing these challenges for explainability, Kim et al. (2018) introduced Concept Activation Vectors (CAVs), which aim to understand deep models' internal states in terms of human-aligned concepts. These concepts correspond to directions in latent space, identified using linear discriminants. Although this method was first applied to image classification, it was later adapted to other domains, including natural language processing. In this work, we attempt to apply the method to electroencephalogram (EEG) data for explainability in Kostas et al.'s BENDR (2021), a large-scale transformer model. A crucial part of this endeavor involves defining the explanatory concepts and selecting relevant datasets to ground concepts in the latent space. Our focus is on two mechanisms for EEG concept formation: the use of externally labeled EEG datasets, and the application of anatomically defined concepts. The former approach is a straightforward generalization of methods used in image classification, while the latter is novel and specific to EEG. We present evidence that both approaches to concept formation yield valuable insights into the representations learned by deep EEG models.
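To make the idea of a concept as a direction in latent space concrete, here is a minimal sketch in plain Python. It is not the authors' code and does not use BENDR: the activations and gradients are synthetic, all function names are hypothetical, and the linear discriminant is simplified to the normalized difference of class means rather than a trained linear classifier as in Kim et al. (2018). The score follows the TCAV idea of counting how often a class logit's gradient points along the concept direction.

```python
import math
import random


def mean_vector(rows):
    """Column-wise mean of a list of equal-length vectors."""
    n = len(rows)
    return [sum(col) / n for col in zip(*rows)]


def concept_activation_vector(concept_acts, random_acts):
    """Unit vector pointing from the random-example mean to the concept mean.

    Assumption: a difference-of-means discriminant stands in for the
    trained linear classifier used in the original CAV method.
    """
    mu_c = mean_vector(concept_acts)
    mu_r = mean_vector(random_acts)
    diff = [c - r for c, r in zip(mu_c, mu_r)]
    norm = math.sqrt(sum(d * d for d in diff))
    return [d / norm for d in diff]


def tcav_score(gradients, cav):
    """Fraction of inputs whose gradient has a positive projection on the CAV."""
    positive = sum(
        1 for g in gradients
        if sum(gi * vi for gi, vi in zip(g, cav)) > 0
    )
    return positive / len(gradients)


# Synthetic stand-ins for latent activations (hypothetical data, not EEG).
rng = random.Random(0)
dim = 8

# Concept examples are shifted along latent dimension 0; random examples are not.
concept_acts = [
    [rng.gauss(0, 1) + (2.0 if j == 0 else 0.0) for j in range(dim)]
    for _ in range(200)
]
random_acts = [[rng.gauss(0, 1) for _ in range(dim)] for _ in range(200)]

cav = concept_activation_vector(concept_acts, random_acts)

# Synthetic gradients of a class logit with respect to the latent activations,
# constructed so that the class tends to depend on dimension 0.
gradients = [
    [rng.gauss(1.0, 0.5) if j == 0 else rng.gauss(0, 1) for j in range(dim)]
    for _ in range(100)
]

score = tcav_score(gradients, cav)
print("largest CAV component:", max(range(dim), key=lambda j: abs(cav[j])))
print("TCAV-style score:", score)
```

In a real setting, the concept and random activations would come from a chosen layer of the model, for example on an externally labeled EEG dataset, and the gradients would come from backpropagation through the downstream classifier.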
