SP ET HC LG NC
Jun 21, 2025

An Interpretable Transformer-Based Foundation Model for Cross-Procedural Skill Assessment Using Raw fNIRS Signals

A. Subedi, S. De, L. Cavuoto, S. Schwaitzberg, M. Hackett, J. Norfleet

arXiv:2506.22476v1 1.2 h-index: 31

Originality: Highly original

AI Analysis

This work addresses robust, generalizable skill assessment for high-stakes procedural training. Its method reduces preprocessing requirements and improves interpretability, though its gains over existing fNIRS-based approaches are incremental.

The paper tackles objective skill assessment in procedural environments with an interpretable transformer-based foundation model trained on raw fNIRS signals. The model achieves over 88% classification accuracy on laparoscopic surgical tasks and endotracheal intubation, and it generalizes to a novel procedure, cricothyrotomy, with an AUC above 87% using fewer than 30 labeled samples.

Objective skill assessment in high-stakes procedural environments requires models that not only decode underlying cognitive and motor processes but also generalize across tasks, individuals, and experimental contexts. While prior work has demonstrated the potential of functional near-infrared spectroscopy (fNIRS) for evaluating cognitive-motor performance, existing approaches are often task-specific, rely on extensive preprocessing, and lack robustness to new procedures or conditions. Here, we introduce an interpretable transformer-based foundation model trained on minimally processed fNIRS signals for cross-procedural skill assessment. Pretrained using self-supervised learning on data from laparoscopic surgical tasks and endotracheal intubation (ETI), the model achieves greater than 88% classification accuracy on all tasks, with Matthews Correlation Coefficient exceeding 0.91 on ETI. It generalizes to a novel emergency airway procedure--cricothyrotomy--using fewer than 30 labeled samples and a lightweight (less than 2k parameter) adapter module, attaining an AUC greater than 87%. Interpretability is achieved via a novel channel attention mechanism--developed specifically for fNIRS--that identifies functionally coherent prefrontal sub-networks validated through ablation studies. Temporal attention patterns align with task-critical phases and capture stress-induced changes in neural variability, offering insight into dynamic cognitive states.
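The abstract reports a Matthews Correlation Coefficient alongside classification accuracy. As a reading aid, here is a minimal sketch of how the two metrics are computed for a binary classifier from confusion-matrix counts. The function names and the example counts are illustrative assumptions, not taken from the paper, and the paper's own evaluation setup may differ (for example, it may be multi-class).

```python
import math


def accuracy(tp, tn, fp, fn):
    """Fraction of all predictions that are correct."""
    return (tp + tn) / (tp + tn + fp + fn)


def matthews_corrcoef(tp, tn, fp, fn):
    """Binary MCC from confusion counts; ranges from -1 to 1."""
    numerator = tp * tn - fp * fn
    denominator = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    # Common convention: report 0.0 when a row or column of the
    # confusion matrix is empty and the ratio is undefined.
    return numerator / denominator if denominator else 0.0


# Hypothetical, made-up counts: 10 positives and 90 negatives, with a
# classifier that always predicts the negative class.
always_negative = dict(tp=0, tn=90, fp=0, fn=10)
print(accuracy(**always_negative))           # 0.9
print(matthews_corrcoef(**always_negative))  # 0.0
```

On imbalanced data, accuracy can look high for a classifier that has learned nothing, while MCC stays near zero, which is one common reason to report both.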
