GNAIFeb 19

Systematic Evaluation of Single-Cell Foundation Model Interpretability Reveals Attention Captures Co-Expression Rather Than Unique Regulatory Signal

arXiv:2602.17532v17 citationsh-index: 2
Originality Incremental advance
AI Analysis

This work addresses the problem of assessing mechanistic interpretability for researchers in computational biology, revealing that current attention mechanisms may be incremental for regulatory insights.

The study systematically evaluated interpretability in single-cell foundation models, finding that attention patterns capture biological information like co-expression but do not improve perturbation prediction, with trivial baselines outperforming attention-based methods (AUROC 0.81-0.88 vs. 0.70).

We present a systematic evaluation framework - thirty-seven analyses, 153 statistical tests, four cell types, two perturbation modalities - for assessing mechanistic interpretability in single-cell foundation models. Applying this framework to scGPT and Geneformer, we find that attention patterns encode structured biological information with layer-specific organisation - protein-protein interactions in early layers, transcriptional regulation in late layers - but this structure provides no incremental value for perturbation prediction: trivial gene-level baselines outperform both attention and correlation edges (AUROC 0.81-0.88 versus 0.70), pairwise edge scores add zero predictive contribution, and causal ablation of regulatory heads produces no degradation. These findings generalise from K562 to RPE1 cells; the attention-correlation relationship is context-dependent, but gene-level dominance is universal. Cell-State Stratified Interpretability (CSSI) addresses an attention-specific scaling failure, improving GRN recovery up to 1.85x. The framework establishes reusable quality-control standards for the field.

Foundations

The foundational work for this paper's niche, ranked by how specifically the neighbourhood builds on it — not by global fame.

Your Notes