GR · Apr 14

Calibrated Abstention for Reliable TCR--pMHC Binding Prediction under Epitope Shift

arXiv:2604.13254 · 2.8 · h-index: 1
AI Analysis

For immunologists and clinicians using TCR-pMHC binding models, this work provides a principled way to handle epitope shift, improving reliability in vaccine and T-cell therapy design.

The paper addresses overconfidence in TCR-pMHC binding prediction on epitopes unseen during training. It frames the task as selective prediction with calibrated abstention: the model either outputs a calibrated confidence score or abstains. Under epitope-held-out splits the method achieves AUROC 0.813 and ECE 0.043, a 69.7% reduction in ECE relative to an uncalibrated baseline. At 80% coverage, abstention lowers the error rate from 18.7% to 10.9%.

Predicting T-cell receptor (TCR)--peptide-MHC (pMHC) binding is central to vaccine design and T-cell therapy, yet deployed models frequently encounter epitopes unseen during training, causing silent overconfidence and unreliable prioritization. We address this by framing TCR--pMHC prediction as a \emph{selective prediction} problem: a calibrated model should either output a trustworthy confidence score or explicitly abstain. Concretely, we (1) introduce a dual-encoder architecture encoding both CDR3$\alpha$/CDR3$\beta$ and peptide sequences via a pre-trained protein language model; (2) apply temperature scaling to correct systematic probability miscalibration; and (3) impose a conformal abstention rule that provides finite-sample coverage guarantees at a user-specified target error rate. Evaluated under three split strategies -- random, epitope-held-out, and distance-aware -- our method achieves AUROC 0.813 and ECE 0.043 under the challenging epitope-held-out protocol, reducing ECE by 69.7\% relative to an uncalibrated baseline. At 80\% coverage, the selective model further reduces the error rate from 18.7\% to 10.9\%, demonstrating that calibrated abstention enables principled coverage-risk trade-offs aligned with practical screening budgets.
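To make the calibration and abstention steps concrete, the following is a minimal sketch in plain Python, not the authors' implementation. It assumes a binary classifier that emits one logit per TCR-pMHC pair, fits a single temperature by grid search on held-out data, measures ECE with equal-width bins, and computes the error rate at a given coverage. All function names, the grid, and the toy logits and labels are hypothetical. The abstention step here simply keeps the most confident fraction of predictions; it stands in for the paper's conformal rule, which the abstract does not specify, and carries no finite-sample guarantee.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def nll(logits, labels, T):
    """Mean binary negative log-likelihood of logits divided by temperature T."""
    eps = 1e-12
    total = 0.0
    for z, y in zip(logits, labels):
        p = min(max(sigmoid(z / T), eps), 1.0 - eps)
        total -= y * math.log(p) + (1 - y) * math.log(1.0 - p)
    return total / len(logits)

def fit_temperature(logits, labels, grid=None):
    """Temperature scaling: pick the T that minimises held-out NLL (grid search)."""
    grid = grid or [0.25 + 0.05 * i for i in range(96)]  # assumed range 0.25 .. 5.0
    return min(grid, key=lambda T: nll(logits, labels, T))

def ece(probs, labels, n_bins=10):
    """Expected calibration error of the positive-class probability:
    bin-weighted |observed positive rate - mean predicted probability|."""
    bins = [[] for _ in range(n_bins)]
    for p, y in zip(probs, labels):
        bins[min(int(p * n_bins), n_bins - 1)].append((p, y))
    n = len(probs)
    err = 0.0
    for b in bins:
        if b:
            mean_p = sum(p for p, _ in b) / len(b)
            mean_y = sum(y for _, y in b) / len(b)
            err += len(b) / n * abs(mean_y - mean_p)
    return err

def selective_error(probs, labels, coverage):
    """Error rate on the `coverage` fraction of most confident predictions;
    the remaining predictions are treated as abstentions."""
    ranked = sorted(zip(probs, labels), key=lambda t: -max(t[0], 1.0 - t[0]))
    kept = ranked[:max(1, int(round(coverage * len(ranked))))]
    return sum((p >= 0.5) != bool(y) for p, y in kept) / len(kept)

# Toy held-out set (hypothetical): confident logits that are right 80% of the
# time, plus low-confidence logits that are right half of the time.
logits = [4, 4, 4, 4, 4, -4, -4, -4, -4, -4, 0.5, 0.5, -0.5, -0.5]
labels = [1, 1, 1, 1, 0, 0, 0, 0, 0, 1, 1, 0, 0, 1]

T = fit_temperature(logits, labels)
raw_probs = [sigmoid(z) for z in logits]
cal_probs = [sigmoid(z / T) for z in logits]

print("fitted temperature:", T)
print("ECE before / after:", ece(raw_probs, labels), ece(cal_probs, labels))
print("error at 100% / 70% coverage:",
      selective_error(cal_probs, labels, 1.0),
      selective_error(cal_probs, labels, 0.7))
```

Dividing logits by a positive temperature leaves the ranking of predictions unchanged, so in this sketch calibration affects ECE while abstention affects the error rate; the two steps are complementary rather than interchangeable.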

