Andreas Tiffeau-Mayer

h-index: 81
2 papers

Dec 18, 2024
Data-driven Discovery of Biophysical T Cell Receptor Co-specificity Rules

Andrew G. T. Pyo, Yuta Nagano, Martina Milighetti et al.

The biophysical interactions between the T cell receptor (TCR) and its ligands determine the specificity of the cellular immune response. However, the immense diversity of receptors and ligands has made it challenging to discover generalizable rules across the distinct binding affinity landscapes created by different ligands. Here, we present an optimization framework for discovering biophysical rules that predict whether TCRs share specificity to a ligand. Applying this framework to TCRs associated with a collection of SARS-CoV-2 peptides, we systematically characterize how co-specificity depends on the type and position of amino-acid differences between receptors. We also demonstrate that the inferred rules generalize to ligands highly dissimilar to any seen during training. Our analysis reveals that matching of steric properties between substituted amino acids is more important for receptor co-specificity than the hydrophobic properties that prominently determine evolutionary substitutability. Our analysis also quantifies the substantial importance of positions not in direct contact with the peptide for specificity. These findings highlight the potential for data-driven approaches to uncover the molecular mechanisms underpinning the specificity of adaptive immune responses.
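
As a rough illustration of what a rule depending on "the type and position of amino-acid differences" could look like, the sketch below scores two equal-length receptor sequences by summing a per-position weight times a per-substitution penalty over the mismatched positions. This is not the paper's optimization framework: the function names, the weights, the penalty table, and the threshold are all hypothetical placeholders standing in for quantities that such a framework would infer from data.

```python
from typing import Dict, FrozenSet, Sequence


def mismatch_score(
    a: str,
    b: str,
    position_weights: Sequence[float],
    substitution_penalty: Dict[FrozenSet[str], float],
    default_penalty: float = 1.0,
) -> float:
    """Sum of position weight x substitution penalty over mismatched positions.

    Hypothetical scoring rule for illustration only; it assumes the two
    sequences have equal length and does not handle insertions or deletions.
    """
    if not (len(a) == len(b) == len(position_weights)):
        raise ValueError("sequences and position_weights must have equal length")
    score = 0.0
    for x, y, w in zip(a, b, position_weights):
        if x != y:
            # Unordered pair lookup, so (L, I) and (I, L) share one penalty.
            score += w * substitution_penalty.get(frozenset((x, y)), default_penalty)
    return score


def predict_co_specific(score: float, threshold: float) -> bool:
    """Call two receptors co-specific when their mismatch score is small enough."""
    return score <= threshold


# Toy values, made up for this example: central positions weigh more than the
# flanks, and a leucine/isoleucine swap is penalised less than other swaps.
weights = [0.2, 0.2, 1.0, 1.0, 1.0, 0.2]
penalties = {frozenset("LI"): 0.25}

s = mismatch_score("CASSLG", "CASSIG", weights, penalties)
print(s)  # 0.25
print(predict_co_specific(s, threshold=0.5))  # True
```

In a data-driven setting the weights and penalties would be fitted rather than hand-set, for example by optimizing how well the resulting score separates receptor pairs that share a ligand from pairs that do not.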

Jun 10, 2024
Contrastive learning of T cell receptor representations

Yuta Nagano, Andrew Pyo, Martina Milighetti et al.

Computational prediction of the interaction of T cell receptors (TCRs) and their ligands is a grand challenge in immunology. Despite advances in high-throughput assays, specificity-labelled TCR data remains sparse. In other domains, the pre-training of language models on unlabelled data has been successfully used to address data bottlenecks. However, it is unclear how to best pre-train protein language models for TCR specificity prediction. Here we introduce a TCR language model called SCEPTR (Simple Contrastive Embedding of the Primary sequence of T cell Receptors), capable of data-efficient transfer learning. Through our model, we introduce a novel pre-training strategy combining autocontrastive learning and masked-language modelling, which enables SCEPTR to achieve its state-of-the-art performance. In contrast, existing protein language models and a variant of SCEPTR pre-trained without autocontrastive learning are outperformed by sequence alignment-based methods. We anticipate that contrastive learning will be a useful paradigm to decode the rules of TCR specificity.
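
To make the contrastive objective concrete, here is a minimal sketch of an InfoNCE-style loss of the kind commonly used in contrastive learning. It is not SCEPTR's implementation. It assumes that each receptor is embedded twice, producing two "views" (for instance under different random masking or dropout), and that the view of the same receptor is the positive while the views of other receptors in the batch act as negatives. The function names and the temperature value are illustrative choices.

```python
import math
from typing import List, Sequence


def cosine(u: Sequence[float], v: Sequence[float]) -> float:
    """Cosine similarity between two non-zero vectors."""
    norm_u = math.sqrt(sum(x * x for x in u))
    norm_v = math.sqrt(sum(x * x for x in v))
    if norm_u == 0.0 or norm_v == 0.0:
        raise ValueError("cosine similarity is undefined for zero vectors")
    return sum(x * y for x, y in zip(u, v)) / (norm_u * norm_v)


def info_nce(
    view_a: List[Sequence[float]],
    view_b: List[Sequence[float]],
    temperature: float = 0.1,
) -> float:
    """Mean InfoNCE loss over a batch.

    view_a[i] and view_b[i] are two embeddings of the same item (the positive
    pair); view_b[j] for j != i serve as negatives for view_a[i].
    """
    if len(view_a) != len(view_b) or not view_a:
        raise ValueError("both views must be non-empty and of equal length")
    n = len(view_a)
    total = 0.0
    for i in range(n):
        logits = [cosine(view_a[i], view_b[j]) / temperature for j in range(n)]
        # Log-sum-exp with the maximum subtracted for numerical stability.
        m = max(logits)
        log_denominator = m + math.log(sum(math.exp(l - m) for l in logits))
        # Cross-entropy with the matching index i as the correct class.
        total += log_denominator - logits[i]
    return total / n


# Toy embeddings: when the two views agree, the loss is close to zero; when the
# pairing is scrambled, the loss is large.
aligned = info_nce([[1.0, 0.0], [0.0, 1.0]], [[1.0, 0.0], [0.0, 1.0]])
scrambled = info_nce([[1.0, 0.0], [0.0, 1.0]], [[0.0, 1.0], [1.0, 0.0]])
print(aligned < scrambled)  # True
```

Minimizing this loss pulls the two views of one receptor together and pushes different receptors apart in embedding space, which is the property that makes nearest-neighbour style transfer to sparse labelled data plausible.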