LG · Sep 29, 2025

Towards Understanding the Shape of Representations in Protein Language Models

arXiv:2509.24895v1 · 1 citation · h-index: 2

Originality: Incremental advance

AI Analysis

This work addresses the interpretability gap in PLMs for protein design, though it is an incremental advance: it builds on existing methods to analyze representation spaces.

The study examines how protein language models (PLMs) transform sequences into hidden representations and how those representations encode structural information. It finds that PLMs preferentially encode immediate and local relations between residues, that this encoding degrades at larger context lengths, and that the most structurally faithful encoding occurs near, but before, the last layer.

While protein language models (PLMs) are one of the most promising avenues of research for future de novo protein design, the way in which they transform sequences to hidden representations, as well as the information encoded in such representations, is yet to be fully understood. Several works have proposed interpretability tools for PLMs, but they have focused on understanding how individual sequences are transformed by such models. Therefore, the way in which PLMs transform the whole space of sequences, along with their relations, is still unknown. In this work we attempt to understand this transformed space of sequences by identifying protein structure and representation with square-root velocity (SRV) representations and graph filtrations. Both approaches naturally lead to a metric space in which pairs of proteins or protein representations can be compared with each other. We analyze different types of proteins from the SCOP dataset and show that the Karcher mean and effective dimension of the SRV shape space follow a non-linear pattern as a function of the layers in ESM2 models of different sizes. Furthermore, we use graph filtrations as a tool to study the context lengths at which models encode the structural features of proteins. We find that PLMs preferentially encode immediate as well as local relations between residues, but that this encoding starts to degrade for larger context lengths. The most structurally faithful encoding tends to occur close to, but before, the last layer of the models, indicating that training a folding model on top of these layers might lead to improved folding performance.
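To make the SRV idea above concrete, here is a minimal sketch of the standard square-root velocity transform, q(t) = c'(t) / sqrt(|c'(t)|), applied to a discretized curve, together with the plain L2 distance between two transformed curves. It is an illustration under stated assumptions, not the paper's implementation. The abstract does not say how curves are built from proteins or representations, so treating the input as a list of 3D points (for example, backbone coordinates) is an assumption. The function names `srv_transform` and `srv_l2_distance` are hypothetical. Full elastic shape analysis also optimizes over rotations and reparametrizations, which this sketch omits.

```python
import math


def srv_transform(points):
    """Discrete square-root velocity (SRV) transform of a polyline.

    points: list of equal-length coordinate tuples (assumed here to be,
    e.g., 3D backbone coordinates; this choice is an assumption).
    Returns one vector q_i = v_i / sqrt(|v_i|) per segment, where v_i is
    the finite-difference velocity on a uniform grid over [0, 1].
    """
    n_seg = len(points) - 1
    if n_seg < 1:
        raise ValueError("need at least two points")
    dt = 1.0 / n_seg
    q = []
    for a, b in zip(points, points[1:]):
        v = [(bj - aj) / dt for aj, bj in zip(a, b)]
        speed = math.sqrt(sum(c * c for c in v))
        if speed > 0:
            q.append([c / math.sqrt(speed) for c in v])
        else:
            # Zero-velocity segments map to the zero vector.
            q.append([0.0] * len(v))
    return q


def srv_l2_distance(points_a, points_b):
    """L2 distance between the SRV transforms of two curves.

    Both curves must have the same number of points. No alignment over
    rotations or reparametrizations is performed (a simplification).
    """
    qa, qb = srv_transform(points_a), srv_transform(points_b)
    if len(qa) != len(qb):
        raise ValueError("curves must have the same number of points")
    dt = 1.0 / len(qa)
    sq = sum((x - y) ** 2 for va, vb in zip(qa, qb) for x, y in zip(va, vb))
    return math.sqrt(sq * dt)
```

Because the transform depends only on differences between consecutive points, the distance is unchanged when a curve is translated, which is one reason the SRV representation is convenient for comparing shapes.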
