BM AI Mar 9, 2025

Non-Canonical Crosslinks Confound Evolutionary Protein Structure Models

arXiv:2503.17368v1, 1 citation, h-index: 1
Originality: Synthesis-oriented
AI Analysis

This work highlights a critical limitation of current protein structure prediction models on rare biomolecules. Its contribution is incremental: a benchmark on a niche domain.

The authors examine why evolution-based protein structure prediction models fail on sequences with rare post-translational modifications, specifically sactipeptides with sulfur-to-α-carbon crosslinks. They find that these models perform poorly, with GDT-TS scores on the sulfur-to-α-carbon distance ranging from 0.0% to 19.2%.

Evolution-based protein structure prediction models have achieved breakthrough success in recent years. However, they struggle to generalize beyond evolutionary priors and to sequences lacking rich homologous data. Here we present a novel, out-of-domain benchmark based on sactipeptides, a rare class of ribosomally synthesized and post-translationally modified peptides (RiPPs) characterized by sulfur-to-α-carbon thioether bridges that create cross-links between cysteine residues and the backbone. We evaluate recent models on predicting conformations compatible with these cross-links for the 10 known sactipeptides with elucidated post-translational modifications. Crucially, the structures of 5 of them have not yet been experimentally resolved. This makes the task a challenging problem for evolution-based models, which we find exhibit limited performance (0.0% to 19.2% GDT-TS on sulfur-to-α-carbon distance). Our results point to the need for physics-informed models to sustain progress in biomolecular structure prediction.
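The abstract reports GDT-TS computed on the sulfur-to-α-carbon distance but does not spell out the calculation. The sketch below shows one plausible reading, not the authors' implementation. It applies the standard GDT-TS cutoffs (1, 2, 4 and 8 Å) to the deviation of each predicted S–Cα distance from a target bond length. The target value `ASSUMED_S_CA_BOND` and the function names `gdt_ts_like` and `crosslink_score` are assumptions introduced here for illustration.

```python
import math

# Standard GDT-TS distance cutoffs, in angstroms.
GDT_TS_CUTOFFS = (1.0, 2.0, 4.0, 8.0)

# ASSUMPTION: approximate S-C(alpha) thioether bond length in angstroms.
# This is an illustrative target, not a value taken from the paper.
ASSUMED_S_CA_BOND = 1.8


def gdt_ts_like(deviations, cutoffs=GDT_TS_CUTOFFS):
    """Average, over the cutoffs, of the fraction of deviations within each cutoff.

    Returns a percentage in [0, 100], mirroring how GDT-TS averages
    the fractions of residues within 1, 2, 4 and 8 angstroms.
    """
    if not deviations:
        raise ValueError("at least one deviation is required")
    n = len(deviations)
    fractions = [sum(d <= c for d in deviations) / n for c in cutoffs]
    return 100.0 * sum(fractions) / len(cutoffs)


def crosslink_score(pairs, target=ASSUMED_S_CA_BOND):
    """Score predicted cross-links given (sulfur_xyz, alpha_carbon_xyz) pairs.

    Each deviation is |predicted S-C(alpha) distance - target|.
    """
    deviations = [abs(math.dist(s, ca) - target) for s, ca in pairs]
    return gdt_ts_like(deviations)


if __name__ == "__main__":
    # Hypothetical coordinates for three cross-links: one formed, two far apart.
    example = [
        ((0.0, 0.0, 0.0), (1.8, 0.0, 0.0)),
        ((0.0, 0.0, 0.0), (0.0, 5.0, 0.0)),
        ((0.0, 0.0, 0.0), (0.0, 0.0, 20.0)),
    ]
    print(f"GDT-TS-like cross-link score: {crosslink_score(example):.1f}%")
```

Under this reading, a score near 0% means almost no predicted sulfur atom lies within even 8 Å of the bonding distance to its partner α-carbon, so the predicted conformations are incompatible with the cross-links.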
