BM | LG | BIO-PH | Sep 21, 2025

AI-based Methods for Simulating, Sampling, and Predicting Protein Ensembles

arXiv:2509.17224v1 | 6 citations | h-index: 108
Originality: Synthesis-oriented
AI Analysis

This is an incremental review that synthesizes existing research for researchers in computational biology and protein science.

The paper reviews AI-based methods for predicting protein ensembles, an area where progress has lagged behind single-structure prediction, and advocates integrating model training, simulation, and inference to overcome limited training data.

Advances in deep learning have opened an era of abundant and accurate predicted protein structures; however, similar progress in protein ensembles has remained elusive. This review highlights several recent research directions towards AI-based predictions of protein ensembles, including coarse-grained force fields, generative models, multiple sequence alignment perturbation methods, and modeling of ensemble descriptors. An emphasis is placed on realistic assessments of the technological maturity of current methods, the strengths and weaknesses of broad families of techniques, and promising machine learning frameworks at an early stage of development. We advocate for "closing the loop" between model training, simulation, and inference to overcome challenges in training data availability and to enable the next generation of models.

