LG, AI, BIO-PH, BM, QM | May 23, 2025

Simultaneous Modeling of Protein Conformation and Dynamics via Autoregression

arXiv:2505.17478v1 | 9 citations | h-index: 8
Originality: Incremental advance
AI Analysis

ConfRover gives computational biology researchers a flexible way to explore protein conformational space efficiently. The advance is incremental, since it builds on existing protein folding and diffusion models.

The paper tackles the problem of modeling protein dynamics from molecular dynamics data by introducing ConfRover, an autoregressive model that simultaneously learns protein conformation and dynamics, enabling both time-dependent and time-independent sampling.

Understanding protein dynamics is critical for elucidating their biological functions. The increasing availability of molecular dynamics (MD) data enables the training of deep generative models to efficiently explore the conformational space of proteins. However, existing approaches either fail to explicitly capture the temporal dependencies between conformations or do not support direct generation of time-independent samples. To address these limitations, we introduce ConfRover, an autoregressive model that simultaneously learns protein conformation and dynamics from MD trajectories, supporting both time-dependent and time-independent sampling. At the core of our model is a modular architecture comprising: (i) an encoding layer, adapted from protein folding models, that embeds protein-specific information and conformation at each time frame into a latent space; (ii) a temporal module, a sequence model that captures conformational dynamics across frames; and (iii) an SE(3) diffusion model as the structure decoder, generating conformations in continuous space. Experiments on ATLAS, a large-scale protein MD dataset of diverse structures, demonstrate the effectiveness of our model in learning conformational dynamics and supporting a wide range of downstream tasks. ConfRover is the first model to sample both protein conformations and trajectories within a single framework, offering a novel and flexible approach for learning from protein MD data.
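The three-part design described in the abstract (encode each frame, summarize the history with a temporal module, decode the next conformation) can be illustrated with a toy autoregressive loop. This is a minimal sketch under stated assumptions, not the authors' implementation: the functions `encode`, `temporal_update`, `decode`, and `sample_trajectory` are hypothetical names, a frame is reduced to a plain list of floats, the temporal module is replaced by an exponential moving average, and the SE(3) diffusion decoder is replaced by simple Gaussian noise.

```python
import random


def encode(frame):
    # Hypothetical stand-in for the encoding layer: map a frame to a latent vector.
    return [0.5 * x for x in frame]


def temporal_update(state, latent, decay=0.9):
    # Hypothetical stand-in for the temporal module: keep a running
    # summary of all past latents (exponential moving average).
    return [decay * s + (1.0 - decay) * z for s, z in zip(state, latent)]


def decode(state, rng, noise=0.1):
    # Hypothetical stand-in for the structure decoder: sample the next
    # frame conditioned on the history summary. The paper uses an SE(3)
    # diffusion model here; this toy version only adds Gaussian noise.
    return [2.0 * s + rng.gauss(0.0, noise) for s in state]


def sample_trajectory(first_frame, n_frames, seed=0):
    # Autoregressive rollout: each new frame depends on all earlier
    # frames through the running state.
    rng = random.Random(seed)
    state = [0.0] * len(first_frame)
    frames = [list(first_frame)]
    for _ in range(n_frames - 1):
        state = temporal_update(state, encode(frames[-1]))
        frames.append(decode(state, rng))
    return frames


if __name__ == "__main__":
    for frame in sample_trajectory([1.0, 2.0, 3.0], n_frames=4):
        print([round(x, 3) for x in frame])
```

The point of the sketch is the control flow: every generated frame is fed back through the encoder and folded into the history before the next frame is sampled, which is what makes the sampling time-dependent. How ConfRover itself conditions the decoder, or produces time-independent samples, is not specified at this level of detail in the abstract.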

