EigenFold: Generative Protein Structure Prediction with Diffusion Models
This work proposes a novel method for a known bottleneck in protein structure prediction: distributional modeling that captures the conformational flexibility crucial for understanding biological function.
The authors tackle the problem of predicting protein conformational ensembles with EigenFold, a diffusion generative modeling framework that samples a distribution of structures from a protein sequence. On recent CAMEO targets, EigenFold achieves a median TMScore of 0.84.
Protein structure prediction has reached revolutionary levels of accuracy on single structures, yet distributional modeling paradigms are needed to capture the conformational ensembles and flexibility that underlie biological function. Towards this goal, we develop EigenFold, a diffusion generative modeling framework for sampling a distribution of structures from a given protein sequence. We define a diffusion process that models the structure as a system of harmonic oscillators and which naturally induces a cascading-resolution generative process along the eigenmodes of the system. On recent CAMEO targets, EigenFold achieves a median TMScore of 0.84, while providing a more comprehensive picture of model uncertainty via the ensemble of sampled structures relative to existing methods. We then assess EigenFold's ability to model and predict conformational heterogeneity for fold-switching proteins and ligand-induced conformational change. Code is available at https://github.com/bjing2016/EigenFold.
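To make the harmonic-oscillator picture concrete, the following is a minimal sketch of a forward diffusion along eigenmodes. It is not the EigenFold implementation. It assumes a simple bead chain with energy E(x) = (alpha/2) * sum_i |x_i - x_{i+1}|^2 and the SDE dx = -(1/2) H x dt + dW, where H is the Hessian of E. The function names `chain_laplacian`, `mode_schedule`, and `sample_forward`, and the parameter `alpha`, are hypothetical and chosen for illustration only. Under these assumptions each eigenmode evolves as an independent Ornstein-Uhlenbeck process, so modes with large eigenvalues (fine detail) are noised fastest, and a reverse process would restore coarse modes first.

```python
import numpy as np

def chain_laplacian(n, alpha=1.0):
    """Hessian of E(x) = (alpha/2) * sum_i |x_i - x_{i+1}|^2 for an n-bead chain."""
    H = np.zeros((n, n))
    for i in range(n - 1):
        H[i, i] += alpha
        H[i + 1, i + 1] += alpha
        H[i, i + 1] -= alpha
        H[i + 1, i] -= alpha
    return H

def mode_schedule(n, t, alpha=1.0):
    """Per-mode mean scaling and variance at time t for dx = -0.5 * H x dt + dW.

    The zero eigenvalue (rigid translation) is dropped, which amounts to
    working with centered structures (an assumption of this sketch).
    """
    lam, P = np.linalg.eigh(chain_laplacian(n, alpha))  # ascending eigenvalues
    lam, P = lam[1:], P[:, 1:]                          # discard translation mode
    scale = np.exp(-0.5 * lam * t)                      # decay of the initial signal
    var = (1.0 - np.exp(-lam * t)) / lam                # accumulated noise variance
    return lam, scale, var, P

def sample_forward(x0, t, alpha=1.0, rng=None):
    """Draw x_t given x_0 (shape (n, 3)) by noising each eigenmode independently."""
    rng = np.random.default_rng() if rng is None else rng
    lam, scale, var, P = mode_schedule(len(x0), t, alpha)
    y0 = P.T @ x0                                       # project onto eigenmodes
    noise = rng.standard_normal(y0.shape)
    yt = scale[:, None] * y0 + np.sqrt(var)[:, None] * noise
    return P @ yt                                       # back to Cartesian coordinates

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    x0 = np.cumsum(rng.standard_normal((16, 3)), axis=0)  # toy random-walk "backbone"
    lam, scale, var, _ = mode_schedule(len(x0), t=1.0)
    # High-frequency modes (large eigenvalue) retain the least signal.
    print("signal retained, lowest mode :", scale[0])
    print("signal retained, highest mode:", scale[-1])
    print("x_t shape:", sample_forward(x0, t=1.0, rng=rng).shape)
```

In this toy setting the eigenvalue ordering gives the cascading resolution directly: at any intermediate time the low-frequency modes still carry most of the original structure while the high-frequency modes are already close to their stationary Gaussian.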