BM · AI · Sep 30, 2022

Protein structure generation via folding diffusion

Microsoft
arXiv:2209.15611v2 · 291 citations · h-index: 23 · Has Code
Originality: Highly original
AI Analysis

This work addresses the challenge of generating diverse protein structures for biological discovery and disease treatment. It offers a novel method for a known bottleneck in computational protein design.

The authors tackle the problem of generating novel, physically foldable protein structures by introducing a diffusion-based generative model that mirrors the native folding process. The resulting model unconditionally generates highly realistic protein structures with complexity and structural patterns similar to those of natural proteins.

The ability to computationally generate novel yet physically foldable protein structures could lead to new biological discoveries and new treatments targeting yet incurable diseases. Despite recent advances in protein structure prediction, directly generating diverse, novel protein structures from neural networks remains difficult. In this work, we present a new diffusion-based generative model that designs protein backbone structures via a procedure that mirrors the native folding process. We describe protein backbone structure as a series of consecutive angles capturing the relative orientation of the constituent amino acid residues, and generate new structures by denoising from a random, unfolded state towards a stable folded structure. Not only does this mirror how proteins biologically twist into energetically favorable conformations, the inherent shift and rotational invariance of this representation crucially alleviates the need for complex equivariant networks. We train a denoising diffusion probabilistic model with a simple transformer backbone and demonstrate that our resulting model unconditionally generates highly realistic protein structures with complexity and structural patterns akin to those of naturally-occurring proteins. As a useful resource, we release the first open-source codebase and trained models for protein structure diffusion.
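The invariance claim in the abstract can be illustrated with a small sketch. The code below is not taken from the released codebase; it is a minimal, standard-library-only example that computes a single dihedral (torsion) angle from four consecutive points and checks that the angle does not change when the points are rotated and translated together. The function names (`dihedral`, `rigid_transform`) and the example coordinates are hypothetical, chosen only for illustration, and the exact set of angles used by the authors is not assumed here.

```python
import math


def sub(a, b):
    """Component-wise difference a - b of two 3D points."""
    return (a[0] - b[0], a[1] - b[1], a[2] - b[2])


def dot(a, b):
    """Dot product of two 3D vectors."""
    return a[0] * b[0] + a[1] * b[1] + a[2] * b[2]


def cross(a, b):
    """Cross product of two 3D vectors."""
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def dihedral(p0, p1, p2, p3):
    """Signed dihedral angle (radians) defined by four consecutive points."""
    b0, b1, b2 = sub(p1, p0), sub(p2, p1), sub(p3, p2)
    n1 = cross(b0, b1)  # normal of the plane through p0, p1, p2
    n2 = cross(b1, b2)  # normal of the plane through p1, p2, p3
    b1_len = math.sqrt(dot(b1, b1))
    b1_hat = tuple(c / b1_len for c in b1)
    x = dot(n1, n2)
    y = dot(cross(n1, n2), b1_hat)
    return math.atan2(y, x)


def rigid_transform(p, theta_z, theta_x, shift):
    """Rotate p about the z axis, then the x axis, then translate by shift."""
    x, y, z = p
    x, y = (x * math.cos(theta_z) - y * math.sin(theta_z),
            x * math.sin(theta_z) + y * math.cos(theta_z))
    y, z = (y * math.cos(theta_x) - z * math.sin(theta_x),
            y * math.sin(theta_x) + z * math.cos(theta_x))
    return (x + shift[0], y + shift[1], z + shift[2])


# Four illustrative points (hypothetical coordinates, not real atom positions).
points = [(1.0, 0.0, 0.0), (0.0, 0.0, 0.0), (0.0, 0.0, 1.0), (0.0, 1.0, 1.0)]

# Apply the same arbitrary rotation and translation to every point.
moved = [rigid_transform(p, 0.7, -1.3, (5.0, -2.0, 3.5)) for p in points]

angle_before = dihedral(*points)
angle_after = dihedral(*moved)
print(angle_before, angle_after)
```

Because the angle depends only on the relative orientation of the points, a model that operates on such angles sees the same input regardless of how the structure is positioned in space, which is the property the abstract credits with removing the need for equivariant networks.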

Code Implementations: 1 repo