BMLG, Dec 10, 2024

Mask prior-guided denoising diffusion improves inverse protein folding

arXiv:2412.07815v2, 8 citations, h-index: 5, Nat Mach Intell
Originality: Highly original
AI Analysis

This work addresses inverse protein folding for computational biology, offering improved sequence design for proteins with structural uncertainties.

The paper tackles the challenge of predicting amino acid sequences for protein structures with high uncertainty, such as disordered regions, by proposing a mask-prior-guided denoising diffusion framework (MapDiff) that substantially outperforms state-of-the-art methods on four benchmarks.

Inverse protein folding generates valid amino acid sequences that can fold into a desired protein structure, with recent deep-learning advances showing strong potential and competitive performance. However, challenges remain, such as predicting elements with high structural uncertainty, including disordered regions. To tackle such low-confidence residue prediction, we propose a Mask-prior-guided denoising Diffusion (MapDiff) framework that accurately captures both structural information and residue interactions for inverse protein folding. MapDiff is a discrete diffusion probabilistic model that iteratively generates amino acid sequences with reduced noise, conditioned on a given protein backbone. To incorporate structural information and residue interactions, we develop a graph-based denoising network with a mask-prior pre-training strategy. Moreover, in the generative process, we combine the denoising diffusion implicit model with Monte-Carlo dropout to reduce uncertainty. Evaluation on four challenging sequence design benchmarks shows that MapDiff substantially outperforms state-of-the-art methods. Furthermore, the in silico sequences generated by MapDiff closely resemble the physico-chemical and structural characteristics of native proteins across different protein families and architectures.
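The two mechanisms named in the abstract, discrete noising of a residue sequence and averaging over repeated stochastic predictions, can be illustrated with a minimal sketch. It is not the MapDiff implementation: the uniform transition kernel, the cumulative keep-probability `alpha_bar`, and the helper names `marginal_probs`, `noise_sequence`, and `mc_average` are all assumptions made for illustration, and the `predict` callable stands in for a denoising network run with dropout active.

```python
import random

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residue types
K = len(AMINO_ACIDS)


def marginal_probs(residue, alpha_bar):
    """q(x_t | x_0) for one residue under an assumed uniform transition kernel.

    With probability alpha_bar the original residue is kept; the remaining
    mass (1 - alpha_bar) is spread uniformly over all K residue types.
    """
    uniform = (1.0 - alpha_bar) / K
    return [alpha_bar + uniform if aa == residue else uniform for aa in AMINO_ACIDS]


def noise_sequence(seq, alpha_bar, rng):
    """Sample a noised sequence x_t from a clean sequence x_0, residue by residue."""
    return "".join(
        rng.choices(AMINO_ACIDS, weights=marginal_probs(aa, alpha_bar))[0]
        for aa in seq
    )


def mc_average(predict, n_samples):
    """Average per-residue probability vectors over repeated stochastic passes.

    `predict` is a hypothetical zero-argument callable returning one list of
    K probabilities per residue, e.g. a network evaluated with dropout on.
    """
    runs = [predict() for _ in range(n_samples)]
    length = len(runs[0])
    return [
        [sum(run[i][k] for run in runs) / n_samples for k in range(K)]
        for i in range(length)
    ]
```

Calling `noise_sequence("MKTAYIAK", 0.5, random.Random(0))` returns a partially corrupted copy of the input; as `alpha_bar` approaches 0 the output approaches a uniformly random sequence, which is the starting point a reverse (denoising) process would refine.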

Code Implementations: 1 repo
