CE AI BM
Oct 12, 2024

Boltzmann-Aligned Inverse Folding Model as a Predictor of Mutational Effects on Protein-Protein Interactions

Xiaoran Jiao, Weian Mao, Wengong Jin, Peiyuan Yang, Hao Chen, Chunhua Shen

arXiv:2410.09543v15.96 citations
h-index: 16

Originality: Incremental advance

AI Analysis

This work addresses a critical challenge in drug design by improving ΔΔG prediction for protein-protein interactions, though it is incremental as it builds on existing inverse folding methods with a novel alignment approach.

The paper tackles the problem of predicting changes in binding free energy (ΔΔG) for protein-protein interactions by proposing the Boltzmann Alignment technique to transfer knowledge from pre-trained inverse folding models, achieving state-of-the-art Spearman coefficients of 0.3201 (unsupervised) and 0.5134 (supervised) on the SKEMPI v2 dataset.

Predicting the change in binding free energy ($ΔΔG$) is crucial for understanding and modulating protein-protein interactions, which are critical in drug design. Due to the scarcity of experimental $ΔΔG$ data, existing methods focus on pre-training while neglecting the importance of alignment. In this work, we propose the Boltzmann Alignment technique to transfer knowledge from pre-trained inverse folding models to $ΔΔG$ prediction. We begin by analyzing the thermodynamic definition of $ΔΔG$ and introducing the Boltzmann distribution to connect energy with the protein conformational distribution. However, the protein conformational distribution is intractable; therefore, we employ Bayes' theorem to circumvent direct estimation and instead utilize the log-likelihood provided by protein inverse folding models for $ΔΔG$ estimation. Compared to previous inverse folding-based methods, our method explicitly accounts for the unbound state of the protein complex in the $ΔΔG$ thermodynamic cycle, introducing a physical inductive bias and achieving state-of-the-art (SoTA) performance in both supervised and unsupervised settings. Experimental results on SKEMPI v2 indicate that our method achieves Spearman coefficients of 0.3201 (unsupervised) and 0.5134 (supervised), significantly surpassing the previously reported SoTA values of 0.2632 and 0.4324, respectively. Furthermore, we demonstrate the capability of our method on binding energy prediction, protein-protein docking, and antibody optimization tasks.
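The idea described in the abstract can be illustrated with a minimal sketch. It is a reading of the abstract, not the authors' implementation. It assumes that the binding free energy of a sequence is estimated as $-k_B T$ times the difference between the inverse folding log-likelihood of that sequence given the bound structure and given the unbound structure, and that the structure-only prior terms from Bayes' theorem cancel when the wild-type and mutant estimates are subtracted. The names `LogLikelihoods` and `boltzmann_ddg` are hypothetical, the log-likelihood values would come from an inverse folding model that is not shown here, and the use of a fixed physical $k_B T$ (rather than a fitted scale) is also an assumption.

```python
from dataclasses import dataclass

# Gas constant in kcal/(mol*K), i.e. the Boltzmann constant per mole.
R_KCAL_PER_MOL_K = 0.0019872041


@dataclass
class LogLikelihoods:
    """Hypothetical container for inverse folding scores of one sequence.

    bound:   log p(sequence | bound complex structure)
    unbound: log p(sequence | unbound, separated structures)
    """
    bound: float
    unbound: float


def boltzmann_ddg(wild_type: LogLikelihoods,
                  mutant: LogLikelihoods,
                  temperature_k: float = 298.15) -> float:
    """Estimate ddG = dG(mutant) - dG(wild type) from log-likelihoods.

    Assumption: dG(seq) ~ -kT * [log p(seq | bound) - log p(seq | unbound)],
    with structure-only prior terms cancelling in the difference.
    """
    kt = R_KCAL_PER_MOL_K * temperature_k
    dg_mutant = -kt * (mutant.bound - mutant.unbound)
    dg_wild_type = -kt * (wild_type.bound - wild_type.unbound)
    return dg_mutant - dg_wild_type


# Illustrative numbers only: the mutant gains less likelihood from the
# bound structure than the wild type does, so the estimate is positive
# (weaker binding under the sign convention assumed above).
wt = LogLikelihoods(bound=-10.0, unbound=-12.0)
mt = LogLikelihoods(bound=-11.0, unbound=-12.0)
ddg = boltzmann_ddg(wt, mt)
```

Dropping the `unbound` terms from this sketch leaves a score that depends only on the bound complex; the abstract identifies the explicit unbound-state term as what distinguishes the method from previous inverse folding-based approaches.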
