BMLGSep 3, 2025

SurGBSA: Learning Representations From Molecular Dynamics Simulations

arXiv:2509.03084v21 citationsh-index: 23
Originality Incremental advance
AI Analysis

This work addresses the bottleneck of slow physics-based calculations in drug discovery by enabling faster predictions for novel molecular structures.

The paper tackles the problem of using molecular dynamics simulations for representation learning by introducing SurGBSA, a surrogate model for MMGBSA, achieving a 27,927x speedup with only a -0.4% accuracy drop on pose ranking.

Self-supervised pretraining from static structures of drug-like compounds and proteins enable powerful learned feature representations. Learned features demonstrate state of the art performance on a range of predictive tasks including molecular properties, structure generation, and protein-ligand interactions. The majority of approaches are limited by their use of static structures and it remains an open question, how best to use atomistic molecular dynamics (MD) simulations to develop more generalized models to improve prediction accuracy for novel molecular structures. We present SURrogate mmGBSA (SurGBSA) as a new modeling approach for MD-based representation learning, which learns a surrogate function of the Molecular Mechanics Generalized Born Surface Area (MMGBSA). We show for the first time the benefits of physics-informed pre-training to train a surrogate MMGBSA model on a collection of over 1.4 million 3D trajectories collected from MD simulations of the CASF-2016 benchmark. SurGBSA demonstrates a dramatic 27,927x speedup versus a traditional physics-based single-point MMGBSA calculation while nearly matching single-point MMGBSA accuracy on the challenging pose ranking problem for identification of the correct top pose (-0.4% difference). Our work advances the development of molecular foundation models by showing model improvements when training on MD simulations. Models, code and training data are made publicly available.

Foundations

The foundational work for this paper's niche, ranked by how specifically the neighbourhood builds on it — not by global fame.

Your Notes