LG | Mar 3, 2025

How simple can you go? An off-the-shelf transformer approach to molecular dynamics

Max Eissler, Tim Korjakow, Stefan Ganscha, Oliver T. Unke, Klaus-Robert Müller, Stefan Gugler

arXiv:2503.01431v215.79 citations | h-index: 21 | Has Code | J Chem Phys

Originality: Incremental advance

AI Analysis

This work addresses the need for simpler, more general models in molecular dynamics, potentially reducing complexity for researchers, but it is incremental as it builds on existing trends questioning specialized architectures.

The authors simplify neural networks for molecular dynamics by using an off-the-shelf transformer with minimal modifications, achieving state-of-the-art results on several benchmarks after a small number of fine-tuning steps. They also propose a method for separating errors caused by non-equivariance from other sources of inaccuracy, such as numerical rounding. The model exhibits runaway energy increases on larger structures, although NVE simulations of small structures remain approximately energy-conserving.

Most current neural networks for molecular dynamics (MD) include physical inductive biases, resulting in specialized and complex architectures. This is in contrast to most other machine learning domains, where specialist approaches are increasingly replaced by general-purpose architectures trained on vast datasets. In line with this trend, several recent studies have questioned the necessity of architectural features commonly found in MD models, such as built-in rotational equivariance or energy conservation. In this work, we contribute to the ongoing discussion by evaluating the performance of an MD model with as few specialized architectural features as possible. We present a recipe for MD using an Edge Transformer, an "off-the-shelf" transformer architecture that has been minimally modified for the MD domain, termed MD-ET. Our model implements neither built-in equivariance nor energy conservation. We use a simple supervised pre-training scheme on $\sim$30 million molecular structures from the QCML database. Using this "off-the-shelf" approach, we show state-of-the-art results on several benchmarks after fine-tuning for a small number of steps. Additionally, we examine the effects of being only approximately equivariant and energy conserving for MD simulations, proposing a novel method for distinguishing the errors resulting from non-equivariance from other sources of inaccuracies like numerical rounding errors. While our model exhibits runaway energy increases on larger structures, we show approximately energy-conserving NVE simulations for a range of small structures.
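To make "approximately equivariant" concrete, the sketch below shows one generic way to quantify how far a force predictor is from rotational equivariance: rotate the input, predict forces, and compare with the rotated forces of the unrotated input. This is not the method proposed in the paper, and it does not by itself separate non-equivariance from rounding error. The function names (`random_rotation`, `equivariance_error`) and the two toy force fields are hypothetical stand-ins for a trained model.

```python
import numpy as np


def random_rotation(rng):
    """Draw a random proper rotation matrix (det = +1) via QR decomposition."""
    q, r = np.linalg.qr(rng.normal(size=(3, 3)))
    q = q * np.sign(np.diag(r))  # fix column signs to remove QR sign ambiguity
    if np.linalg.det(q) < 0:     # flip one axis if we landed on a reflection
        q[:, 0] = -q[:, 0]
    return q


def equivariance_error(force_fn, positions, n_rotations=32, seed=0):
    """Mean absolute deviation between F(R x) and R F(x) over random rotations.

    positions has shape (n_atoms, 3); force_fn maps positions to forces of
    the same shape. An exactly equivariant force_fn gives a value at the
    level of floating-point rounding.
    """
    rng = np.random.default_rng(seed)
    reference = force_fn(positions)
    errors = []
    for _ in range(n_rotations):
        rot = random_rotation(rng)
        forces_of_rotated = force_fn(positions @ rot.T)  # F(R x)
        rotated_forces = reference @ rot.T               # R F(x)
        errors.append(np.abs(forces_of_rotated - rotated_forces).mean())
    return float(np.mean(errors))


# Toy stand-ins for a learned model (assumptions for illustration only).
def harmonic_forces(positions):
    """Exactly equivariant: each atom is pulled toward the centroid."""
    return -(positions - positions.mean(axis=0))


def biased_forces(positions):
    """Not equivariant: adds a small force along a fixed lab-frame axis."""
    return harmonic_forces(positions) + np.array([0.01, 0.0, 0.0])


if __name__ == "__main__":
    atoms = np.random.default_rng(1).normal(size=(5, 3))
    print("equivariant toy model:", equivariance_error(harmonic_forces, atoms))
    print("biased toy model:     ", equivariance_error(biased_forces, atoms))
```

Running the same measurement on a model known to be exactly equivariant gives a rough estimate of the rounding-error floor, which is one simple baseline against which a non-equivariant model's error could be compared.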
