Iterative Pretraining Framework for Interatomic Potentials
This work targets efficient and accurate molecular dynamics simulations for researchers in the physical sciences, offering a domain-specific, incremental improvement over existing machine learning interatomic potentials.
The paper improves machine learning interatomic potentials (MLIPs) by addressing two weaknesses of existing pretraining strategies: objectives that are mismatched with downstream tasks, and a reliance on large labeled datasets. It proposes an iterative pretraining framework (IPIP) that, compared to general-purpose force fields, reduces prediction error by over 80% and achieves up to a 4x simulation speedup for the Mo-S-O system.
Machine learning interatomic potentials (MLIPs) enable efficient molecular dynamics (MD) simulations with ab initio accuracy and have been applied across various domains in the physical sciences. However, their performance often relies on large-scale labeled training data. While existing pretraining strategies can improve model performance, they often suffer from a mismatch between the objectives of pretraining and downstream tasks, or rely on extensive labeled datasets and increasingly complex architectures to achieve broad generalization. To address these challenges, we propose Iterative Pretraining for Interatomic Potentials (IPIP), a framework designed to iteratively improve the predictive performance of MLIP models. IPIP incorporates a forgetting mechanism to prevent iterative training from converging to suboptimal local minima. Unlike general-purpose foundation models, which frequently underperform on specialized tasks due to a trade-off between generality and system-specific accuracy, IPIP achieves higher accuracy and efficiency using lightweight architectures. Compared to general-purpose force fields, this approach achieves a reduction of over 80% in prediction error and a speedup of up to 4x in the challenging Mo-S-O system, enabling fast and accurate simulations.
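The abstract does not specify how the iterative rounds or the forgetting mechanism are implemented, so the following is only a minimal sketch of the general control flow, not the IPIP method itself. It assumes a toy harmonic bond energy as a stand-in for ab initio labels, a three-parameter linear model in place of an MLIP, and a simple weight-shrinking step as a hypothetical form of forgetting. All names (`reference_energy`, `iterative_training`, `forget`) are illustrative. Because this toy problem is convex, it has no suboptimal local minima; the sketch shows only where a forgetting step could sit between training rounds.

```python
import random

def reference_energy(r, k=10.0, r0=1.0):
    """Toy stand-in for an ab initio label: harmonic bond energy (assumption)."""
    return 0.5 * k * (r - r0) ** 2

def features(r, r0=1.0):
    """Polynomial features of the bond stretch x = r - r0."""
    x = r - r0
    return [1.0, x, x * x]

def predict(w, r):
    return sum(wj * fj for wj, fj in zip(w, features(r)))

def mean_abs_error(w, data):
    return sum(abs(predict(w, r) - e) for r, e in data) / len(data)

def train(w, data, steps=500, lr=0.3):
    """Full-batch gradient descent on the mean squared error."""
    w = list(w)
    for _ in range(steps):
        grad = [0.0] * len(w)
        for r, e in data:
            err = predict(w, r) - e
            for j, fj in enumerate(features(r)):
                grad[j] += 2.0 * err * fj / len(data)
        w = [wj - lr * gj for wj, gj in zip(w, grad)]
    return w

def iterative_training(train_set, val_set, rounds=5, forget=0.5):
    """Repeat (forget, retrain, validate) and keep the best weights seen."""
    w = [0.0, 0.0, 0.0]
    history = [mean_abs_error(w, val_set)]  # error of the untrained model
    best_w, best_err = w, history[0]
    for _ in range(rounds):
        # Hypothetical forgetting step: shrink the weights toward zero so the
        # next round does not simply continue from the previous solution.
        w = [(1.0 - forget) * wj for wj in w]
        w = train(w, train_set)
        err = mean_abs_error(w, val_set)
        history.append(err)
        if err < best_err:
            best_w, best_err = w, err
    return best_w, history

rng = random.Random(0)  # fixed seed for reproducibility
train_set = [(r, reference_energy(r)) for r in (rng.uniform(0.5, 1.5) for _ in range(20))]
val_set = [(r, reference_energy(r)) for r in (rng.uniform(0.5, 1.5) for _ in range(20))]

best_w, history = iterative_training(train_set, val_set)
print("validation MAE per round:", [round(h, 4) for h in history])
```

In a realistic setting the linear model would be replaced by a lightweight MLIP, the toy energy by reference calculations, and the shrinking step by whatever forgetting mechanism the framework actually uses. The loop structure of forgetting, retraining, and validating is the only part this sketch is meant to convey.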