MTRL-SCI AI LG

May 11, 2024

Overcoming systematic softening in universal machine learning interatomic potentials by fine-tuning

Bowen Deng, Yunyeong Choi, Peichen Zhong, Janosh Riebesell, Shashwat Anand, Zhuohan Li, KyuJung Jun, Kristin A. Persson, Gerbrand Ceder

arXiv:2405.07105v17.332 citations h-index: 158

Originality: Incremental advance

AI Analysis

This work addresses a key limitation in materials science simulations by improving the accuracy of universal interatomic potentials in complex atomic environments. It is an incremental advance because it builds on existing uMLIP frameworks rather than introducing a new one.

The study identified a systematic softening effect in universal machine learning interatomic potentials (uMLIPs), where they under-predict energy and force in out-of-distribution atomic environments, and showed that fine-tuning with a single additional data point effectively corrects this issue.

Machine learning interatomic potentials (MLIPs) have introduced a new paradigm for atomic simulations. Recent advancements have seen the emergence of universal MLIPs (uMLIPs) that are pre-trained on diverse materials datasets, providing opportunities for both ready-to-use universal force fields and robust foundations for downstream machine learning refinements. However, their performance in extrapolating to out-of-distribution complex atomic environments remains unclear. In this study, we highlight a consistent potential energy surface (PES) softening effect in three uMLIPs: M3GNet, CHGNet, and MACE-MP-0, which is characterized by energy and force under-prediction in a series of atomic-modeling benchmarks including surfaces, defects, solid-solution energetics, phonon vibration modes, ion migration barriers, and general high-energy states. We find that the PES softening behavior originates from a systematic underprediction error of the PES curvature, which derives from the biased sampling of near-equilibrium atomic arrangements in uMLIP pre-training datasets. We demonstrate that the PES softening issue can be effectively rectified by fine-tuning with a single additional data point. Our findings suggest that a considerable fraction of uMLIP errors are highly systematic, and can therefore be efficiently corrected. This result rationalizes the data-efficient fine-tuning performance boost commonly observed with foundational MLIPs. We argue for the importance of a comprehensive materials dataset with improved PES sampling for next-generation foundational MLIPs.
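To make the idea of a systematic curvature error concrete, the following toy sketch may help. It is not the authors' method or code: it assumes a one-dimensional harmonic PES, and the curvature values, displacements, and function names are all hypothetical. It shows that a model with an under-predicted curvature under-predicts both energy and force by the same factor, and that a single reference force away from equilibrium is enough to recover that factor in this idealized case.

```python
# Toy illustration only: a 1D harmonic PES, E(x) = 0.5 * k * x**2.
# All numbers and names here are hypothetical, not taken from the paper.

def harmonic_energy(k, x):
    """Energy of a harmonic well with curvature k at displacement x."""
    return 0.5 * k * x ** 2

def harmonic_force(k, x):
    """Force is the negative gradient of the energy: F = -k * x."""
    return -k * x

def softening_scale(pred_forces, ref_forces):
    """Least-squares slope (through the origin) of predicted vs. reference forces.

    A slope below 1 means the model systematically under-predicts forces.
    """
    numerator = sum(p * r for p, r in zip(pred_forces, ref_forces))
    denominator = sum(r * r for r in ref_forces)
    return numerator / denominator

K_REF = 10.0    # assumed "true" curvature of the reference PES
K_MODEL = 7.5   # assumed softened curvature predicted by the model

displacements = [-0.3, -0.1, 0.1, 0.2, 0.4]
ref_forces = [harmonic_force(K_REF, x) for x in displacements]
pred_forces = [harmonic_force(K_MODEL, x) for x in displacements]

# The softened model under-predicts every force by the same factor.
scale = softening_scale(pred_forces, ref_forces)

# One off-equilibrium reference point fixes the systematic error in this toy case.
x_new = 0.25
correction = harmonic_force(K_REF, x_new) / harmonic_force(K_MODEL, x_new)
k_corrected = K_MODEL * correction

print(f"softening scale: {scale:.3f}")
print(f"corrected curvature: {k_corrected:.3f}")
```

A real uMLIP is a high-dimensional neural network, so fine-tuning updates many parameters rather than a single scale factor. The sketch only illustrates why an error that is systematic, rather than random, can in principle be corrected with very little additional data.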
