LG · Jun 2, 2025

Self-Refining Training for Amortized Density Functional Theory

Majdi Hassan, Cristian Gabellini, Hatem Helal, Dominique Beaini, Kirill Neklyudov

arXiv:2506.01225v14.1 · h-index: 15 · Has Code

Originality: Incremental advance

AI Analysis

This work addresses the scalability problem that computational chemists and materials scientists face with DFT by making DFT approximations more efficient, though it is an incremental improvement over existing amortized DFT methods.

The paper tackles the high computational cost of Density Functional Theory (DFT) calculations by proposing a self-refining training strategy that reduces dependency on large pre-collected datasets, achieving competitive accuracy with up to 50% less data in experiments.

Density Functional Theory (DFT) allows for predicting all the chemical and physical properties of molecular systems from first principles by finding an approximate solution to the many-body Schrödinger equation. However, the cost of these predictions becomes infeasible when increasing the scale of the energy evaluations, e.g., when calculating the ground-state energy for simulating molecular dynamics. Recent works have demonstrated that, for substantially large datasets of molecular conformations, Deep Learning-based models can predict the outputs of the classical DFT solvers by amortizing the corresponding optimization problems. In this paper, we propose a novel method that reduces the dependency of amortized DFT solvers on large pre-collected datasets by introducing a self-refining training strategy. Namely, we propose an efficient method that simultaneously trains a deep-learning model to predict the DFT outputs and samples molecular conformations that are used as training data for the model. We derive our method as a minimization of the variational upper bound on the KL-divergence measuring the discrepancy between the generated samples and the target Boltzmann distribution defined by the ground state energy. To demonstrate the utility of the proposed scheme, we perform an extensive empirical study comparing it with the models trained on the pre-collected datasets. Finally, we open-source our implementation of the proposed algorithm, optimized with asynchronous training and sampling stages, which enables simultaneous sampling and training. Code is available at https://github.com/majhas/self-refining-dft.
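The alternation between sampling and training described in the abstract can be illustrated with a minimal toy sketch. This is not the authors' implementation: it assumes a one-dimensional "conformation" `x`, a single scalar solver variable `c` standing in for the DFT outputs, a linear model in place of a deep network, and a synchronous loop in place of the asynchronous stages mentioned above. All names (`target_c`, `energy`, `self_refine`) are hypothetical. The key idea it shows is that the model is trained by minimizing the energy itself on conformations drawn from Langevin dynamics under the model's own energy estimate, so no pre-collected labels are needed.

```python
import math
import random


def target_c(x):
    # Hypothetical toy: the exact minimiser of energy(x, c) over c.
    # It plays the role of a converged solver output and is never used as a label.
    return 2.0 * x - 1.0


def energy(x, c):
    # Toy energy. Its minimum over c is 0.5 * x**2 (the "ground-state" energy),
    # so any predicted c gives an upper bound on the ground-state energy.
    return 0.5 * x * x + (c - target_c(x)) ** 2


def self_refine(steps=3000, n_chains=16, batch=32, lr=0.05, eta=0.01, seed=0):
    rng = random.Random(seed)
    a, b = 0.0, 0.0  # linear model: c = a * x + b
    chains = [rng.gauss(0.0, 1.0) for _ in range(n_chains)]
    buffer = list(chains)
    for _ in range(steps):
        # Sampling stage: one Langevin step per chain on the surrogate energy
        # U(x) = energy(x, a * x + b), targeting exp(-U) at unit temperature.
        for i, x in enumerate(chains):
            resid = a * x + b - target_c(x)
            # dU/dx; the factor (a - 2.0) is d(resid)/dx for this toy target.
            grad_x = x + 2.0 * resid * (a - 2.0)
            noise = math.sqrt(2.0 * eta) * rng.gauss(0.0, 1.0)
            chains[i] = x - eta * grad_x + noise
        buffer.extend(chains)
        del buffer[:-1024]  # keep only the most recent samples

        # Training stage: gradient step on the mean energy over a minibatch
        # of self-generated samples (d energy / d c = 2 * resid).
        grad_a = grad_b = 0.0
        for x in rng.sample(buffer, batch):
            resid = a * x + b - target_c(x)
            grad_a += 2.0 * resid * x
            grad_b += 2.0 * resid
        a -= lr * grad_a / batch
        b -= lr * grad_b / batch
    return a, b, buffer


if __name__ == "__main__":
    a, b, samples = self_refine()
    gap = sum(energy(x, a * x + b) - 0.5 * x * x for x in samples) / len(samples)
    print(f"a={a:.4f}, b={b:.4f}, mean energy gap={gap:.2e}")
```

In this toy setting the surrogate energy tightens toward the ground-state energy as the model improves, so the sampler's target approaches the Boltzmann distribution of the true ground state, which mirrors the upper-bound argument in the abstract at a much smaller scale.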
