COMP-PH LG
Mar 17, 2023

Towards a Foundation Model for Neural Network Wavefunctions

Michael Scherbela, Leon Gerard, Philipp Grohs

arXiv:2303.09949v1
6.611 citations, h-index: 34

Originality: Highly original

AI Analysis

This work addresses a computational bottleneck in quantum chemistry: it proposes a potential foundation model to accelerate ab-initio energy calculations, building incrementally on existing neural network wavefunction methods.

The authors tackle the high computational cost of optimizing a neural network wavefunction from scratch for each new quantum chemical system. They propose a novel ansatz that maps cheap Hartree-Fock orbitals to accurate neural network orbitals, which enables transfer learning from smaller fragments to larger compounds with minimal fine-tuning.

Deep neural networks have become a highly accurate and powerful wavefunction ansatz in combination with variational Monte Carlo methods for solving the electronic Schrödinger equation. However, despite their success and favorable scaling, these methods are still computationally too costly for wide adoption. A significant obstacle is the requirement to optimize the wavefunction from scratch for each new system, thus requiring long optimization. In this work, we propose a novel neural network ansatz, which effectively maps uncorrelated, computationally cheap Hartree-Fock orbitals to correlated, high-accuracy neural network orbitals. This ansatz is inherently capable of learning a single wavefunction across multiple compounds and geometries, as we demonstrate by successfully transferring a wavefunction model pre-trained on smaller fragments to larger compounds. Furthermore, we provide ample experimental evidence to support the idea that extensive pre-training of such a generalized wavefunction model across different compounds and geometries could lead to a foundation wavefunction model. Such a model could yield high-accuracy ab-initio energies using only minimal computational effort for fine-tuning and evaluation of observables.
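
To make the idea of one parameter set shared across systems concrete, here is a minimal toy sketch in Python (NumPy). It is not the authors' architecture. The function names, the per-electron `tanh` embedding, the random "Hartree-Fock descriptors", and all dimensions are assumptions chosen for illustration only. Each orbital's weights are produced from a fixed-size descriptor of that orbital, so the same parameters apply to systems with different numbers of electrons. The determinant keeps the wavefunction antisymmetric under electron exchange.

```python
import numpy as np

def init_params(rng, d_embed=8, d_desc=4):
    # Hypothetical shared parameters; their shapes do not depend on system size.
    return {
        "W_el": rng.normal(size=(3, d_embed)),        # toy per-electron embedding
        "W_orb": rng.normal(size=(d_desc, d_embed)),  # orbital descriptor -> orbital weights
    }

def orbital_matrix(params, r_el, hf_desc):
    # r_el:    (n_el, 3) electron positions
    # hf_desc: (n_orb, d_desc) stand-in for per-orbital Hartree-Fock information
    h = np.tanh(r_el @ params["W_el"])   # (n_el, d_embed), one row per electron
    w = hf_desc @ params["W_orb"]        # (n_orb, d_embed), one row per orbital
    return h @ w.T                       # (n_el, n_orb), entry [i, k] = orbital k at electron i

def log_abs_psi(params, r_el, hf_desc):
    # Slater-determinant-style wavefunction: returns (sign, log|psi|).
    return np.linalg.slogdet(orbital_matrix(params, r_el, hf_desc))

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    params = init_params(rng)
    # The same parameters are evaluated on a 2-electron and a 4-electron toy system.
    for n in (2, 4):
        r = rng.normal(size=(n, 3))
        desc = rng.normal(size=(n, 4))  # random placeholder, not real HF output
        print(n, log_abs_psi(params, r, desc))
```

In this toy, swapping two electron positions swaps two rows of the orbital matrix, so the sign of the determinant flips while its magnitude is unchanged. A real model would use electron embeddings that depend on all particles and descriptors derived from an actual Hartree-Fock calculation.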
