CHEM-PH | LG | COMP-PH | Oct 15, 2024

Benchmarking Data Efficiency in $Δ$-ML and Multifidelity Models for Quantum Chemistry

arXiv:2410.11391v3 | 1.2 | h-index: 4 | Has Code

Originality: Synthesis-oriented

AI Analysis

The paper addresses reducing the training-data cost of ML for quantum chemistry calculations, but it is incremental: it benchmarks existing methods alongside one new hybrid approach.

This work compares the data efficiency of Δ-ML, multifidelity ML, and a new MFΔML method for predicting quantum chemistry properties. It finds that multifidelity methods outperform standard Δ-ML when many predictions are needed, while MFΔML is more efficient when only a few evaluations are needed.

The development of machine learning (ML) methods has made quantum chemistry (QC) calculations more accessible by reducing the compute cost incurred in conventional QC methods. This cost has since shifted to the overhead of generating training data. Efforts to reduce that overhead led to the development of $Δ$-ML and multifidelity machine learning methods, which use data at more than one QC level of accuracy, or fidelity. This work compares the data costs associated with $Δ$-ML, multifidelity machine learning (MFML), and optimized MFML (o-MFML) against a newly introduced Multifidelity $Δ$-Machine Learning (MF$Δ$ML) method for the prediction of ground state energies, vertical excitation energies, and the magnitude of the electronic contribution to molecular dipole moments from the multifidelity benchmark dataset QeMFi. The assessment is based on the training data generation cost associated with each model and is compared with the single-fidelity kernel ridge regression (KRR) case. The results indicate that multifidelity methods surpass the standard $Δ$-ML approach when a large number of predictions is required. For applications that require only a few evaluations of the ML model, the $Δ$-ML method might be favored, but the MF$Δ$ML method is shown to be more efficient.
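For readers unfamiliar with the two baselines named in the abstract, the following is a minimal sketch of single-fidelity KRR next to a Δ-ML model, written with NumPy only. It is not the authors' implementation and does not use QeMFi: the functions `low_fidelity` and `high_fidelity` are hypothetical one-dimensional stand-ins for a cheap and an expensive QC method, and the kernel width `sigma` and regularizer `lam` are arbitrary illustrative choices.

```python
import numpy as np


def gaussian_kernel(a, b, sigma):
    """Gaussian kernel matrix between the rows of a and the rows of b."""
    d2 = ((a[:, None, :] - b[None, :, :]) ** 2).sum(axis=-1)
    return np.exp(-d2 / (2.0 * sigma ** 2))


def krr_fit(x_train, y_train, sigma=0.5, lam=1e-6):
    """Solve (K + lam * I) alpha = y for the KRR coefficients alpha."""
    k = gaussian_kernel(x_train, x_train, sigma)
    return np.linalg.solve(k + lam * np.eye(len(x_train)), y_train)


def krr_predict(x_train, alpha, x_query, sigma=0.5):
    """Evaluate the fitted KRR model at the query points."""
    return gaussian_kernel(x_query, x_train, sigma) @ alpha


# Hypothetical toy "fidelities" (assumptions, not real QC methods).
def low_fidelity(x):
    return np.sin(3.0 * x[:, 0])


def high_fidelity(x):
    return np.sin(3.0 * x[:, 0]) + 0.3 * x[:, 0] ** 2


x_train = np.linspace(-1.0, 1.0, 12)[:, None]
x_test = np.linspace(-1.0, 1.0, 50)[:, None]

# Single-fidelity KRR: learn the high-fidelity property directly.
alpha_direct = krr_fit(x_train, high_fidelity(x_train))
pred_direct = krr_predict(x_train, alpha_direct, x_test)

# Delta-ML: learn only the high-minus-low difference, then add the
# low-fidelity value back at prediction time.
delta_train = high_fidelity(x_train) - low_fidelity(x_train)
alpha_delta = krr_fit(x_train, delta_train)
pred_delta = low_fidelity(x_test) + krr_predict(x_train, alpha_delta, x_test)

mae_direct = np.mean(np.abs(pred_direct - high_fidelity(x_test)))
mae_delta = np.mean(np.abs(pred_delta - high_fidelity(x_test)))
print(f"MAE single-fidelity KRR: {mae_direct:.2e}")
print(f"MAE delta-ML:            {mae_delta:.2e}")
```

Note that the Δ-ML prediction calls `low_fidelity(x_test)`, so every query needs a fresh low-fidelity calculation. In this sketch that per-prediction cost is what separates Δ-ML from a model that needs no QC calculation at query time, which is the kind of trade-off the abstract's cost comparison is about.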
