MTRL-SCI LG CHEM-PH Jun 24, 2021

Machine learning to tame divergent density functional approximations: a new path to consensus materials design principles

Chenru Duan, Shuxin Chen, Michael G. Taylor, Fang Liu, Heather J. Kulik

arXiv:2106.13109v13.320 citations

Originality: Incremental advance

AI Analysis

This work addresses the challenge of DFA bias in computational materials discovery and offers researchers in chemistry and materials science a more reliable screening method, though its contribution is an incremental improvement to existing workflows.

The paper tackles the problem of divergent predictions from different density functional approximations (DFAs) in materials screening, particularly for transition metal complexes, by introducing a consensus-based machine learning approach that improves correspondence with experimental compounds over single-DFA methods.

Computational virtual high-throughput screening (VHTS) with density functional theory (DFT) and machine-learning (ML) acceleration is essential in rapid materials discovery. By necessity, efficient DFT-based workflows are carried out with a single density functional approximation (DFA). Nevertheless, properties evaluated with different DFAs can be expected to disagree in cases with challenging electronic structure (e.g., open-shell transition metal complexes, TMCs), for which rapid screening is most needed and accurate benchmarks are often unavailable. To quantify the effect of DFA bias, we introduce an approach to rapidly obtain property predictions from 23 representative DFAs spanning multiple families and "rungs" (e.g., semi-local to double-hybrid) and basis sets on over 2,000 TMCs. Although computed properties (e.g., spin-state ordering and frontier orbital gap) naturally differ by DFA, high linear correlations persist across all DFAs. We train independent ML models for each DFA and observe convergent trends in feature importance; these features thus provide DFA-invariant, universal design rules. We devise a strategy to train ML models informed by all 23 DFAs and use them to predict properties (e.g., spin-splitting energy) of over 182k TMCs. By requiring consensus of the ANN-predicted DFA properties, we improve correspondence of these computational lead compounds with literature-mined, experimental compounds over the single-DFA approach typically employed. Both feature analysis and consensus-based ML provide efficient, alternative paths to overcome accuracy limitations of practical DFT.
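
The consensus idea in the abstract can be illustrated with a minimal sketch. The abstract does not state the exact consensus criterion, so the rule below is an assumption: a candidate counts as a lead only if at least a chosen fraction of the per-DFA predictions place the property inside a target window. The function name `consensus_leads`, the `min_fraction` parameter, the window bounds, and the toy numbers are all hypothetical and are not taken from the paper.

```python
def consensus_leads(predictions, lo, hi, min_fraction=0.8):
    """Return indices of candidates that enough DFAs place inside [lo, hi].

    predictions: one row per candidate, one predicted value per DFA.
    min_fraction: assumed consensus threshold (hypothetical, not from the paper).
    """
    leads = []
    for idx, row in enumerate(predictions):
        # Count how many DFA-specific predictions fall in the target window.
        votes = sum(lo <= value <= hi for value in row)
        if votes / len(row) >= min_fraction:
            leads.append(idx)
    return leads


# Made-up predictions for 4 candidates from 3 hypothetical DFA-specific models.
toy_predictions = [
    [1.0, -2.0, 3.0],
    [8.0, 2.0, -1.0],
    [12.0, 15.0, 9.0],
    [-4.0, 4.5, 0.0],
]

# Require every DFA to agree that the property lies in the window.
consensus = consensus_leads(toy_predictions, lo=-5.0, hi=5.0, min_fraction=1.0)

# Single-DFA screening for comparison: use only the second model's column.
single_dfa = consensus_leads([[row[1]] for row in toy_predictions],
                             lo=-5.0, hi=5.0, min_fraction=1.0)
```

In this toy data the single-DFA screen keeps candidate 1, which the consensus screen rejects because one of the other models places it outside the window. That is the kind of DFA-dependent lead the consensus requirement is meant to filter out.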
