Known Unknowns: Out-of-Distribution Property Prediction in Materials and Molecules
This work is aimed at researchers and engineers in materials science and molecular design who need to discover high-performance materials and molecules. It offers an incremental improvement over existing methods.
The authors tackle out-of-distribution (OOD) property prediction for materials and molecules, improving the True Positive Rate (TPR) of OOD classification by 3x for materials and 2.5x for molecules. The approach also improved precision by 2x and 1.5x compared to non-transductive baselines.
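To make the two reported metrics concrete, the following minimal sketch computes the True Positive Rate and precision for an OOD classification task. It assumes that a sample counts as OOD when its true property value exceeds a chosen threshold (for example, the largest value seen in training) and that a model "flags" a sample when its predicted value exceeds the same threshold. The function name `ood_classification_metrics`, the threshold, and the toy numbers are all hypothetical; the paper's exact evaluation protocol may differ.

```python
import numpy as np

def ood_classification_metrics(y_true, y_pred, threshold):
    """Return (TPR, precision) for flagging samples above `threshold`.

    Assumption: "OOD" means the property value is greater than `threshold`.
    """
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)

    actual = y_true > threshold    # samples that really are OOD
    flagged = y_pred > threshold   # samples the model predicts as OOD

    tp = int(np.sum(actual & flagged))    # OOD and flagged
    fn = int(np.sum(actual & ~flagged))   # OOD but missed
    fp = int(np.sum(~actual & flagged))   # flagged but not OOD

    tpr = tp / (tp + fn) if (tp + fn) else float("nan")
    precision = tp / (tp + fp) if (tp + fp) else float("nan")
    return tpr, precision

# Toy values for illustration only; these are not results from the paper.
threshold = 10.0
y_true = [12.0, 11.0, 9.0, 8.0, 13.0]
y_pred = [11.5, 9.5, 10.5, 7.0, 12.0]
tpr, precision = ood_classification_metrics(y_true, y_pred, threshold)
```

Under this reading, a "3x improvement in TPR" means the model recovers three times as large a fraction of the truly OOD samples as the baseline does at the same threshold.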
Discovery of high-performance materials and molecules requires identifying extremes with property values that fall outside the known distribution. Therefore, the ability to extrapolate to out-of-distribution (OOD) property values is critical for both solid-state materials and molecular design. Our objective is to train predictor models that extrapolate zero-shot to property ranges higher than those in the training data, given the chemical compositions of solids or molecular graphs and their property values. We propose a transductive approach to OOD property prediction that improves prediction accuracy. In particular, the True Positive Rate (TPR) of OOD classification of materials and molecules improved by 3x and 2.5x, respectively, and precision improved by 2x and 1.5x compared to non-transductive baselines. Our method leverages analogical input-target relations in the training and test sets, enabling generalization beyond the training target support, and can be applied to other materials and molecular tasks.
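One way to picture how analogical input-target relations can reach beyond the training target support is to predict a query's property as a known training value plus a learned effect of the input difference. The sketch below is an illustrative toy under strong assumptions, not the authors' model: the inputs are random 3-dimensional vectors rather than compositions or molecular graphs, the ground truth (`w_true`) is linear and noise-free, the difference model is ordinary least squares, and the helper `predict_transductive` is a hypothetical name. In such a linear toy, plain regression would extrapolate equally well; the point is only the shape of the computation.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical ground truth used only to generate toy data.
w_true = np.array([2.0, -1.0, 0.5])
x_train = rng.uniform(0.0, 1.0, size=(50, 3))
y_train = x_train @ w_true  # training targets never exceed 2.5

# Build all training pairs (i, j) and learn how an input difference
# maps to a target difference.
i, j = np.triu_indices(len(x_train), k=1)
dx = x_train[i] - x_train[j]
dy = y_train[i] - y_train[j]
w_diff, *_ = np.linalg.lstsq(dx, dy, rcond=None)

def predict_transductive(x_query):
    """Predict via an anchor: y_anchor + learned effect of (x_query - x_anchor)."""
    # Assumption: use the nearest training point as the anchor.
    k = int(np.argmin(np.linalg.norm(x_train - x_query, axis=1)))
    return y_train[k] + (x_query - x_train[k]) @ w_diff

# A query whose true target (3.75) lies above every training target.
x_query = np.array([1.5, 0.0, 1.5])
y_hat = predict_transductive(x_query)
```

The prediction is expressed relative to a training example, so the output is not bounded by the range of `y_train`, which is the property that matters when searching for extremes.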