Self-supervised learning for crystal property prediction via denoising
This work addresses the scarcity of labeled data in materials science, a problem relevant to researchers and engineers, though the contribution is incremental because it applies existing SSL concepts to a specific domain.
The paper tackles the problem of limited labeled data for crystal property prediction by proposing a self-supervised learning strategy based on denoising perturbed structures, resulting in models that outperform non-SSL approaches across material types, properties, and dataset sizes.
Accurate prediction of the properties of crystalline materials is crucial for targeted materials discovery, and this prediction is increasingly done with data-driven models. However, for many properties of interest, the number of materials for which that property has been determined is much smaller than the number of known materials. To overcome this disparity, we propose a novel self-supervised learning (SSL) strategy for material property prediction. Our approach, crystal denoising self-supervised learning (CDSSL), pretrains predictive models (e.g., graph networks) with a pretext task based on recovering valid material structures from perturbed versions of those structures. We demonstrate that CDSSL models outperform models trained without SSL across material types, properties, and dataset sizes.
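
A denoising pretext task of this kind can be illustrated with a minimal sketch. The abstract does not specify how structures are perturbed or what the model is asked to recover, so everything here is an assumption made for illustration: Gaussian noise on fractional atomic coordinates, a per-site corrective displacement as the target, and a mean squared error loss. The function names (perturb_structure, minimum_image_delta, denoising_loss) and the value of sigma are hypothetical and are not taken from the paper.

```python
import random


def perturb_structure(frac_coords, sigma=0.05, seed=0):
    """Add Gaussian noise to fractional coordinates and wrap into [0, 1).

    Assumption: the perturbation is isotropic Gaussian noise on atomic
    positions; the paper may use a different corruption scheme.
    """
    rng = random.Random(seed)
    return [[(x + rng.gauss(0.0, sigma)) % 1.0 for x in site]
            for site in frac_coords]


def minimum_image_delta(clean, noisy):
    """Displacement from each noisy site back to its clean site.

    Uses the minimum-image convention so that every component lies in
    [-0.5, 0.5), which respects the periodicity of the unit cell.
    """
    return [[((c - n + 0.5) % 1.0) - 0.5 for c, n in zip(c_site, n_site)]
            for c_site, n_site in zip(clean, noisy)]


def denoising_loss(predicted_delta, target_delta):
    """Mean squared error between predicted and true displacements."""
    total, count = 0.0, 0
    for p_site, t_site in zip(predicted_delta, target_delta):
        for p, t in zip(p_site, t_site):
            total += (p - t) ** 2
            count += 1
    return total / count


# Toy two-site cell in fractional coordinates (illustrative only).
clean = [[0.0, 0.0, 0.0], [0.5, 0.5, 0.5]]
noisy = perturb_structure(clean, sigma=0.05, seed=0)
target = minimum_image_delta(clean, noisy)

# A model that predicts "no correction" incurs a positive loss,
# while a model that predicts the exact correction incurs zero loss.
zero_guess = [[0.0, 0.0, 0.0] for _ in clean]
loss_untrained = denoising_loss(zero_guess, target)
loss_perfect = denoising_loss(target, target)
```

In an actual pretraining run, the zero_guess placeholder would be replaced by the output of the predictive model (e.g., a graph network) applied to the noisy structure, and the pretrained weights would then be fine-tuned on the smaller labeled property dataset.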