Large-scale machine-learning-assisted exploration of the whole materials space
This work addresses data imbalance in the training sets used for machine learning in materials science, with the aim of accelerating the discovery of stable and functional inorganic compounds. It represents a strong, specific advance rather than a broad paradigm shift.
The authors tackled biases in crystal-graph networks by computing additional data to balance the representation of chemical elements and structural prototypes, achieving unprecedented generalization accuracy and enabling reliable exploration of the space of inorganic compounds. They applied the resulting network to screen about 1 billion compounds and, after validation with density-functional theory, uncovered 19,512 new thermodynamically stable materials and ~150,000 compounds within 50 meV/atom of the convex hull. Several of these show extreme values of properties relevant to superconductors, superhard materials, and gap deformation potentials.
Crystal-graph attention networks have emerged recently as remarkable tools for the prediction of thermodynamic stability and materials properties from unrelaxed crystal structures. However, previous networks trained on two million materials exhibited strong biases originating from underrepresented chemical elements and structural prototypes in the available data. We tackled this issue by computing additional data to provide better balance across both chemical and crystal-symmetry space. Crystal-graph networks trained with this new data show unprecedented generalization accuracy and allow for reliable, accelerated exploration of the whole space of inorganic compounds. We applied this universal network to perform machine-learning-assisted high-throughput materials searches including 2500 binary and ternary structure prototypes and spanning about 1 billion compounds. After validation using density-functional theory, we uncover in total 19,512 additional materials on the convex hull of thermodynamic stability and ~150,000 compounds with a distance of less than 50 meV/atom from the hull. Combining machine learning and ab-initio methods once more, we finally evaluate the discovered materials for applications as superconductors and superhard materials, and we look for candidates with large gap deformation potentials, finding several compounds with extreme values of these properties.
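The stability criterion used above (a compound lies on the convex hull, or within 50 meV/atom of it) can be illustrated with a minimal sketch for a binary A-B system. This is not the authors' workflow: the function names `lower_hull` and `energy_above_hull` and the formation energies in `candidates` are hypothetical, invented for illustration. The search described in the abstract also covers ternary prototypes and uses energies validated with density-functional theory, which this sketch does not attempt.

```python
from typing import Dict, List, Tuple

# (atomic fraction of element B, formation energy in eV/atom)
Point = Tuple[float, float]


def lower_hull(entries: List[Point]) -> List[Point]:
    """Lower convex hull of a binary A-B system (Andrew's monotone chain)."""
    # Keep only the lowest-energy entry at each composition. The pure
    # elements (x = 0 and x = 1) define the zero of formation energy.
    best: Dict[float, float] = {0.0: 0.0, 1.0: 0.0}
    for x, e in entries:
        best[x] = min(e, best.get(x, e))

    hull: List[Point] = []
    for p in sorted(best.items()):
        # Pop the last vertex while it lies on or above the line from
        # hull[-2] to p, i.e. while the turn is not counter-clockwise.
        while len(hull) >= 2:
            (x1, e1), (x2, e2) = hull[-2], hull[-1]
            cross = (x2 - x1) * (p[1] - e1) - (e2 - e1) * (p[0] - x1)
            if cross > 0:
                break
            hull.pop()
        hull.append(p)
    return hull


def energy_above_hull(hull: List[Point], x: float, e: float) -> float:
    """Distance (eV/atom) of an entry above the hull at composition x."""
    for (x1, e1), (x2, e2) in zip(hull, hull[1:]):
        if x1 <= x <= x2:
            # Linear interpolation along the hull segment containing x.
            t = (x - x1) / (x2 - x1)
            return e - (e1 + t * (e2 - e1))
    raise ValueError("composition x must lie in [0, 1]")


# Hypothetical formation energies, for illustration only.
candidates = [(0.25, -0.10), (0.50, -0.40), (0.50, -0.37), (0.75, -0.17)]
hull = lower_hull(candidates)

THRESHOLD = 0.050  # 50 meV/atom, the cutoff quoted in the abstract
for x, e in candidates:
    d = energy_above_hull(hull, x, e)
    if d < 1e-9:
        label = "on hull"
    elif d < THRESHOLD:
        label = "within 50 meV/atom"
    else:
        label = "above threshold"
    print(f"x={x:.2f}  E_f={e:+.3f} eV/atom  E_hull_dist={d * 1000:.0f} meV/atom  {label}")
```

For ternary and higher-order systems the same idea applies in more dimensions, and a general-purpose hull routine or a dedicated phase-diagram library would replace the hand-written monotone chain.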