Combining feature-based approaches with graph neural networks and symbolic regression for synergistic performance and interpretability
This work addresses the need for high-performance and interpretable models in materials informatics, offering a tool that combines deep learning power with chemical transparency for more targeted materials discovery.
The study introduces MatterVial, a hybrid framework that integrates graph neural network representations with symbolic regression features to improve both predictive performance and interpretability in materials science machine learning. It achieves error reductions and accuracy increases exceeding 40% on multiple tasks, making it competitive with or superior to state-of-the-art methods.
This study introduces MatterVial, a hybrid framework for feature-based machine learning in materials science. MatterVial expands the feature space by integrating latent representations from a diverse suite of pretrained graph neural network (GNN) models, including structure-based (MEGNet), composition-based (ROOST), and equivariant (ORB) graph networks, together with computationally efficient, GNN-approximated descriptors and novel features from symbolic regression. The approach combines the chemical transparency of traditional feature-based models with the predictive power of deep learning architectures. When used to augment the feature-based model MODNet on Matbench tasks, it yields significant error reductions, with accuracy increases exceeding 40% on multiple tasks, and elevates MODNet's performance to be competitive with, and in several cases superior to, state-of-the-art end-to-end GNNs. An integrated interpretability module, employing surrogate models and symbolic regression, decodes the latent GNN-derived descriptors into explicit, physically meaningful formulas. This unified framework advances materials informatics by providing a high-performance, transparent tool that aligns with the principles of explainable AI, paving the way for more targeted and autonomous materials discovery.
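The two ideas described above, augmenting a descriptor table with latent features and then decoding a latent feature into an explicit formula, can be illustrated with a toy sketch. It does not use MatterVial, MODNet, MEGNet, ROOST, or ORB. The data are synthetic, the "latent" column is a made-up stand-in for a GNN-derived feature, ridge regression stands in for the downstream feature-based model, and a correlation search over a small hand-written candidate library stands in for symbolic regression. All function and variable names are hypothetical.

```python
import numpy as np


def augment_features(base, *latent_blocks):
    """Append latent feature blocks column-wise to the base descriptors."""
    return np.hstack([base, *latent_blocks])


def fit_ridge(X, y, alpha=1e-3):
    """Closed-form ridge regression; the last weight is the intercept."""
    Xb = np.hstack([X, np.ones((X.shape[0], 1))])
    A = Xb.T @ Xb + alpha * np.eye(Xb.shape[1])
    return np.linalg.solve(A, Xb.T @ y)


def predict(X, w):
    Xb = np.hstack([X, np.ones((X.shape[0], 1))])
    return Xb @ w


def mae(y_true, y_pred):
    return float(np.mean(np.abs(y_true - y_pred)))


rng = np.random.default_rng(0)
n = 200

# Synthetic stand-in for hand-crafted descriptors (three columns: x0, x1, x2).
base = rng.normal(size=(n, 3))

# Assumption for illustration: the "latent" feature encodes an interaction
# x0 * x1 that a linear model on the base descriptors cannot represent.
latent = (base[:, 0] * base[:, 1] + 0.01 * rng.normal(size=n)).reshape(-1, 1)

# Synthetic target that depends on both the descriptors and the latent feature.
y = base @ np.array([1.0, -2.0, 0.5]) + 3.0 * latent[:, 0] + 0.05 * rng.normal(size=n)

train, test = slice(0, 150), slice(150, None)

# Baseline: descriptors only.
w_base = fit_ridge(base[train], y[train])
mae_base = mae(y[test], predict(base[test], w_base))

# Augmented: descriptors plus the latent feature.
X_aug = augment_features(base, latent)
w_aug = fit_ridge(X_aug[train], y[train])
mae_aug = mae(y[test], predict(X_aug[test], w_aug))

# Toy "decoding" step: pick the candidate formula that correlates most
# strongly with the latent feature. Real symbolic regression searches a far
# larger expression space; this library is hand-written for the example.
candidates = {
    "x0": base[:, 0],
    "x1": base[:, 1],
    "x2": base[:, 2],
    "x0*x1": base[:, 0] * base[:, 1],
    "x0*x2": base[:, 0] * base[:, 2],
    "x1**2": base[:, 1] ** 2,
}
scores = {
    name: abs(np.corrcoef(values, latent[:, 0])[0, 1])
    for name, values in candidates.items()
}
best_formula = max(scores, key=scores.get)

print(f"MAE (descriptors only): {mae_base:.3f}")
print(f"MAE (augmented):        {mae_aug:.3f}")
print(f"Best formula for latent feature: {best_formula}")
```

Because the target was constructed to depend on the latent column, the augmented model has the lower test error here by design; the sketch shows the mechanics of the workflow, not evidence for the reported Matbench results.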