Representation of Inorganic Synthesis Reactions and Prediction: Graphical Framework and Datasets
This work addresses the challenge of determining how to synthesize predicted inorganic materials, a bottleneck in materials science, though the improvements it reports are incremental.
The paper tackles the problem of predicting full synthesis pathways for inorganic materials by introducing the ActionGraph framework, which encodes the chemical and procedural structure of a synthesis reaction as a directed acyclic graph. Using 13,017 text-mined reactions, the authors show that incorporating PCA-reduced ActionGraph adjacency matrices into a k-nearest neighbors model improves operation length matching accuracy from 15.8% to 53.3%, though precursor and operation F1 scores increase only modestly, by 1.34% and 2.76%, respectively.
While machine learning has enabled the rapid prediction of inorganic materials with novel properties, the challenge of determining how to synthesize these materials remains largely unsolved. Previous work has largely focused on predicting precursors or reaction conditions, and only rarely on full synthesis pathways. We introduce the ActionGraph, a directed acyclic graph framework that encodes both the chemical structure and the procedural structure (expressed in terms of synthesis operations) of inorganic synthesis reactions. Using 13,017 text-mined solid-state synthesis reactions from the Materials Project, we show that incorporating PCA-reduced ActionGraph adjacency matrices into a $k$-nearest neighbors retrieval model significantly improves synthesis pathway prediction. While the ActionGraph framework yields increases of only 1.34% and 2.76% in precursor and operation F1 scores, respectively (averaged over varying numbers of PCA components), the operation length matching accuracy rises 3.4-fold (from 15.8% to 53.3%). We observe an interesting trade-off: precursor prediction performance peaks at 10-11 PCA components, while operation prediction continues improving up to 30 components. This suggests that composition information dominates precursor selection, while structural information is critical for operation sequencing. Overall, the ActionGraph framework demonstrates strong potential, and with further adoption, its full range of benefits can be effectively realized.
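The mechanics described in the abstract (graph adjacency matrices, PCA reduction, nearest-neighbor retrieval) can be illustrated with a minimal sketch. This is not the authors' implementation: the node vocabulary, the three toy reactions, the helper names (`adjacency`, `pca_fit`, `pca_transform`, `knn_indices`), and the use of plain Euclidean distance are all assumptions made for illustration, and the sketch does not show how the paper combines graph features with composition features. It uses only NumPy.

```python
import numpy as np

# Hypothetical node vocabulary: precursors, operations, and a product node
# share one index space. None of these names come from the paper's dataset.
NODES = ["BaCO3", "TiO2", "mix", "heat", "product"]
IDX = {name: i for i, name in enumerate(NODES)}


def adjacency(edges):
    """Flattened adjacency matrix of a DAG given as (source, target) pairs."""
    a = np.zeros((len(NODES), len(NODES)))
    for src, dst in edges:
        a[IDX[src], IDX[dst]] = 1.0
    return a.ravel()


def pca_fit(x, n_components):
    """Fit PCA via SVD; return the feature mean and the top principal axes."""
    mean = x.mean(axis=0)
    _, _, vt = np.linalg.svd(x - mean, full_matrices=False)
    return mean, vt[:n_components]


def pca_transform(x, mean, axes):
    """Project rows of x onto the principal axes."""
    return (x - mean) @ axes.T


def knn_indices(query, reference, k):
    """Indices of the k reference rows closest to query (Euclidean distance)."""
    dist = np.linalg.norm(reference - query, axis=1)
    return np.argsort(dist, kind="stable")[:k]


# Three toy "known" reactions, each a small DAG (assumed data).
known_graphs = np.stack([
    adjacency([("BaCO3", "mix"), ("TiO2", "mix"), ("mix", "heat"), ("heat", "product")]),
    adjacency([("BaCO3", "heat"), ("heat", "product")]),
    adjacency([("TiO2", "mix"), ("mix", "product")]),
])
known_operations = [["mix", "heat"], ["heat"], ["mix"]]

# Reduce the 25-dimensional flattened matrices to 2 PCA components.
mean, axes = pca_fit(known_graphs, n_components=2)
known_reduced = pca_transform(known_graphs, mean, axes)

# Query graph: the first reaction with one precursor edge removed.
query = adjacency([("BaCO3", "mix"), ("mix", "heat"), ("heat", "product")])
query_reduced = pca_transform(query[None, :], mean, axes)[0]

# Retrieve the two nearest stored reactions and reuse the closest one's operations.
neighbors = knn_indices(query_reduced, known_reduced, k=2).tolist()
predicted_operations = known_operations[neighbors[0]]
print(neighbors, predicted_operations)
```

In this toy setting the number of PCA components plays the role of the 10-11 versus 30 component trade-off mentioned in the abstract: fewer components discard more of the graph structure before retrieval.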