MTRL-SCI Dec 2, 2025
Representation of Inorganic Synthesis Reactions and Prediction: Graphical Framework and Datasets
Samuel Andrello, Daniel Alabi, Simon J. L. Billinge
While machine learning has enabled the rapid prediction of inorganic materials with novel properties, the challenge of determining how to synthesize these materials remains largely unsolved. Previous work has largely focused on predicting precursors or reaction conditions, but only rarely on full synthesis pathways. We introduce the ActionGraph, a directed acyclic graph framework that encodes both the chemical and procedural structure, in terms of synthesis operations, of inorganic synthesis reactions. Using 13,017 text-mined solid-state synthesis reactions from the Materials Project, we show that incorporating PCA-reduced ActionGraph adjacency matrices into a $k$-nearest neighbors retrieval model significantly improves synthesis pathway prediction. While the ActionGraph framework only results in a 1.34% and 2.76% increase in precursor and operation F1 scores (average over varying numbers of PCA components) respectively, the operation length matching accuracy rises 3.4 times (from 15.8% to 53.3%). We observe an interesting trade-off where precursor prediction performance peaks at 10-11 PCA components while operation prediction continues improving up to 30 components. This suggests composition information dominates precursor selection while structural information is critical for operation sequencing. Overall, the ActionGraph framework demonstrates strong potential, and with further adoption, its full range of benefits can be effectively realized.
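The abstract above reports F1 scores and an operation length matching accuracy without defining them here. The sketch below shows one plausible reading, assuming a set-based F1 between predicted and reference labels and a length match that only compares the number of operations. The function names and the toy operation sequences are hypothetical and are not taken from the paper or its dataset.

```python
def set_f1(predicted, actual):
    """F1 score between two label collections, each treated as a set (an assumption)."""
    pred, act = set(predicted), set(actual)
    if not pred and not act:
        return 1.0
    overlap = len(pred & act)
    if overlap == 0:
        return 0.0
    precision = overlap / len(pred)
    recall = overlap / len(act)
    return 2 * precision * recall / (precision + recall)


def length_match_accuracy(predicted_seqs, actual_seqs):
    """Fraction of predictions whose operation count equals the reference count."""
    matches = sum(len(p) == len(a) for p, a in zip(predicted_seqs, actual_seqs))
    return matches / len(actual_seqs)


# Hypothetical toy sequences, for illustration only.
predicted_ops = [["mix", "heat"], ["mix", "grind", "heat"]]
actual_ops = [["mix", "heat", "cool"], ["mix", "grind", "heat"]]

f1_scores = [set_f1(p, a) for p, a in zip(predicted_ops, actual_ops)]
mean_f1 = sum(f1_scores) / len(f1_scores)
length_acc = length_match_accuracy(predicted_ops, actual_ops)

# Ratio of the two accuracies quoted in the abstract (15.8% and 53.3%).
reported_ratio = 53.3 / 15.8
```

Under these assumptions, a prediction can score well on F1 while still missing the length match, which is consistent with the abstract reporting small F1 gains alongside a much larger gain in length matching.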
MTRL-SCI Oct 22, 2024
Interpretable Multimodal Machine Learning Analysis of X-ray Absorption Near-Edge Spectra and Pair Distribution Functions
Tanaporn Na Narong, Zoe N. Zachko, Steven B. Torrisi et al.
We used interpretable machine learning to combine information from multiple heterogeneous spectra: X-ray absorption near-edge spectra (XANES) and atomic pair distribution functions (PDFs) to extract local structural and chemical environments of transition metal cations in oxides. Random forest models were trained on simulated XANES, PDF, and both combined to extract oxidation state, coordination number, and mean nearest-neighbor bond length. XANES-only models generally outperformed PDF-only models, even for structural tasks, although using the metal's differential PDFs (dPDFs) instead of total PDFs narrowed this gap. When combined with PDFs, information from XANES often dominates the prediction. Our results demonstrate that XANES contain rich structural information and highlight the utility of species-specificity. This interpretable, multimodal approach is quick to implement with suitable databases and offers valuable insights into the relative strengths of different modalities, guiding researchers in experiment design and identifying when combining complementary techniques adds meaningful information to a scientific investigation.
LG Sep 1, 2025
CbLDM: A Diffusion Model for recovering nanostructure from pair distribution function
Jiarui Cao, Zhiyang Zhang, Heming Wang et al.
The nanostructure inverse problem is an attractive problem whose solution helps researchers understand the relationship between the properties and the structure of nanomaterials. This article focuses on recovering the nanostructure from the pair distribution function (PDF), which it treats as a conditional generation problem. It proposes a deep learning model, CbLDM (Condition-based Latent Diffusion Model). Building on the original latent diffusion model, CbLDM reduces the number of sampling steps and improves sample generation efficiency by using the conditional prior to estimate the conditional posterior distribution, an approximation of p(z|x). In addition, the article uses the Laplacian matrix instead of the distance matrix to recover the nanostructure, which can reduce the reconstruction error. Finally, the article compares CbLDM with existing models for the nanostructure inverse problem and finds that CbLDM achieves significantly higher prediction accuracy, which reflects its ability to solve the nanostructure inverse problem and its potential to handle other continuous conditional generation tasks.
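To make the distance matrix versus Laplacian matrix distinction concrete, here is a minimal sketch. It assumes the standard unweighted graph Laplacian L = D - A, with bonds defined by a distance cutoff. The abstract does not say how CbLDM builds its Laplacian, so the cutoff rule, the function names, and the toy coordinates are all illustrative assumptions.

```python
import math


def distance_matrix(coords):
    """Pairwise Euclidean distances between atomic positions."""
    n = len(coords)
    return [[math.dist(coords[i], coords[j]) for j in range(n)] for i in range(n)]


def graph_laplacian(dist, cutoff):
    """Unweighted graph Laplacian L = D - A, bonding atoms closer than `cutoff`.

    The cutoff rule is an assumption made for this sketch.
    """
    n = len(dist)
    adj = [
        [1 if i != j and dist[i][j] <= cutoff else 0 for j in range(n)]
        for i in range(n)
    ]
    lap = [[-adj[i][j] for j in range(n)] for i in range(n)]
    for i in range(n):
        lap[i][i] = sum(adj[i])  # degree of atom i on the diagonal
    return lap


# Hypothetical four-atom cluster: three close atoms and one far away.
coords = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (0.0, 1.0, 0.0), (5.0, 5.0, 5.0)]
dist = distance_matrix(coords)
lap = graph_laplacian(dist, cutoff=1.5)
```

Every row of this Laplacian sums to zero, and the isolated atom yields an all-zero row, so the matrix encodes connectivity rather than raw separations.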
MTRL-SCI Oct 22, 2020
Validation of non-negative matrix factorization for assessment of atomic pair-distribution function (PDF) data in a real-time streaming context
Chia-Hao Liu, Christopher J. Wright, Ran Gu et al.
We validate the use of matrix factorization for the automatic identification of relevant components from atomic pair distribution function (PDF) data. We also present a newly developed software infrastructure for analyzing PDF data arriving in a streaming manner. We then apply two matrix factorization techniques, Principal Component Analysis (PCA) and Non-negative Matrix Factorization (NMF), to study simulated and experimental datasets in the context of in situ experiments.
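As an illustration of what NMF does to a stack of signals, the sketch below factorizes a small non-negative matrix X into W (per-frame weights) and H (components) using the classic multiplicative update rules for the Frobenius objective. It is a pure-Python toy, not the software infrastructure described in the abstract. The helper names, the two synthetic components, and the mixing weights are all hypothetical.

```python
import random


def matmul(a, b):
    """Matrix product of two lists-of-rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def transpose(m):
    return [list(col) for col in zip(*m)]


def frobenius_error(x, w, h):
    """Frobenius norm of X - W H."""
    wh = matmul(w, h)
    return sum((xv - rv) ** 2 for xr, rr in zip(x, wh) for xv, rv in zip(xr, rr)) ** 0.5


def nmf(x, rank, n_iter=200, seed=0, eps=1e-12):
    """Multiplicative-update NMF: X (n x m) ~ W (n x rank) H (rank x m)."""
    rng = random.Random(seed)
    n, m = len(x), len(x[0])
    w = [[rng.random() + 0.1 for _ in range(rank)] for _ in range(n)]
    h = [[rng.random() + 0.1 for _ in range(m)] for _ in range(rank)]
    for _ in range(n_iter):
        # H <- H * (W^T X) / (W^T W H), elementwise
        wt = transpose(w)
        num = matmul(wt, x)
        den = matmul(matmul(wt, w), h)
        h = [[h[i][j] * num[i][j] / (den[i][j] + eps) for j in range(m)]
             for i in range(rank)]
        # W <- W * (X H^T) / (W H H^T), elementwise
        ht = transpose(h)
        num = matmul(x, ht)
        den = matmul(w, matmul(h, ht))
        w = [[w[i][j] * num[i][j] / (den[i][j] + eps) for j in range(rank)]
             for i in range(n)]
    return w, h


# Hypothetical non-negative components and a linear ramp between them,
# mimicking frames collected while one phase converts into another.
components = [[1.0, 0.0, 2.0, 0.0], [0.0, 3.0, 0.0, 1.0]]
weights = [[1.0, 0.0], [0.75, 0.25], [0.5, 0.5], [0.25, 0.75], [0.0, 1.0]]
x = matmul(weights, components)

w, h = nmf(x, rank=2)
err = frobenius_error(x, w, h)
```

Unlike PCA components, the factors returned here cannot go negative, which is the property that makes NMF components easier to read as physical signals. Real data would need to be non-negative (or shifted to be so) before this sketch applies.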
MTRL-SCI Feb 13, 2014
xPDFsuite: an end-to-end software solution for high throughput pair distribution function transformation, visualization and analysis
Xiaohao Yang, Pavol Juhas, Christopher L. Farrow et al.
The xPDFsuite software program is described. It processes and analyzes atomic pair distribution functions (PDFs) from X-ray powder diffraction data. It provides a convenient GUI for SrXplanar and PDFgetX3, allowing users to easily obtain 1D diffraction patterns from raw 2D diffraction images and then transform them to PDFs. It also bundles PDFgui, which allows users to create structure models and fit them to the experimental data. It is especially useful for working with large numbers of datasets, such as those from high-throughput measurements. Some of the key features are: real-time PDF transformation and plotting; 2D waterfall, false color heatmap, and 3D contour plotting for multiple datasets; static and dynamic mask editing; geometric calibration of powder diffraction images; saving and loading of configurations and projects; Pearson correlation analysis on selected datasets; and a Python implementation that supports multiple platforms.
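The Pearson correlation analysis listed among the features can be illustrated with a short standalone sketch. This is not xPDFsuite code and does not use its API. The `pearson` function and the toy G(r) arrays are hypothetical, and the curves are assumed to share the same r-grid.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(x) != len(y):
        raise ValueError("sequences must have the same length")
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    dx = [a - mean_x for a in x]
    dy = [b - mean_y for b in y]
    num = sum(a * b for a, b in zip(dx, dy))
    den = math.sqrt(sum(a * a for a in dx) * sum(b * b for b in dy))
    if den == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return num / den


# Hypothetical PDFs sampled on a common r-grid.
g_ref = [0.0, 1.0, 0.5, -0.5, 0.2]
g_scaled = [2 * v + 1 for v in g_ref]  # same shape, different scale and offset
g_flipped = [-v for v in g_ref]        # inverted signal

r_same = pearson(g_ref, g_scaled)
r_opposite = pearson(g_ref, g_flipped)
```

Because the coefficient ignores overall scale and offset, it is a convenient way to flag which datasets in a series share the same underlying signal shape.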