LG · Aug 20, 2022
MLExchange: A web-based platform enabling exchangeable machine learning workflows for scientific studies
Zhuowen Zhao, Tanny Chavez, Elizabeth A. Holman et al.
Machine learning (ML) algorithms are increasingly helping scientific communities across disciplines and institutions address large and diverse data problems. However, many available ML tools are programmatically demanding and computationally costly. The MLExchange project aims to build a collaborative platform equipped with enabling tools that allow scientists and facility users without a deep ML background to use ML and computational resources in scientific discovery. At a high level, we are targeting a full user experience in which managing and exchanging ML algorithms, workflows, and data is readily available through web applications. Since each component is an independent container, the whole platform or its individual services can be easily deployed on servers of different scales, ranging from a personal device (laptop, smartphone, etc.) to high-performance computing (HPC) clusters accessed simultaneously by many users. Thus, MLExchange supports flexible usage scenarios: users can either access the services and resources from a remote server or run the whole platform or its individual services within their local network.
LG · May 1
Machine Learning-Augmented Acceleration of Iterative Ptychographic Reconstruction
Bowen Zheng, Katayun Kamdin, David Shapiro et al.
Iterative ptychographic reconstruction algorithms are widely used for coherent diffractive imaging but can exhibit slow convergence under realistic experimental conditions. We propose a machine learning-augmented approach that accelerates iterative ptychographic reconstruction by introducing a learned fast-forward operator applied during reconstruction. Following an initial warm-up using standard iterations, the fast-forward operator advances the reconstruction toward a more converged state, after which conventional iterative updates are resumed. This strategy preserves the physical consistency and flexibility of established ptychographic solvers while reducing the number of iterations required for convergence. The model is trained on diverse ptychographic datasets and evaluated on experimental data acquired in a different year, demonstrating robustness and temporal generalization. Compared with conventional iterative solvers, the machine learning-augmented method achieves comparable reconstruction quality while converging faster in terms of Poisson negative log-likelihood, yielding over a two-fold reduction in wall-clock time. The approach has been integrated into an existing reconstruction pipeline and deployed in production at a synchrotron beamline, demonstrating practicality for real-time experimental operation.
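The warm-up, fast-forward, resume schedule described in this abstract can be sketched as plain control flow. This is a minimal illustration under stated assumptions, not the authors' implementation: `reconstruct_with_fast_forward`, `toy_iterate`, and `toy_fast_forward` are hypothetical names, the "solver" is a toy fixed-point update rather than a ptychographic algorithm, and the fast-forward step is a hand-written stand-in for a trained model.

```python
from typing import Callable, List, Tuple

import numpy as np


def reconstruct_with_fast_forward(
    x0: np.ndarray,
    iterate: Callable[[np.ndarray], np.ndarray],
    fast_forward: Callable[[np.ndarray], np.ndarray],
    n_warmup: int,
    n_refine: int,
) -> Tuple[np.ndarray, List[str]]:
    """Run warm-up iterations, apply one fast-forward jump, then resume iterating."""
    x = x0.copy()
    stages: List[str] = []
    for _ in range(n_warmup):
        x = iterate(x)
        stages.append("iterate")
    # Single application of the (assumed) learned operator.
    x = fast_forward(x)
    stages.append("fast_forward")
    for _ in range(n_refine):
        x = iterate(x)
        stages.append("iterate")
    return x, stages


# Toy problem (hypothetical): converge to a known target vector.
target = np.array([1.0, -2.0, 3.0])


def toy_iterate(x: np.ndarray) -> np.ndarray:
    # Slow update: closes 10% of the remaining distance to the target.
    return x + 0.1 * (target - x)


def toy_fast_forward(x: np.ndarray) -> np.ndarray:
    # Stand-in for a trained model: closes 80% of the remaining distance.
    return x + 0.8 * (target - x)


x_ff, stages = reconstruct_with_fast_forward(
    np.zeros(3), toy_iterate, toy_fast_forward, n_warmup=5, n_refine=5
)

# Baseline with the same number of operator applications (11), all standard.
x_plain = np.zeros(3)
for _ in range(11):
    x_plain = toy_iterate(x_plain)

err_ff = np.linalg.norm(x_ff - target)
err_plain = np.linalg.norm(x_plain - target)
```

In this toy setting the fast-forwarded run ends closer to the target than the plain run for the same number of steps. That only illustrates the schedule; it says nothing about the reconstruction quality or timing reported in the abstract.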
IV · Jul 29, 2022
Artifact Identification in X-ray Diffraction Data using Machine Learning Methods
Howard Yanxon, James Weng, Hannah Parraga et al.
The in situ synchrotron high-energy X-ray powder diffraction (XRD) technique is widely used by researchers to analyze the crystallographic structures of materials in functional devices (e.g., battery materials) or in complex sample environments (e.g., diamond anvil cells or synthesis reactors). The atomic structure of a material can be identified from its diffraction pattern, together with detailed analysis such as Rietveld refinement, which indicates how the measured structure deviates from the ideal structure (e.g., through internal stresses or defects). For in situ experiments, a series of XRD images is usually collected on the same sample under different conditions (e.g., adiabatic conditions), yielding different states of matter, or simply collected continuously as a function of time to track the change of a sample over a chemical or physical process. In situ experiments are usually performed with area detectors, which collect 2D images composed of diffraction rings for ideal powders. Depending on the material's form, a realistic sample and its environment may produce features other than the typical Debye-Scherrer rings in the 2D XRD image, such as textures or preferred orientations and single-crystal diffraction spots. In this work, we present an investigation of machine learning methods for fast and reliable identification and separation of the single-crystal diffraction spots in XRD images. Excluding these artifacts during XRD image integration allows a precise analysis of the powder diffraction rings of interest. We observe that the gradient boosting method consistently produces high-accuracy results when it is trained with small subsets of highly diverse datasets. The method dramatically decreases the time spent identifying and separating single-crystal spots compared with the conventional method.
COMP-PH · Apr 20
Nonuniform Iterative Phasing Framework and Sampling Requirements for 3D Dynamical Inversion from Coherent Surface Scattering Imaging
Jeffrey J. Donatelli, Miaoqi Chu, Zixi Hu et al.
Coherent surface scattering imaging (CSSI) is an emerging experimental technique uniquely suited to probing the structure of thin nanostructures. In these experiments, a specimen is placed on a substrate, and a series of X-ray diffraction patterns is collected at grazing incidence angles as the specimen is rotated. However, reconstructing the specimen's 3D structure from the data is challenging due to dynamical scattering effects induced by the experimental geometry and the lack of direct phase measurements. Specifically, the data involves nonuniformly sampled Fourier-transform values of the specimen density, and failure to effectively address this nonuniformity can lead to errors or degraded performance. Here we introduce a mathematical inversion framework that combines iterative-projection-based phasing techniques with new fast nonuniform Fourier inversion methods to efficiently reconstruct isolated 3D structures from their CSSI rotation-series data. We also analyze the theoretical properties of CSSI reconstruction to derive requirements on experimental parameters and characterize solution uniqueness. We validate our approach using CSSI data simulated from a conical Siemens star and a porous medium, demonstrating that high-resolution 3D structures can be reconstructed even in the presence of significant dynamical scattering, from data collected at as few as one or two incident angles. More broadly, the presented nonuniform reconstruction framework provides a foundation for solving challenging generalizations of the phase problem in which measurements involve nonlinear combinations of nonuniformly sampled Fourier values.
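To make "nonuniformly sampled Fourier-transform values" concrete, the sketch below directly evaluates a 1D Fourier sum at arbitrary frequencies. It is an illustration only: it uses a slow direct sum with O(NM) cost rather than the fast nonuniform inversion methods the abstract refers to, it works in 1D rather than 3D, and `nudft_1d` and all variable names are assumptions introduced here.

```python
import numpy as np


def nudft_1d(
    density: np.ndarray, positions: np.ndarray, frequencies: np.ndarray
) -> np.ndarray:
    """Directly evaluate F(q_k) = sum_j density[j] * exp(-1j * q_k * positions[j])."""
    phase = np.exp(-1j * np.outer(frequencies, positions))
    return phase @ density


rng = np.random.default_rng(0)
n = 16
density = rng.random(n)                # toy 1D "specimen density"
positions = np.arange(n, dtype=float)  # sample positions on a unit grid

# On the uniform frequency grid q_k = 2*pi*k/n the sum reduces to the usual DFT.
uniform_q = 2.0 * np.pi * np.arange(n) / n
uniform_values = nudft_1d(density, positions, uniform_q)

# Off-grid frequencies: the same sum still applies, but an FFT no longer does.
nonuniform_q = np.sort(rng.uniform(0.0, 2.0 * np.pi, size=n))
nonuniform_values = nudft_1d(density, positions, nonuniform_q)
```

On the uniform grid the result agrees with `np.fft.fft(density)`. For off-grid frequencies no such shortcut exists in the direct approach, which is the gap that fast nonuniform methods are designed to close.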
LG · Dec 7, 2023
Rapid detection of rare events from in situ X-ray diffraction data using machine learning
Weijian Zheng, Jun-Sang Park, Peter Kenesei et al.
High-energy X-ray diffraction methods can non-destructively map the 3D microstructure and associated attributes of metallic polycrystalline engineering materials in their bulk form. These methods are often combined with external stimuli such as thermo-mechanical loading to take snapshots over time of the evolving microstructure and attributes. However, the extreme data volumes and the high costs of traditional data acquisition and reduction approaches pose a barrier to quickly extracting actionable insights and improving the temporal resolution of these snapshots. Here we present a fully automated technique capable of rapidly detecting the onset of plasticity in high-energy X-ray microscopy data. Our technique is at least 50 times faster computationally than traditional approaches and works for data sets that are up to 9 times sparser than a full data set. The technique leverages self-supervised image representation learning and clustering to transform massive data into compact, semantically rich representations of visually salient characteristics (e.g., peak shapes). These characteristics can serve as a rapid indicator of anomalous events such as changes in diffraction peak shapes. We anticipate that this technique will provide just-in-time actionable information to drive smarter experiments that effectively deploy multi-modal X-ray diffraction methods spanning many decades of length scales.
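One way to picture the "representation learning plus clustering" idea is to cluster embeddings from a reference state and score new embeddings by their distance to the nearest cluster center. The sketch below assumes this particular scoring rule, which the abstract does not specify. It also replaces the self-supervised encoder with synthetic Gaussian vectors and uses a small hand-written k-means; `kmeans`, `anomaly_score`, and all parameters are hypothetical.

```python
import numpy as np


def kmeans(points: np.ndarray, k: int, n_iter: int = 20, seed: int = 0) -> np.ndarray:
    """Plain Lloyd's k-means; returns the (k, dim) array of cluster centers."""
    rng = np.random.default_rng(seed)
    centers = points[rng.choice(len(points), size=k, replace=False)].copy()
    for _ in range(n_iter):
        dists = np.linalg.norm(points[:, None, :] - centers[None, :, :], axis=2)
        labels = dists.argmin(axis=1)
        for j in range(k):
            members = points[labels == j]
            if len(members) > 0:  # keep the old center if a cluster empties
                centers[j] = members.mean(axis=0)
    return centers


def anomaly_score(embeddings: np.ndarray, centers: np.ndarray) -> np.ndarray:
    """Distance from each embedding to its nearest cluster center."""
    dists = np.linalg.norm(embeddings[:, None, :] - centers[None, :, :], axis=2)
    return dists.min(axis=1)


rng = np.random.default_rng(0)
# Synthetic stand-ins for encoder outputs (assumption: 8-dimensional embeddings).
baseline = rng.normal(0.0, 1.0, size=(200, 8))  # reference-state peak patches
shifted = rng.normal(10.0, 1.0, size=(20, 8))   # patches after a change in peak shape

centers = kmeans(baseline, k=4)
threshold = anomaly_score(baseline, centers).max()
flags = anomaly_score(shifted, centers) > threshold
```

Here every shifted embedding exceeds the largest baseline score, so all are flagged. With real data, both the encoder and the threshold would need to be chosen and validated against the experiment.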
LG · Sep 29, 2025
Towards generalizable deep ptychography neural networks
Albert Vong, Steven Henke, Oliver Hoidn et al.
X-ray ptychography is a data-intensive imaging technique expected to become ubiquitous at next-generation light sources delivering many-fold increases in coherent flux. The need for real-time feedback under accelerated acquisition rates motivates surrogate reconstruction models such as deep neural networks, which offer orders-of-magnitude speedups over conventional methods. However, existing deep learning approaches lack robustness across diverse experimental conditions. We propose an unsupervised training workflow that emphasizes probe learning by combining experimentally measured probes with synthetic, procedurally generated objects. This probe-centric approach enables a single physics-informed neural network to reconstruct unseen experiments across multiple beamlines, among the first demonstrations of multi-probe generalization. We find that probe learning is as important as in-distribution learning: models trained using this synthetic workflow achieve reconstruction fidelity comparable to those trained exclusively on experimental data, even when the type of synthetic training object is changed. The proposed approach enables training of experiment-steering models that provide real-time feedback under dynamic experimental conditions.
LG · May 28, 2021
Bridging Data Center AI Systems with Edge Computing for Actionable Information Retrieval
Zhengchun Liu, Ahsan Ali, Peter Kenesei et al.
Extremely high data rates at modern synchrotron and X-ray free-electron laser light source beamlines motivate the use of machine learning methods for data reduction, feature detection, and other purposes. Regardless of the application, the basic concept is the same: data collected in the early stages of an experiment, data from past similar experiments, and/or data simulated for the upcoming experiment are used to train machine learning models that, in effect, learn specific characteristics of those data; these models are then used to process subsequent data more efficiently than general-purpose models that lack knowledge of the specific dataset or data class. Thus, a key challenge is to train models rapidly enough that they can be deployed and used within useful timescales. We describe here how specialized data center AI (DCAI) systems can be used for this purpose through a geographically distributed workflow. Experiments show that although using remote DCAI systems for DNN training incurs data movement costs and service overhead, the turnaround time is still less than 1/30 of that of a locally deployable GPU.