LG Sep 20, 2022
Deep learning at the edge enables real-time streaming ptychographic imaging
Anakha V Babu, Tao Zhou, Saugat Kandel et al.
Coherent microscopy techniques provide an unparalleled multi-scale view of materials across scientific and technological fields, from structural materials to quantum devices, from integrated circuits to biological cells. Driven by the construction of brighter sources and high-rate detectors, coherent X-ray microscopy methods like ptychography are poised to revolutionize nanoscale materials characterization. However, the associated increases in data volume and compute needs mean that conventional approaches no longer suffice for recovering sample images in real time from high-speed coherent imaging experiments. Here, we demonstrate a workflow that leverages artificial intelligence at the edge and high-performance computing to enable real-time inversion of X-ray ptychography data streamed directly from a detector at up to 2 kHz. The proposed AI-enabled workflow eliminates the sampling constraints imposed by traditional ptychography, allowing low-dose imaging using orders of magnitude less data than required by traditional methods.
GR Sep 2, 2025
Fidelity-preserving enhancement of ptychography with foundational text-to-image models
Ming Du, Volker Rose, Junjing Deng et al.
Ptychographic phase retrieval enables high-resolution imaging of complex samples but often suffers from artifacts such as grid pathology and multislice crosstalk, which degrade reconstructed images. We propose a plug-and-play (PnP) framework that integrates physics model-based phase retrieval with text-guided image editing using foundational diffusion models. By employing the alternating direction method of multipliers (ADMM), our approach ensures consensus between data fidelity and artifact removal subproblems, maintaining physics consistency while enhancing image quality. Artifact removal is achieved using a text-guided diffusion image editing method (LEDITS++) with a pre-trained foundational diffusion model, allowing users to specify artifacts for removal in natural language. Demonstrations on simulated and experimental datasets show significant improvements in artifact suppression and structural fidelity, validated by metrics such as peak signal-to-noise ratio (PSNR) and diffraction pattern consistency. This work highlights the combination of text-guided generative models and model-based phase retrieval algorithms as a transferable and fidelity-preserving method for high-quality diffraction imaging.
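The consensus structure of plug-and-play ADMM described in this abstract can be illustrated with a toy sketch. Everything below is an assumption made for illustration: the data-fidelity term is a simple quadratic (0.5 * ||x - y||^2) rather than a ptychographic forward model, and a fixed smoothing filter stands in for the text-guided diffusion editor (LEDITS++). The names `smooth_prior` and `pnp_admm` are hypothetical and do not come from the paper.

```python
import numpy as np


def smooth_prior(v):
    """Toy 'artifact removal' step: a [0.25, 0.5, 0.25] smoothing filter.

    Stand-in for the learned image-editing prior; NOT the paper's method.
    """
    kernel = np.array([0.25, 0.5, 0.25])
    return np.convolve(v, kernel, mode="same")


def pnp_admm(y, rho=2.0, n_iter=100, prior=smooth_prior):
    """Plug-and-play ADMM on a toy quadratic data term 0.5 * ||x - y||^2.

    x: data-fidelity variable, z: prior (artifact-removal) variable,
    u: scaled dual variable that drives x and z toward consensus.
    """
    x = y.copy()
    z = y.copy()
    u = np.zeros_like(y)
    for _ in range(n_iter):
        # Data-fidelity subproblem (closed form for the quadratic term):
        # argmin_x 0.5*||x - y||^2 + (rho/2)*||x - (z - u)||^2
        x = (y + rho * (z - u)) / (1.0 + rho)
        # Prior subproblem: the proximal step is replaced by the plugged-in prior.
        z = prior(x + u)
        # Dual update accumulates the remaining disagreement between x and z.
        u = u + x - z
    return x, z


rng = np.random.default_rng(0)
y = np.sin(np.linspace(0.0, 2.0 * np.pi, 64)) + 0.1 * rng.standard_normal(64)
x, z = pnp_admm(y)
print("max consensus gap:", np.max(np.abs(x - z)))
```

In a realistic setting the closed-form `x` update would be replaced by a few iterations of a model-based phase retrieval solver, and `prior` by the image editor; the three-step loop is the part that carries over.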
29.9 AI May 12
CVEvolve: Autonomous Algorithm Discovery for Unstructured Scientific Data Processing
Ming Du, Xiangyu Yin, Yanqi Luo et al.
Scientific data processing often requires task-specific algorithms or AI models, creating a barrier for domain scientists who need to analyze their data but may not have extensive computing or image-processing expertise. This barrier is especially pronounced when data are noisy, have a high dynamic range, are sparsely labeled, or are only loosely specified. We introduce CVEvolve, an autonomous agentic harness with a zero-code interface for scientific data-processing algorithm discovery. CVEvolve combines a multi-round search strategy with tools for code execution, evaluation implementation, history management, holdout testing, and optional inspection of scientific data and visual outputs. The search alternates between discovery and improvement actions, and uses lineage-aware stochastic candidate sampling to balance exploration and exploitation. We demonstrate CVEvolve on x-ray fluorescence microscopy image registration, Bragg peak detection, and high-energy diffraction microscopy image segmentation. Across these tasks, CVEvolve discovers algorithms that improve over baseline methods, while holdout test tracking helps identify candidates that generalize better than later over-optimized alternatives. These results show that zero-code, autonomous LLM-powered algorithm development can help domain scientists turn unstructured scientific image data into practical algorithms and downstream scientific discoveries.
AI Feb 17
EAA: Automating materials characterization with vision language model agents
Ming Du, Yanqi Luo, Srutarshi Banerjee et al.
We present Experiment Automation Agents (EAA), a vision-language-model-driven agentic system designed to automate complex experimental microscopy workflows. EAA integrates multimodal reasoning, tool-augmented action, and optional long-term memory to support both autonomous procedures and interactive user-guided measurements. Built on a flexible task-manager architecture, the system enables workflows ranging from fully agent-driven automation to logic-defined routines that embed localized LLM queries. EAA further provides a modern tool ecosystem with two-way compatibility for Model Context Protocol (MCP), allowing instrument-control tools to be consumed or served across applications. We demonstrate EAA at an imaging beamline at the Advanced Photon Source, including automated zone plate focusing, natural language-described feature search, and interactive data acquisition. These results illustrate how vision-capable agents can enhance beamline efficiency, reduce operational burden, and lower the expertise barrier for users.
APP-PH Apr 23, 2025
Demonstration of an AI-driven workflow for dynamic x-ray spectroscopy
Ming Du, Mark Wolfman, Chengjun Sun et al.
X-ray absorption near edge structure (XANES) spectroscopy is a powerful technique for characterizing the chemical state and symmetry of individual elements within materials, but it requires collecting data at many energy points, which can be time-consuming. While adaptive sampling methods exist for efficiently collecting spectroscopic data, they often lack domain-specific knowledge about the structure of XANES spectra. Here we demonstrate a knowledge-injected Bayesian optimization approach for adaptive XANES data collection that incorporates understanding of spectral features like absorption edges and pre-edge peaks. We show this method accurately reconstructs the absorption edge of XANES spectra using only 15-20% of the measurement points typically needed for conventional sampling, while determining the x-ray energy of the sharp peak after the absorption edge with errors less than 0.03 eV and the absorption edge with errors less than 0.1 eV, and keeping overall root-mean-square errors below 0.005 compared to traditionally sampled spectra. Our experiments on battery materials and catalysts demonstrate the method's effectiveness for both static and dynamic XANES measurements, improving data collection efficiency and enabling better time resolution for tracking chemical changes. This approach advances the degree of automation in XANES experiments, reducing the common errors of under- or over-sampling points near the absorption edge and enabling dynamic experiments that require high temporal resolution or have limited measurement time.
LG Jul 18, 2025
DONUT: Physics-aware Machine Learning for Real-time X-ray Nanodiffraction Analysis
Aileen Luo, Tao Zhou, Ming Du et al.
Coherent X-ray scattering techniques are critical for investigating the fundamental structural properties of materials at the nanoscale. While advancements have made these experiments more accessible, real-time analysis remains a significant bottleneck, often hindered by artifacts and computational demands. In scanning X-ray nanodiffraction microscopy, which is widely used to spatially resolve structural heterogeneities, this challenge is compounded by the convolution of the divergent beam with the sample's local structure. To address this, we introduce DONUT (Diffraction with Optics for Nanobeam by Unsupervised Training), a physics-aware neural network designed for the rapid and automated analysis of nanobeam diffraction data. By incorporating a differentiable geometric diffraction model directly into its architecture, DONUT learns to predict crystal lattice strain and orientation in real-time. Crucially, this is achieved without reliance on labeled datasets or pre-training, overcoming a fundamental limitation for supervised machine learning in X-ray science. We demonstrate experimentally that DONUT accurately extracts all features within the data over 200 times more efficiently than conventional fitting methods.
APP-PH Sep 28, 2021
AutoPhaseNN: Unsupervised Physics-aware Deep Learning of 3D Nanoscale Bragg Coherent Diffraction Imaging
Yudong Yao, Henry Chan, Subramanian Sankaranarayanan et al.
The problem of phase retrieval, or the algorithmic recovery of lost phase information from measured intensity alone, underlies various imaging methods from astronomy to nanoscale imaging. Traditional methods of phase retrieval are iterative in nature, and are therefore computationally expensive and time-consuming. More recently, deep learning (DL) models have been developed to either provide learned priors to iterative phase retrieval or, in some cases, completely replace phase retrieval with networks that learn to recover the lost phase information from measured intensity alone. However, such models require vast amounts of labeled data, which can only be obtained through simulation or by performing computationally prohibitive phase retrieval on hundreds or even thousands of experimental datasets. Using a 3D nanoscale X-ray imaging modality (Bragg Coherent Diffraction Imaging or BCDI) as a representative technique, we demonstrate AutoPhaseNN, a DL-based approach which learns to solve the phase problem without labeled data. By incorporating the physics of the imaging technique into the DL model during training, AutoPhaseNN learns to invert 3D BCDI data from reciprocal space to real space in a single shot without ever being shown real-space images. Once trained, AutoPhaseNN is about one hundred times faster than traditional iterative phase retrieval methods while providing comparable image quality.
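The idea of training without labeled real-space images can be sketched as a loss that compares only diffraction magnitudes. This is a minimal illustration under stated assumptions, not the AutoPhaseNN loss itself: the forward model is taken to be a plain 3D FFT, the loss is a mean squared difference of magnitudes, and the names `forward_magnitude` and `physics_loss` are hypothetical.

```python
import numpy as np


def forward_magnitude(obj):
    """Assumed forward model: far-field magnitude of a complex 3D object."""
    return np.abs(np.fft.fftn(obj, norm="ortho"))


def physics_loss(predicted_obj, measured_magnitude):
    """Mean squared mismatch between simulated and measured magnitudes.

    No real-space ground truth appears here; only the measurement does.
    """
    diff = forward_magnitude(predicted_obj) - measured_magnitude
    return float(np.mean(diff ** 2))


rng = np.random.default_rng(0)
shape = (8, 8, 8)
true_obj = rng.random(shape) * np.exp(1j * rng.uniform(-0.5, 0.5, shape))
measured = forward_magnitude(true_obj)

guess = true_obj + 0.1 * rng.standard_normal(shape)
print("loss at a perturbed guess:", physics_loss(guess, measured))
print("loss at the true object:  ", physics_loss(true_obj, measured))
```

In a training loop, `predicted_obj` would be the network output and the loss would be differentiated with an autodiff framework. Note that a magnitude-only loss cannot distinguish solutions that differ by a global phase factor, which is an inherent ambiguity of the measurement.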
IV Jun 16, 2020
Real-time 3D Nanoscale Coherent Imaging via Physics-aware Deep Learning
Henry Chan, Youssef S. G. Nashed, Saugat Kandel et al.
Phase retrieval, the problem of recovering lost phase information from measured intensity alone, is an inverse problem that is widely faced in various imaging modalities ranging from astronomy to nanoscale imaging. The current process of phase recovery is iterative in nature. As a result, the image formation is time-consuming and computationally expensive, precluding real-time imaging. Here, we use 3D nanoscale X-ray imaging as a representative example to develop a deep learning model to address this phase retrieval problem. We introduce 3D-CDI-NN, a deep convolutional neural network and differential programming framework trained to predict 3D structure and strain solely from input 3D X-ray coherent scattering data. Our networks are designed to be "physics-aware" in multiple aspects: the physics of the x-ray scattering process is explicitly enforced in the training of the network, and the training data are drawn from atomistic simulations that are representative of the physics of the material. We further refine the neural network prediction through a physics-based optimization procedure to enable maximum accuracy at lowest computational cost. 3D-CDI-NN can invert a 3D coherent diffraction pattern to real-space structure and strain hundreds of times faster than traditional iterative phase retrieval methods, with negligible loss in accuracy. Our integrated machine learning and differential programming solution to the phase retrieval problem is broadly applicable across inverse problems in other application areas.
IV Apr 15, 2020
Real-time sparse-sampled Ptychographic imaging through deep neural networks
Mathew J. Cherukara, Tao Zhou, Youssef Nashed et al.
Ptychography has rapidly grown in the fields of X-ray and electron imaging for its unprecedented ability to achieve nanoscale or atomic-scale resolution while simultaneously retrieving chemical or magnetic information from a sample. A ptychographic reconstruction is achieved by solving a complex inverse problem that imposes constraints both on the acquisition and on the analysis of the data, which typically precludes real-time imaging due to the computational cost involved in solving this inverse problem. In this work we propose PtychoNN, a novel approach to solve the ptychography reconstruction problem based on deep convolutional neural networks. We demonstrate how the proposed method can be used to predict real-space structure and phase at each scan point solely from the corresponding far-field diffraction data. The presented results demonstrate how PtychoNN can effectively be used on experimental data, being able to generate high-quality reconstructions of a sample up to hundreds of times faster than state-of-the-art ptychography reconstruction solutions once trained. By surpassing the typical constraints of iterative model-based methods, we can significantly relax the data acquisition sampling conditions and produce equally satisfactory reconstructions. Besides drastically accelerating acquisition and analysis, this capability can enable new imaging scenarios that were not possible before, in cases of dose-sensitive, dynamic and extremely voluminous samples.
CV Jun 7, 2018
Real-time coherent diffraction inversion using deep generative networks
Mathew J. Cherukara, Youssef S. G. Nashed, Ross J. Harder
Phase retrieval, or the process of recovering phase information in reciprocal space to reconstruct images from measured intensity alone, is the underlying basis of a variety of imaging applications including coherent diffraction imaging (CDI). Typical phase retrieval algorithms are iterative in nature and hence are time-consuming and computationally expensive, precluding real-time imaging. Furthermore, iterative phase retrieval algorithms struggle to converge to the correct solution, especially in the presence of strong phase structures. In this work, we demonstrate the training and testing of CDI NN, a pair of deep deconvolutional networks trained to predict structure and phase in real space of a 2D object from its corresponding far-field diffraction intensities alone. Once trained, CDI NN can invert a diffraction pattern to an image within a few milliseconds of compute time on a standard desktop machine, opening the door to real-time imaging.
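For context on the iterative baseline these abstracts refer to, the sketch below shows one classic iterative scheme, the error-reduction algorithm, on a 2D toy problem. It is an illustration under assumptions, not the specific baseline used in any of the listed papers: the object is assumed real and non-negative with a known support, and the function names are hypothetical.

```python
import numpy as np


def project_magnitude(x, magnitude):
    """Keep the current Fourier phases, impose the measured magnitudes."""
    spectrum = np.fft.fft2(x, norm="ortho")
    return np.fft.ifft2(magnitude * np.exp(1j * np.angle(spectrum)), norm="ortho")


def project_support(x, support):
    """Impose real-space constraints: real, non-negative, zero outside support."""
    return np.where(support, np.clip(x.real, 0.0, None), 0.0)


def error_reduction(magnitude, support, n_iter=200, seed=0):
    """Alternate between the Fourier-magnitude and real-space constraints."""
    rng = np.random.default_rng(seed)
    x = project_support(rng.random(magnitude.shape), support)
    errors = []
    for _ in range(n_iter):
        # Fourier-domain mismatch of the current estimate.
        spectrum_mag = np.abs(np.fft.fft2(x, norm="ortho"))
        errors.append(float(np.linalg.norm(spectrum_mag - magnitude)))
        x = project_support(project_magnitude(x, magnitude), support)
    return x, errors


# Toy measurement: magnitudes of a small non-negative object on a padded grid.
support = np.zeros((32, 32), dtype=bool)
support[12:20, 12:20] = True
obj = np.where(support, np.random.default_rng(1).random((32, 32)), 0.0)
magnitude = np.abs(np.fft.fft2(obj, norm="ortho"))

estimate, errors = error_reduction(magnitude, support)
print("Fourier error: first", errors[0], "last", errors[-1])
```

Each iteration costs two FFTs, and many iterations are typically needed, which is the cost that a single forward pass of a trained network avoids. The Fourier error of this scheme does not increase from one iteration to the next, but it can stagnate at a wrong solution, consistent with the convergence difficulties the abstract mentions.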