MTRL-SCI Jun 3, 2023
Machine learning enabled experimental design and parameter estimation for ultrafast spin dynamics
Zhantao Chen, Cheng Peng, Alexander N. Petsch et al.
Advanced experimental measurements are crucial for driving theoretical developments and unveiling novel phenomena in condensed matter and materials physics, but they often suffer from scarce facility resources and increasing complexity. To address these limitations, we introduce a methodology that combines machine learning with Bayesian optimal experimental design (BOED), exemplified by x-ray photon fluctuation spectroscopy (XPFS) measurements of spin fluctuations. Our method employs a neural network model for large-scale spin dynamics simulations, enabling precise distribution and utility calculations in BOED. The automatic differentiation capability of the neural network model is further leveraged for more robust and accurate parameter estimation. Our numerical benchmarks demonstrate the superior performance of our method in guiding XPFS experiments, predicting model parameters, and yielding more informative measurements within limited experimental time. Although focused on XPFS and spin fluctuations, our method can be adapted to other experiments, facilitating more efficient data collection and accelerating scientific discoveries.
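As a rough illustration of the utility calculation that BOED relies on, the sketch below estimates the expected information gain (EIG) of a measurement with nested Monte Carlo. Everything in it is an assumption made for illustration: the abstract does not name the utility it uses, the exponential-decay `signal` stands in for the neural network surrogate, and the prior range, noise level `sigma`, and candidate delays are hypothetical.

```python
import math
import random


def signal(theta, t):
    """Toy forward model (assumption): exponential decay with rate theta at delay t."""
    return math.exp(-theta * t)


def expected_information_gain(t, thetas, sigma, rng):
    """Nested Monte Carlo estimate of the EIG (in nats) for one measurement at delay t.

    `thetas` are samples from the prior; the same samples are reused for the
    inner average that approximates the evidence p(y | t).
    """
    total = 0.0
    for theta in thetas:
        # Simulate one noisy outcome for this prior sample.
        y = signal(theta, t) + rng.gauss(0.0, sigma)
        # Unnormalized Gaussian log-likelihood; the normalization cancels below.
        log_own = -((y - signal(theta, t)) ** 2) / (2 * sigma**2)
        evidence = sum(
            math.exp(-((y - signal(other, t)) ** 2) / (2 * sigma**2))
            for other in thetas
        ) / len(thetas)
        total += log_own - math.log(evidence)
    return total / len(thetas)


rng = random.Random(0)
thetas = [rng.uniform(0.5, 2.0) for _ in range(300)]  # hypothetical prior samples
designs = [0.0, 0.5, 1.0, 2.0, 4.0]                   # hypothetical candidate delays
eig = {t: expected_information_gain(t, thetas, 0.05, rng) for t in designs}
best = max(eig, key=eig.get)
```

A delay of zero carries no information in this toy model, because every parameter value predicts the same signal there, so the estimate favors an intermediate delay where the candidate decay curves are most distinguishable relative to the noise.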
LG Feb 14, 2023
SpeckleNN: A unified embedding for real-time speckle pattern classification in X-ray single-particle imaging with limited labeled examples
Cong Wang, Eric Florin, Hsing-Yin Chang et al.
With X-ray free-electron lasers (XFELs), it is possible to determine the three-dimensional structure of noncrystalline nanoscale particles using X-ray single-particle imaging (SPI) techniques at room temperature. Classifying SPI scattering patterns, or "speckles", to extract the single hits needed for real-time vetoing and three-dimensional reconstruction poses a challenge for high data rate facilities like European XFEL and LCLS-II-HE. Here, we introduce SpeckleNN, a unified embedding model for real-time speckle pattern classification with limited labeled examples that can scale linearly with dataset size. Trained with twin neural networks, SpeckleNN maps speckle patterns to a unified embedding vector space, where similarity is measured by Euclidean distance. We highlight its few-shot classification capability on previously unseen samples and its robust performance with only tens of labels per classification category, even in the presence of substantial missing detector areas. Without the need for excessive manual labeling or even a full detector image, our classification method is well suited to real-time high-throughput SPI experiments.
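The idea of classifying by Euclidean distance in an embedding space can be sketched in a few lines. This is only an illustration under stated assumptions: the 2-D vectors are made-up stand-ins for the output of a trained embedding model, the label names are hypothetical, and averaging the distance to each category's labeled examples is one simple decision rule, not necessarily the one SpeckleNN uses.

```python
import math


def euclidean(a, b):
    """Euclidean distance between two equal-length embedding vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def classify(query, support):
    """Return the label whose labeled embeddings are closest to `query` on average.

    `support` maps each label to its few labeled embedding vectors.
    """
    def mean_distance(label):
        vectors = support[label]
        return sum(euclidean(query, v) for v in vectors) / len(vectors)

    return min(support, key=mean_distance)


# Hypothetical embeddings; a real model would compute these from speckle images.
support = {
    "single-hit": [(0.9, 0.1), (1.0, 0.0)],
    "multi-hit": [(0.0, 1.0), (0.1, 0.9)],
}
label = classify((0.8, 0.2), support)
```

Because the decision depends only on distances to a handful of labeled vectors, adding a new category means adding a few labeled embeddings rather than retraining a classifier head.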
INS-DET Mar 24, 2023
PeakNet: An Autonomous Bragg Peak Finder with Deep Neural Networks
Cong Wang, Po-Nan Li, Jana Thayer et al.
Serial crystallography at X-ray free electron laser (XFEL) and synchrotron facilities has progressed tremendously in recent years, enabling novel scientific investigations into macromolecular structures and molecular processes. However, these experiments generate a significant amount of data, posing computational challenges in data reduction and real-time feedback. A Bragg peak finding algorithm is used to identify useful images and to provide real-time feedback about hit rate and resolution. Shot-to-shot intensity fluctuations and strong background scattering from the buffer solution, injection nozzle, and other shielding materials make this a time-consuming optimization problem. Here, we present PeakNet, an autonomous Bragg peak finder that utilizes deep neural networks. The development of this system 1) eliminates the need for manual algorithm parameter tuning, 2) reduces false-positive peaks by adjusting to shot-to-shot variations in strong background scattering in real time, and 3) eliminates the laborious task of manually creating bad pixel masks and the need to store these masks per event, since they can be regenerated on demand. PeakNet also exhibits exceptional runtime efficiency, processing a 1920-by-1920 pixel image in around 90 ms on an NVIDIA 1080 Ti GPU, with the potential for further enhancements through parallelized analysis or GPU stream processing. PeakNet is well-suited for expert-level real-time serial crystallography data analysis at high data rates.
CV Nov 28, 2023
Augmenting x-ray single particle imaging reconstruction with self-supervised machine learning
Zhantao Chen, Cong Wang, Mingye Gao et al.
The development of X-ray Free Electron Lasers (XFELs) has opened numerous opportunities to probe atomic structure and ultrafast dynamics of various materials. Single Particle Imaging (SPI) with XFELs enables the investigation of biological particles in their natural physiological states with unparalleled temporal resolution, while circumventing the need for cryogenic conditions or crystallization. However, reconstructing real-space structures from reciprocal-space x-ray diffraction data is highly challenging due to the absence of phase and orientation information, which is further complicated by weak scattering signals and considerable fluctuations in the number of photons per pulse. In this work, we present an end-to-end, self-supervised machine learning approach to recover particle orientations and estimate reciprocal space intensities from diffraction images only. Our method demonstrates great robustness under demanding experimental conditions with significantly enhanced reconstruction capabilities compared with conventional algorithms, and signifies a paradigm shift in SPI as currently practiced at XFELs.
LG Sep 19, 2025
Detail Across Scales: Multi-Scale Enhancement for Full Spectrum Neural Representations
Yuan Ni, Zhantao Chen, Cheng Peng et al.
Implicit neural representations (INRs) have emerged as a compact and parametric alternative to discrete array-based data representations, encoding information directly in neural network weights to enable resolution-independent representation and memory efficiency. However, existing INR approaches, when constrained to compact network sizes, struggle to faithfully represent the multi-scale structures, high-frequency information, and fine textures that characterize the majority of scientific datasets. To address this limitation, we propose WIEN-INR, a wavelet-informed implicit neural representation that distributes modeling across different resolution scales and employs a specialized kernel network at the finest scale to recover subtle details. This multi-scale architecture allows for the use of smaller networks to retain the full spectrum of information while preserving the training efficiency and reducing storage cost. Through extensive experiments on diverse scientific datasets spanning different scales and structural complexities, WIEN-INR achieves superior reconstruction fidelity while maintaining a compact model size. These results demonstrate WIEN-INR as a practical neural representation framework for high-fidelity scientific data encoding, extending the applicability of INRs to domains where efficient preservation of fine detail is essential.
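To make the notion of distributing a signal across resolution scales concrete, the sketch below performs a multi-level Haar wavelet decomposition and reconstructs the input exactly. It is an illustration only: the abstract does not say which wavelet family WIEN-INR uses, the Haar transform is chosen here for brevity, and no neural network is involved. The sample `signal` values are made up.

```python
import math

SQRT2 = math.sqrt(2.0)


def haar_step(x):
    """Split an even-length signal into coarse averages and fine details."""
    approx = [(x[i] + x[i + 1]) / SQRT2 for i in range(0, len(x), 2)]
    detail = [(x[i] - x[i + 1]) / SQRT2 for i in range(0, len(x), 2)]
    return approx, detail


def haar_inverse_step(approx, detail):
    """Undo one haar_step, interleaving the recovered sample pairs."""
    x = []
    for a, d in zip(approx, detail):
        x.extend([(a + d) / SQRT2, (a - d) / SQRT2])
    return x


def decompose(x, levels):
    """Return the coarsest approximation and the detail bands, finest first."""
    details = []
    for _ in range(levels):
        x, d = haar_step(x)
        details.append(d)
    return x, details


def reconstruct(approx, details):
    """Rebuild the signal by adding detail bands back from coarsest to finest."""
    for d in reversed(details):
        approx = haar_inverse_step(approx, d)
    return approx


signal = [4.0, 6.0, 10.0, 12.0, 8.0, 6.0, 5.0, 5.0]  # hypothetical 1-D data
coarse, details = decompose(signal, levels=3)
restored = reconstruct(coarse, details)
```

Each detail band holds the information lost when moving to the next coarser scale, which is why a model can in principle assign separate, smaller components to each band while still recovering the full signal.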
COMP-PH Sep 11, 2021
Scaling and Acceleration of Three-dimensional Structure Determination for Single-Particle Imaging Experiments with SpiniFEL
Hsing-Yin Chang, Elliott Slaughter, Seema Mirchandaney et al.
The Linac Coherent Light Source (LCLS) is an X-ray free electron laser (XFEL) facility enabling the study of the structure and dynamics of single macromolecules. A major upgrade will bring the repetition rate of the X-ray source from 120 to 1 million pulses per second. Exascale high performance computing (HPC) capabilities will be required to process the corresponding data rates. We present SpiniFEL, an application used for structure determination of proteins from single-particle imaging (SPI) experiments. An emerging technique for imaging individual proteins and other large molecular complexes by outrunning radiation damage, SPI breaks free from the need for crystallization (which is difficult for some proteins) and allows for imaging molecular dynamics at near ambient conditions. SpiniFEL is being developed to run on supercomputers in near real-time while an experiment is taking place, so that feedback about the data can guide the data collection strategy. We describe here how we reformulated the mathematical framework for a parallelizable implementation and accelerated the most compute-intensive parts of the application. We also describe the use of Pygion, a Python interface for the Legion task-based programming model, and compare it to our existing MPI+GPU implementation.