IV, Sep 19, 2022
Weak-signal extraction enabled by deep-neural-network denoising of diffraction data
Jens Oppliger, M. Michael Denner, Julia Küspert et al.
Removal or cancellation of noise has widespread applications in imaging and acoustics. In everyday applications, denoising may even include generative aspects, which are unfaithful to the ground truth. For scientific use, however, denoising must reproduce the ground truth accurately. Here, we show how data can be denoised via a deep convolutional neural network such that weak signals appear with quantitative accuracy. In particular, we study X-ray diffraction on crystalline materials. We demonstrate that weak signals stemming from charge ordering, insignificant in the noisy data, become visible and accurate in the denoised data. This success is enabled by supervised training of a deep neural network with pairs of measured low- and high-noise data. We demonstrate that using artificial noise does not yield such quantitatively accurate results. Our approach thus illustrates a practical strategy for noise filtering that can be applied to challenging acquisition problems.
MTRL-SCI, Jul 2, 2020
Learning-based Defect Recognition for Quasi-Periodic Microscope Images
Nik Dennler, Antonio Foncubierta-Rodriguez, Titus Neupert et al.
Controlling crystalline material defects is crucial, as they affect properties of the material that may be detrimental or beneficial for the final performance of a device. Defect analysis on the sub-nanometer scale is enabled by high-resolution (scanning) transmission electron microscopy [HR(S)TEM], where the identification of defects is currently carried out based on human expertise. However, the process is tedious, highly time-consuming and, in some cases, yields ambiguous results. Here we propose a semi-supervised machine learning method that assists in the detection of lattice defects from atomic-resolution microscope images. It involves a convolutional neural network that classifies image patches as defective or non-defective, a graph-based heuristic that chooses one non-defective patch as a model, and finally an automatically generated convolutional filter bank, which highlights symmetry breaking such as stacking faults, twin defects and grain boundaries. Additionally, we suggest a variance filter to segment amorphous regions and beam defects. The algorithm is tested on III-V/Si crystalline materials and successfully evaluated against different metrics, showing promising results even for extremely small training data sets. By combining the data-driven classification generality, robustness and speed of deep learning with the effectiveness of image filters in segmenting faulty symmetry arrangements, we provide a valuable open-source tool to the microscopist community that can streamline future HR(S)TEM analyses of crystalline materials.
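The variance filter mentioned in the abstract above can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the window radius, the threshold, and the choice to flag low-variance pixels are all assumptions made for illustration. The idea is only that a periodic lattice pattern and a featureless region differ strongly in local intensity variance.

```python
from statistics import pvariance


def local_variance(image, radius=1):
    """Population variance of each pixel's (2*radius+1)^2 window, clipped at the borders.

    `image` is a list of equal-length rows of floats (hypothetical input format).
    """
    rows, cols = len(image), len(image[0])
    out = [[0.0] * cols for _ in range(rows)]
    for i in range(rows):
        for j in range(cols):
            window = [
                image[y][x]
                for y in range(max(0, i - radius), min(rows, i + radius + 1))
                for x in range(max(0, j - radius), min(cols, j + radius + 1))
            ]
            out[i][j] = pvariance(window)
    return out


def low_variance_mask(image, radius=1, threshold=0.05):
    """Mark pixels whose local variance falls below `threshold` (an assumed cutoff)."""
    return [[v < threshold for v in row] for row in local_variance(image, radius)]


# Toy image: checkerboard "lattice" on the left half, flat region on the right half.
toy = [
    [float((i + j) % 2) if j < 4 else 0.5 for j in range(8)]
    for i in range(4)
]
variance_map = local_variance(toy)
mask = low_variance_mask(toy)
```

In the toy image the flat right half is flagged by the mask while the checkerboard half is not. Whether a real amorphous region shows up as low or high variance depends on the imaging conditions, so the threshold direction would need to be chosen from the data.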
LG, Oct 6, 2025
CMT-Benchmark: A Benchmark for Condensed Matter Theory Built by Expert Researchers
Haining Pan, James V. Roggeveen, Erez Berg et al.
Large language models (LLMs) have shown remarkable progress in coding and math problem-solving, but evaluation on advanced research-level problems in hard sciences remains scarce. To fill this gap, we present CMT-Benchmark, a dataset of 50 problems covering condensed matter theory (CMT) at the level of an expert researcher. Topics span analytical and computational approaches in quantum many-body physics and classical statistical mechanics. The dataset was designed and verified by a panel of expert researchers from around the world. We built the dataset through a collaborative environment that challenges the panel to write and refine problems they would want a research assistant to solve, including Hartree-Fock, exact diagonalization, quantum/variational Monte Carlo, density matrix renormalization group (DMRG), quantum/classical statistical mechanics, and model building. We evaluate LLMs by programmatically checking solutions against expert-supplied ground truth. We developed machine-grading tools, including symbolic handling of non-commuting operators via normal ordering, that generalize across tasks. Our evaluations show that frontier models struggle with all of the problems in the dataset, highlighting a gap in the physical reasoning skills of current LLMs. Notably, experts identified strategies for creating increasingly difficult problems by interacting with the LLMs and exploiting common failure modes. The best model, GPT5, solves 30% of the problems; the average across 17 models (GPT, Gemini, Claude, DeepSeek, Llama) is 11.4 ± 2.1%. Moreover, 18 problems are solved by none of the 17 models, and 26 by at most one. These unsolved problems span Quantum Monte Carlo, Variational Monte Carlo, and DMRG. Answers sometimes violate fundamental symmetries or have unphysical scaling dimensions. We believe this benchmark will guide development toward capable AI research assistants and tutors.
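To give a feel for what checking answers "via normal ordering" can mean, here is a minimal sketch for a single bosonic mode. It is not the benchmark's grader: the string encoding (`A` for the creation operator, `a` for the annihilation operator), the function names, and the restriction to one mode with [a, a†] = 1 are assumptions made for illustration. Two operator expressions are treated as equal if they reduce to the same normal-ordered form.

```python
from collections import defaultdict


def normal_order(expr):
    """Normal-order a single-mode bosonic expression.

    `expr` maps operator words to coefficients, e.g. {"aA": 1} for a a^dagger.
    Uses a A = A a + 1 repeatedly until every 'A' stands left of every 'a'.
    """
    result = defaultdict(int)
    stack = list(expr.items())
    while stack:
        word, coeff = stack.pop()
        idx = word.find("aA")
        if idx < 0:
            # Already normal ordered: accumulate the coefficient.
            result[word] += coeff
            continue
        # Swapped term (A a) and commutator term (the pair contracted to 1).
        stack.append((word[:idx] + "Aa" + word[idx + 2:], coeff))
        stack.append((word[:idx] + word[idx + 2:], coeff))
    return {w: c for w, c in result.items() if c != 0}


def equivalent(lhs, rhs):
    """True if both expressions share the same normal-ordered form."""
    return normal_order(lhs) == normal_order(rhs)


# The commutator a A - A a should reduce to the identity (empty word).
commutator_is_one = equivalent({"aA": 1, "Aa": -1}, {"": 1})
```

Each rewrite either shortens the word or moves an `A` one step to the left, so the loop terminates. A submitted answer such as `a a A` and a reference answer `A a a + 2 a` would compare equal under this check even though they are written differently.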
COMP-PH, Feb 8, 2021
Introduction to Machine Learning for the Sciences
Titus Neupert, Mark H Fischer, Eliska Greplova et al.
This is an introductory machine-learning course specifically developed with STEM students in mind. Our goal is to provide the interested reader with the basics to employ machine learning in their own projects and to familiarize themselves with the terminology as a foundation for further reading of the relevant literature. In these lecture notes, we discuss supervised, unsupervised, and reinforcement learning. The notes start with an exposition of machine learning methods without neural networks, such as principal component analysis, t-SNE, clustering, as well as linear regression and linear classifiers. We continue with an introduction to both basic and advanced neural-network structures such as dense feed-forward and convolutional neural networks, recurrent neural networks, restricted Boltzmann machines, (variational) autoencoders, and generative adversarial networks. Questions of interpretability are discussed for latent-space representations and using the examples of dreaming and adversarial attacks. The final section is dedicated to reinforcement learning, where we introduce basic notions of value functions and policy learning.
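As a small illustration of the simplest method named in the abstract above, the following sketch fits a straight line by ordinary least squares. It is not taken from the lecture notes: the function name and the toy data are assumptions, and the closed-form slope (covariance over variance) is the standard textbook result for one input variable.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept for 1D data."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of squared deviations of x, and sum of cross deviations of x and y.
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Toy data lying exactly on y = 2x + 1.
slope, intercept = fit_line([0.0, 1.0, 2.0, 3.0], [1.0, 3.0, 5.0, 7.0])
```

For the toy data the fit recovers a slope of 2 and an intercept of 1. With noisy data the same formula returns the line that minimizes the summed squared vertical distances.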