Stefan Sandfeld

MTRL-SCI
h-index11
14papers
88citations
Novelty34%
AI Score41

14 Papers

CVJul 12, 2023
Deep Learning of Crystalline Defects from TEM images: A Solution for the Problem of "Never Enough Training Data"

Kishan Govind, Daniela Oliveros, Antonin Dlouhy et al.

Crystalline defects, such as line-like dislocations, play an important role for the performance and reliability of many metallic devices. Their interaction and evolution still poses a multitude of open questions to materials science and materials physics. In-situ TEM experiments can provide important insights into how dislocations behave and move. During such experiments, the dislocation microstructure is captured in form of videos. The analysis of individual video frames can provide useful insights but is limited by the capabilities of automated identification, digitization, and quantitative extraction of the dislocations as curved objects. The vast amount of data also makes manual annotation very time consuming, thereby limiting the use of Deep Learning-based, automated image analysis and segmentation of the dislocation microstructure. In this work, a parametric model for generating synthetic training data for segmentation of dislocations is developed. Even though domain scientists might dismiss synthetic training images sometimes as too artificial, our findings show that they can result in superior performance, particularly regarding the generalizing of the Deep Learning models with respect to different microstructures and imaging conditions. Additionally, we propose an enhanced deep learning method optimized for segmenting overlapping or intersecting dislocation lines. Upon testing this framework on four distinct real datasets, we find that our synthetic training data are able to yield high-quality results also on real images-even more so if fine-tune on a few real images was done.

MTRL-SCIJun 16, 2022
Automated analysis of continuum fields from atomistic simulations using statistical machine learning

Aruna Prakash, Stefan Sandfeld

Atomistic simulations of the molecular dynamics/statics kind are regularly used to study small scale plasticity. Contemporary simulations are performed with tens to hundreds of millions of atoms, with snapshots of these configurations written out at regular intervals for further analysis. Continuum scale constitutive models for material behavior can benefit from information on the atomic scale, in particular in terms of the deformation mechanisms, the accommodation of the total strain and partitioning of stress and strain fields in individual grains. In this work we develop a methodology using statistical data mining and machine learning algorithms to automate the analysis of continuum field variables in atomistic simulations. We focus on three important field variables: total strain, elastic strain and microrotation. Our results show that the elastic strain in individual grains exhibits a unimodal log-normal distribution, whilst the total strain and microrotation fields evidence a multimodal distribution. The peaks in the distribution of total strain are identified with a Gaussian mixture model and methods to circumvent overfitting problems are presented. Subsequently, we evaluate the identified peaks in terms of deformation mechanisms in a grain, which e.g., helps to quantify the strain for which individual deformation mechanisms are responsible. The overall statistics of the distributions over all grains are an important input for higher scale models, which ultimately also helps to be able to quantitatively discuss the implications for information transfer to phenomenological models.

CVSep 7, 2023
Instance Segmentation of Dislocations in TEM Images

Karina Ruzaeva, Kishan Govind, Marc Legros et al.

Quantitative Transmission Electron Microscopy (TEM) during in-situ straining experiment is able to reveal the motion of dislocations -- linear defects in the crystal lattice of metals. In the domain of materials science, the knowledge about the location and movement of dislocations is important for creating novel materials with superior properties. A long-standing problem, however, is to identify the position and extract the shape of dislocations, which would ultimately help to create a digital twin of such materials. In this work, we quantitatively compare state-of-the-art instance segmentation methods, including Mask R-CNN and YOLOv8. The dislocation masks as the results of the instance segmentation are converted to mathematical lines, enabling quantitative analysis of dislocation length and geometry -- important information for the domain scientist, which we then propose to include as a novel length-aware quality metric for estimating the network performance. Our segmentation pipeline shows a high accuracy suitable for all domain-specific, further post-processing. Additionally, our physics-based metric turns out to perform much more consistently than typically used pixel-wise metrics.

LGSep 12, 2023
Unsupervised Learning of Nanoindentation Data to Infer Microstructural Details of Complex Materials

Chen Zhang, Clémence Bos, Stefan Sandfeld et al.

In this study, Cu-Cr composites were studied by nanoindentation. Arrays of indents were placed over large areas of the samples resulting in datasets consisting of several hundred measurements of Young's modulus and hardness at varying indentation depths. The unsupervised learning technique, Gaussian mixture model, was employed to analyze the data, which helped to determine the number of "mechanical phases" and the respective mechanical properties. Additionally, a cross-validation approach was introduced to infer whether the data quantity was adequate and to suggest the amount of data required for reliable predictions -- one of the often encountered but difficult to resolve issues in machine learning of materials science problems.

LGJan 13
Out-of-distribution generalization of deep-learning surrogates for 2D PDE-generated dynamics in the small-data regime

Binh Duong Nguyen, Stefan Sandfeld

Partial differential equations (PDEs) are a central tool for modeling the dynamics of physical, engineering, and materials systems, but high-fidelity simulations are often computationally expensive. At the same time, many scientific applications can be viewed as the evolution of spatially distributed fields, making data-driven forecasting of such fields a core task in scientific machine learning. In this work we study autoregressive deep-learning surrogates for two-dimensional PDE dynamics on periodic domains, focusing on generalization to out-of-distribution initial conditions within a fixed PDE and parameter regime and on strict small-data settings with at most $\mathcal{O}(10^2)$ simulated trajectories per system. We introduce a multi-channel U-Net [...], evaluate it on five qualitatively different PDE families and compare it to ViT, AFNO, PDE-Transformer, and KAN-UNet under a common training setup. Across all datasets, me-UNet matches or outperforms these more complex architectures in terms of field-space error, spectral similarity, and physics-based metrics for in-distribution rollouts, while requiring substantially less training time. It also generalizes qualitatively to unseen initial conditions with as few as $\approx 20$ training simulations. A data-efficiency study and Grad-CAM analysis further suggest that, in small-data periodic 2D PDE settings, convolutional architectures with inductive biases aligned to locality and periodic boundary conditions remain strong contenders for accurate and moderately out-of-distribution-robust surrogate modeling.

DBMar 31
Ontology-based knowledge graph infrastructure for interoperable atomistic simulation data

Abril Azocar Guzman, Sarath Menon, Tilmann Hickel et al.

The reuse of atomistic simulation data is often limited by heterogeneous formats, incomplete metadata, and a lack of standardized representations of workflows and provenance. Here we present an ontology-based infrastructure for representing and integrating atomistic simulation data as a knowledge graph. The approach combines domain ontologies with a software framework that enables data capture both from existing datasets and directly from simulation workflows at the point of generation. Heterogeneous data from multiple sources are normalized into a common, ontology-aligned representation, enabling consistent querying and analysis across datasets. We demonstrate these capabilities through the integration of grain boundary data, cross-dataset analysis of material properties, and extraction of derived thermodynamic quantities from existing simulations. In addition, workflows are represented in a machine-readable form, enabling both forward provenance tracking and partial reconstruction of computational procedures. The resulting knowledge graph contains over 750,000 triples describing nearly 8,000 computational samples. This work provides a practical framework for improving the findability, interoperability, and reuse of atomistic simulation data.

LGNov 19, 2023
A Generative Model for Accelerated Inverse Modelling Using a Novel Embedding for Continuous Variables

Sébastien Bompas, Stefan Sandfeld

In materials science, the challenge of rapid prototyping materials with desired properties often involves extensive experimentation to find suitable microstructures. Additionally, finding microstructures for given properties is typically an ill-posed problem where multiple solutions may exist. Using generative machine learning models can be a viable solution which also reduces the computational cost. This comes with new challenges because, e.g., a continuous property variable as conditioning input to the model is required. We investigate the shortcomings of an existing method and compare this to a novel embedding strategy for generative models that is based on the binary representation of floating point numbers. This eliminates the need for normalization, preserves information, and creates a versatile embedding space for conditioning the generative model. This technique can be applied to condition a network on any number, to provide fine control over generated microstructure images, thereby contributing to accelerated materials design.

MTRL-SCISep 13, 2023
Modeling Dislocation Dynamics Data Using Semantic Web Technologies

Ahmad Zainul Ihsan, Said Fathalla, Stefan Sandfeld

Research in the field of Materials Science and Engineering focuses on the design, synthesis, properties, and performance of materials. An important class of materials that is widely investigated are crystalline materials, including metals and semiconductors. Crystalline material typically contains a distinct type of defect called "dislocation". This defect significantly affects various material properties, including strength, fracture toughness, and ductility. Researchers have devoted a significant effort in recent years to understanding dislocation behavior through experimental characterization techniques and simulations, e.g., dislocation dynamics simulations. This paper presents how data from dislocation dynamics simulations can be modeled using semantic web technologies through annotating data with ontologies. We extend the already existing Dislocation Ontology by adding missing concepts and aligning it with two other domain-related ontologies (i.e., the Elementary Multi-perspective Material Ontology and the Materials Design Ontology) allowing for representing the dislocation simulation data efficiently. Moreover, we show a real-world use case by representing the discrete dislocation dynamics data as a knowledge graph (DisLocKG) that illustrates the relationship between them. We also developed a SPARQL endpoint that brings extensive flexibility to query DisLocKG.

MTRL-SCIDec 22, 2023
Machine learning for structure-guided materials and process design

Lukas Morand, Tarek Iraki, Johannes Dornheim et al.

In recent years, there has been a growing interest in accelerated materials innovation in the context of the process-structure-property chain. In this regard, it is essential to take into account manufacturing processes and tailor materials design approaches to support downstream process design approaches. As a major step into this direction, we present a holistic and generic optimization approach that covers the entire process-structure-property chain in materials engineering. Our approach specifically employs machine learning to address two critical identification problems: a materials design problem, which involves identifying near-optimal material microstructures that exhibit desired properties, and a process design problem that is to find an optimal processing path to manufacture these microstructures. Both identification problems are typically ill-posed, which presents a significant challenge for solution approaches. However, the non-unique nature of these problems offers an important advantage for processing: By having several target microstructures that perform similarly well, processes can be efficiently guided towards manufacturing the best reachable microstructure. The functionality of the approach is demonstrated at manufacturing crystallographic textures with desired properties in a simulated metal forming process.

CVFeb 28, 2024
Self-Supervised Learning with Generative Adversarial Networks for Electron Microscopy

Bashir Kazimi, Karina Ruzaeva, Stefan Sandfeld

In this work, we explore the potential of self-supervised learning with Generative Adversarial Networks (GANs) for electron microscopy datasets. We show how self-supervised pretraining facilitates efficient fine-tuning for a spectrum of downstream tasks, including semantic segmentation, denoising, noise \& background removal, and super-resolution. Experimentation with varying model complexities and receptive field sizes reveals the remarkable phenomenon that fine-tuned models of lower complexity consistently outperform more complex models with random weight initialization. We demonstrate the versatility of self-supervised pretraining across various downstream tasks in the context of electron microscopy, allowing faster convergence and better performance. We conclude that self-supervised pretraining serves as a powerful catalyst, being especially advantageous when limited annotated data are available and efficient scaling of computational cost is important.

MTRL-SCIJan 4, 2024
DISO: A Domain Ontology for Modeling Dislocations in Crystalline Materials

Ahmad Zainul Ihsan, Said Fathalla, Stefan Sandfeld

Crystalline materials, such as metals and semiconductors, nearly always contain a special defect type called dislocation. This defect decisively determines many important material properties, e.g., strength, fracture toughness, or ductility. Over the past years, significant effort has been put into understanding dislocation behavior across different length scales via experimental characterization techniques and simulations. This paper introduces the dislocation ontology (DISO), which defines the concepts and relationships related to linear defects in crystalline materials. We developed DISO using a top-down approach in which we start defining the most general concepts in the dislocation domain and subsequent specialization of them. DISO is published through a persistent URL following W3C best practices for publishing Linked Data. Two potential use cases for DISO are presented to illustrate its usefulness in the dislocation dynamics domain. The evaluation of the ontology is performed in two directions, evaluating the success of the ontology in modeling a real-world domain and the richness of the ontology.

CVFeb 20, 2024
Combining unsupervised and supervised learning in microscopy enables defect analysis of a full 4H-SiC wafer

Binh Duong Nguyen, Johannes Steiner, Peter Wellmann et al.

Detecting and analyzing various defect types in semiconductor materials is an important prerequisite for understanding the underlying mechanisms as well as tailoring the production processes. Analysis of microscopy images that reveal defects typically requires image analysis tasks such as segmentation and object detection. With the permanently increasing amount of data that is produced by experiments, handling these tasks manually becomes more and more impossible. In this work, we combine various image analysis and data mining techniques for creating a robust and accurate, automated image analysis pipeline. This allows for extracting the type and position of all defects in a microscopy image of a KOH-etched 4H-SiC wafer that was stitched together from approximately 40,000 individual images.

MTRL-SCIFeb 1
Towards knowledge-based workflows: a semantic approach to atomistic simulations for mechanical and thermodynamic properties

Abril Azocar Guzman, Hoang-Thien Luu, Sarath Menon et al.

Mechanical and thermodynamic properties, including the influence of crystal defects, are critical for evaluating materials in engineering applications. Molecular dynamics simulations provide valuable insight into these mechanisms at the atomic scale. However, current practice often relies on fragmented scripts with inconsistent metadata and limited provenance, which hinders reproducibility, interoperability, and reuse. FAIR data principles and workflow-based approaches offer a path to address these limitations. We present reusable atomistic workflows that incorporate metadata annotation aligned with application ontologies, enabling automatic provenance capture and FAIR-compliant data outputs. The workflows cover key mechanical and thermodynamic quantities, including equation of state, elastic tensors, mechanical loading, thermal properties, defect formation energies, and nanoindentation. We demonstrate validation of structure-property relations such as the Hall-Petch effect and show that the workflows can be reused across different interatomic potentials and materials within a coherent semantic framework. The approach provides AI-ready simulation data, supports emerging agentic AI workflows, and establishes a generalizable blueprint for knowledge-based mechanical and thermodynamic simulations.

LGSep 1, 2023
Efficient Surrogate Models for Materials Science Simulations: Machine Learning-based Prediction of Microstructure Properties

Binh Duong Nguyen, Pavlo Potapenko, Aytekin Dermici et al.

Determining, understanding, and predicting the so-called structure-property relation is an important task in many scientific disciplines, such as chemistry, biology, meteorology, physics, engineering, and materials science. Structure refers to the spatial distribution of, e.g., substances, material, or matter in general, while property is a resulting characteristic that usually depends in a non-trivial way on spatial details of the structure. Traditionally, forward simulations models have been used for such tasks. Recently, several machine learning algorithms have been applied in these scientific fields to enhance and accelerate simulation models or as surrogate models. In this work, we develop and investigate the applications of six machine learning techniques based on two different datasets from the domain of materials science: data from a two-dimensional Ising model for predicting the formation of magnetic domains and data representing the evolution of dual-phase microstructures from the Cahn-Hilliard model. We analyze the accuracy and robustness of all models and elucidate the reasons for the differences in their performances. The impact of including domain knowledge through tailored features is studied, and general recommendations based on the availability and quality of training data are derived from this.