Alireza Doostan

LG
h-index: 55
33 papers
370 citations
Novelty: 48%
AI Score: 54

33 Papers

NA Jun 10, 2010
A non-adapted sparse approximation of PDEs with stochastic inputs

Alireza Doostan, Houman Owhadi

We propose a method for the approximation of solutions of PDEs with stochastic coefficients based on the direct, i.e., non-adapted, sampling of solutions. This sampling can be done by using any legacy code for the deterministic problem as a black box. The method converges in probability (with probabilistic error bounds) as a consequence of sparsity and a concentration of measure phenomenon on the empirical correlation between samples. We show that the method is well suited for truly high-dimensional problems (with slow decay in the spectrum).

OC Nov 2, 2019
Topology Optimization under Uncertainty using a Stochastic Gradient-based Approach

Subhayan De, Jerrad Hampton, Kurt Maute et al.

Topology optimization under uncertainty (TOuU) often defines objectives and constraints by statistical moments of geometric and physical quantities of interest. Most traditional TOuU methods use gradient-based optimization algorithms and rely on accurate estimates of the statistical moments and their gradients, e.g., via adjoint calculations. When the number of uncertain inputs is large or the quantities of interest exhibit large variability, a large number of adjoint (and/or forward) solves may be required to ensure the accuracy of these gradients. The optimization procedure itself often requires a large number of iterations, which may render TOuU computationally expensive, if not infeasible. To tackle this difficulty, we here propose an optimization approach that generates a stochastic approximation of the objective, constraints, and their gradients via a small number of adjoint (and/or forward) solves, per iteration. A statistically independent (stochastic) approximation of these quantities is generated at each optimization iteration. The total cost of this approach is only a small factor larger than that of the corresponding deterministic TO problem. We incorporate the stochastic approximation of objective, constraints and their design sensitivities into two classes of optimization algorithms. First, we investigate the stochastic gradient descent (SGD) method and a number of its variants, which have been successfully applied to large-scale optimization problems for machine learning. Second, we study the use of the proposed stochastic approximation approach within conventional nonlinear programming methods, focusing on the Globally Convergent Method of Moving Asymptotes (GCMMA). The performance of these algorithms is investigated with structural design optimization problems utilizing a Solid Isotropic Material with Penalization (SIMP), as well as an explicit level set method.

COMP-PH Mar 1, 2017
Time-dependent global sensitivity analysis with active subspaces for a lithium ion battery model

Paul G. Constantine, Alireza Doostan

Renewable energy researchers use computer simulation to aid the design of lithium ion storage devices. The underlying models contain several physical input parameters that affect model predictions. Effective design and analysis require understanding the sensitivity of model predictions to changes in model parameters, but global sensitivity analyses become increasingly challenging as the number of input parameters increases. Active subspaces are part of an emerging set of tools for discovering and exploiting low-dimensional structures in the map from high-dimensional inputs to model outputs. We extend linear and quadratic model-based heuristics for active subspace discovery to time-dependent processes and apply the resulting technique to a lithium ion battery model. The results reveal low-dimensional structure and sensitivity metrics that a designer may exploit to study the relationship between parameters and predictions.
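
As a rough illustration of what an active subspace is, the sketch below uses the standard gradient-based construction (an eigen-decomposition of the averaged outer product of gradients), not the model-based heuristics or the time-dependent extension described in the abstract. The ridge function, the vector `a`, and the helper `active_direction` are hypothetical choices made only for this example.

```python
import math
import random

def active_direction(grad_f, dim, n_samples=200, seed=0, power_iters=50):
    """Estimate the dominant active direction of f from sampled gradients."""
    rng = random.Random(seed)
    # Monte Carlo estimate of C = E[grad f(x) grad f(x)^T], x uniform on [-1, 1]^dim.
    C = [[0.0] * dim for _ in range(dim)]
    for _ in range(n_samples):
        x = [rng.uniform(-1.0, 1.0) for _ in range(dim)]
        g = grad_f(x)
        for i in range(dim):
            for j in range(dim):
                C[i][j] += g[i] * g[j] / n_samples
    # Power iteration for the eigenvector of C with the largest eigenvalue.
    w = [1.0] * dim
    for _ in range(power_iters):
        w = [sum(C[i][j] * w[j] for j in range(dim)) for i in range(dim)]
        norm = math.sqrt(sum(v * v for v in w))
        w = [v / norm for v in w]
    return w

# Hypothetical ridge function f(x) = exp(a . x): it varies only along a.
a = [0.6, 0.8, 0.0]  # unit vector, chosen for illustration

def grad_f(x):
    scale = math.exp(sum(ai * xi for ai, xi in zip(a, x)))
    return [scale * ai for ai in a]

w = active_direction(grad_f, dim=3)
# Absolute cosine of the angle between w and a (both have unit length).
alignment = abs(sum(wi * ai for wi, ai in zip(w, a)))
```

Because the example function changes only along `a`, the estimated direction `w` should line up with `a` up to sign; for a real model the eigenvalue decay of `C` would indicate how many directions matter.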

ML Apr 3, 2022
Bi-fidelity Modeling of Uncertain and Partially Unknown Systems using DeepONets

Subhayan De, Matthew Reynolds, Malik Hassanaly et al.

Recent advances in modeling large-scale complex physical systems have shifted research focus towards data-driven techniques. However, generating datasets by simulating complex systems can require significant computational resources. Similarly, acquiring experimental datasets can prove difficult as well. For these systems, computationally inexpensive but generally inaccurate models, known as low-fidelity models, are often available. In this paper, we propose a bi-fidelity modeling approach for complex physical systems, where we model the discrepancy between the true system's response and low-fidelity response in the presence of a small training dataset from the true system's response using a deep operator network (DeepONet), a neural network architecture suitable for approximating nonlinear operators. We apply the approach to model systems that have parametric uncertainty and are partially unknown. Three numerical examples are used to show the efficacy of the proposed approach to model uncertain and partially unknown complex physical systems.

COMP-PH Aug 17, 2018
Bi-fidelity approximation for uncertainty quantification and sensitivity analysis of irradiated particle-laden turbulence

Hillary R. Fairbanks, Lluis Jofre, Gianluca Geraci et al.

Efficiently performing predictive studies of irradiated particle-laden turbulent flows has the potential of providing significant contributions towards better understanding and optimizing, for example, concentrated solar power systems. As there are many uncertainties inherent in such flows, uncertainty quantification is fundamental to improve the predictive capabilities of the numerical simulations. For large-scale, multi-physics problems exhibiting high-dimensional uncertainty, characterizing the stochastic solution presents a significant computational challenge as many methods require a large number of high-fidelity solves. This requirement results in the need for a possibly infeasible number of simulations when a typical converged high-fidelity simulation requires intensive computational resources. To reduce the cost of quantifying high-dimensional uncertainties, we investigate the application of a non-intrusive, bi-fidelity approximation to estimate statistics of quantities of interest associated with an irradiated particle-laden turbulent flow. This method relies on exploiting the low-rank structure of the solution to accelerate the stochastic sampling and approximation processes by means of cheaper-to-run, lower fidelity representations. The application of this bi-fidelity approximation results in accurate estimates of the QoI statistics while requiring a small number of high-fidelity model evaluations.

NA May 19, 2016
Optimization via Separated Representations and the Canonical Tensor Decomposition

Matthew J Reynolds, Gregory Beylkin, Alireza Doostan

We introduce a new, quadratically convergent algorithm for finding maximum absolute value entries of tensors represented in the canonical format. The computational complexity of the algorithm is linear in the dimension of the tensor. We show how to use this algorithm to find global maxima of non-convex multivariate functions in separated form. We demonstrate the performance of the new algorithms on several examples.

NA Jan 8, 2019
Level Set Methods for Stochastic Discontinuity Detection in Nonlinear Problems

Per Pettersson, Alireza Doostan, Jan Nordström

Stochastic physical problems governed by nonlinear conservation laws are challenging due to solution discontinuities in stochastic and physical space. In this paper, we present a level set method to track discontinuities in stochastic space by solving a Hamilton-Jacobi equation. By introducing a speed function that vanishes at discontinuities, the iso-zero of the level set problem coincides with the discontinuities of the conservation law. The level set problem is solved on a sequence of successively finer grids in stochastic space. The method is adaptive in the sense that costly evaluations of the conservation law of interest are only performed in the vicinity of the discontinuities during the refinement stage. In regions of stochastic space where the solution is smooth, a surrogate method replaces expensive evaluations of the conservation law. The proposed method is tested in conjunction with different sets of localized orthogonal basis functions on simplex elements, as well as frames based on piecewise polynomials conforming to the level set function. The performance of the proposed method is compared to existing adaptive multi-element generalized polynomial chaos methods.

LG Nov 9, 2022
QuadConv: Quadrature-Based Convolutions with Applications to Non-Uniform PDE Data Compression

Kevin Doherty, Cooper Simpson, Stephen Becker et al.

We present a new convolution layer for deep learning architectures which we call QuadConv -- an approximation to continuous convolution via quadrature. Our operator is developed explicitly for use on non-uniform, mesh-based data, and accomplishes this by learning a continuous kernel that can be sampled at arbitrary locations. Moreover, the construction of our operator admits an efficient implementation which we detail and construct. As an experimental validation of our operator, we consider the task of compressing partial differential equation (PDE) simulation data from fixed meshes. We show that QuadConv can match the performance of standard discrete convolutions on uniform grid data by comparing a QuadConv autoencoder (QCAE) to a standard convolutional autoencoder (CAE). Further, we show that the QCAE can maintain this accuracy even on non-uniform data. In both cases, QuadConv also outperforms alternative unstructured convolution methods such as graph convolution.

LG Jun 22, 2023
In Situ Framework for Coupling Simulation and Machine Learning with Application to CFD

Riccardo Balin, Filippo Simini, Cooper Simpson et al.

Recent years have seen many successful applications of machine learning (ML) to facilitate fluid dynamic computations. As simulations grow, generating new training datasets for traditional offline learning creates I/O and storage bottlenecks. Additionally, performing inference at runtime requires non-trivial coupling of ML framework libraries with simulation codes. This work offers a solution to both limitations by simplifying this coupling and enabling in situ training and inference workflows on heterogeneous clusters. Leveraging SmartSim, the presented framework deploys a database to store data and ML models in memory, thus circumventing the file system. On the Polaris supercomputer, we demonstrate perfect scaling efficiency to the full machine size of the data transfer and inference costs thanks to a novel co-located deployment of the database. Moreover, we train an autoencoder in situ from a turbulent flow simulation, showing that the framework overhead is negligible relative to a solver time step and training epoch.

NA Dec 16, 2025
Boundary condition enforcement with PINNs: a comparative study and verification on 3D geometries

Conor Rowan, Kai Hampleman, Kurt Maute et al.

Since their advent nearly a decade ago, physics-informed neural networks (PINNs) have been studied extensively as a novel technique for solving forward and inverse problems in physics and engineering. The neural network discretization of the solution field is naturally adaptive and avoids meshing the computational domain, which can both improve the accuracy of the numerical solution and streamline implementation. However, there have been limited studies of PINNs on complex three-dimensional geometries, as the lack of mesh and the reliance on the strong form of the partial differential equation (PDE) make boundary condition (BC) enforcement challenging. Techniques to enforce BCs with PINNs have proliferated in the literature, but a comprehensive side-by-side comparison of these techniques and a study of their efficacy on geometrically complex three-dimensional test problems are lacking. In this work, we i) systematically compare BC enforcement techniques for PINNs, ii) propose a general solution framework for arbitrary three-dimensional geometries, and iii) verify the methodology on three-dimensional, linear and nonlinear test problems with combinations of Dirichlet, Neumann, and Robin boundaries. Our approach is agnostic to the underlying PDE, the geometry of the computational domain, and the nature of the BCs, while requiring minimal hyperparameter tuning. This work represents a step in the direction of establishing PINNs as a mature numerical method, capable of competing head-to-head with incumbents such as the finite element method.

OC Feb 13
Learning Gradient Flow: Using Equation Discovery to Accelerate Engineering Optimization

Grant Norman, Conor Rowan, Kurt Maute et al.

In this work, we investigate the use of data-driven equation discovery for dynamical systems to model and forecast continuous-time dynamics of unconstrained optimization problems. To avoid expensive evaluations of the objective function and its gradient, we leverage trajectory data on the optimization variables to learn the continuous-time dynamics associated with gradient descent, Newton's method, and ADAM optimization. The discovered gradient flows are then solved as a surrogate for the original optimization problem. To this end, we introduce the Learned Gradient Flow (LGF) optimizer, which is equipped to build surrogate models of variable polynomial order in full- or reduced-dimensional spaces at user-defined intervals in the optimization process. We demonstrate the efficacy of this approach on several standard problems from engineering mechanics and scientific machine learning, including two inverse problems, structural topology optimization, and two forward solves with different discretizations. Our results suggest that the learned gradient flows can significantly expedite convergence by capturing critical features of the optimization trajectory while avoiding expensive evaluations of the objective and its gradient.

SY Apr 27
Risk-Aware Aerocapture Guidance Through a Probabilistic Indicator Function

Grace E. Calkins, Jay W. McMahon, Alireza Doostan et al.

Aerocapture is sensitive to trajectory errors, particularly for low-cost missions with imprecise navigation. For such missions, considering the probability of each failure mode when computing guidance commands can increase capture rate. A risk-aware aerocapture guidance algorithm is proposed that uses a generative model-based probabilistic indicator function to estimate escape, impact, or capture probabilities. The probability of each mode is incorporated into corrective guidance commands to increase the likelihood of successful capture. The proposed method is evaluated against state-of-the-art numeric predictor-corrector guidance algorithms in high-uncertainty scenarios where entry interface dispersions lead to nontrivial failure probabilities. When using a probabilistic indicator function in guidance, 71.43% to 100% of recoverable cases are saved for a variety of initial distributions and atmosphere models. The probabilistic indicator function is capable of predicting failure probability for dispersions and atmosphere models outside its training data, showing generalizability. In addition, the probabilistic indicator is compared to a fading memory filter for density estimation, demonstrating improvements in accuracy when both are used in conjunction. The proposed risk-aware aerocapture guidance algorithm improves capture performance and robustness to entry interface state dispersions, especially for missions with high navigation uncertainty.

LG Dec 28, 2023
PINN surrogate of Li-ion battery models for parameter inference. Part I: Implementation and multi-fidelity hierarchies for the single-particle model

Malik Hassanaly, Peter J. Weddle, Ryan N. King et al.

To plan and optimize energy storage demands that account for Li-ion battery aging dynamics, techniques need to be developed to diagnose battery internal states accurately and rapidly. This study seeks to reduce the computational resources needed to determine a battery's internal states by replacing physics-based Li-ion battery models -- such as the single-particle model (SPM) and the pseudo-2D (P2D) model -- with a physics-informed neural network (PINN) surrogate. The surrogate model makes high-throughput techniques, such as Bayesian calibration, tractable to determine battery internal parameters from voltage responses. This manuscript is the first of a two-part series that introduces PINN surrogates of Li-ion battery models for parameter inference (i.e., state-of-health diagnostics). In this first part, a method is presented for constructing a PINN surrogate of the SPM. A multi-fidelity hierarchical training, where several neural nets are trained with multiple physics-loss fidelities, is shown to significantly improve the surrogate accuracy when only training on the governing equation residuals. The implementation is made available in a companion repository (https://github.com/NREL/pinnstripes). The techniques used to develop a PINN surrogate of the SPM are extended in Part II to a PINN surrogate of the P2D battery model, and Part II also explores the Bayesian calibration capabilities of both surrogates.

NA Sep 30, 2024
Ensemble WSINDy for Data Driven Discovery of Governing Equations from Laser-based Full-field Measurements

Abigail C. Schmid, Alireza Doostan, Fatemeh Pourahmadian

This work leverages laser vibrometry and the weak form of the sparse identification of nonlinear dynamics (WSINDy) for partial differential equations to learn macroscale governing equations from full-field experimental data. In the experiments, two beam-like specimens, one aluminum and one IDOX/Estane composite, are subjected to shear wave excitation in the low frequency regime and the response is measured in the form of particle velocity on the specimen surface. The WSINDy for PDEs algorithm is applied to the resulting spatio-temporal data to discover the effective dynamics of the specimens from a family of potential PDEs. The discovered PDE is of the recognizable Euler-Bernoulli beam model form, from which the Young's moduli of the two materials are estimated. An ensemble version of the WSINDy algorithm is also used, which yields information about the uncertainty in the PDE coefficients and Young's moduli. The discovered PDEs are also simulated with a finite element code, and the results match the experimental data with reasonable accuracy. Using full-field experimental data and WSINDy together is a powerful non-destructive approach for learning unknown governing equations and gaining insights about mechanical systems in the dynamic regime.

LG Apr 22
On the definition and importance of interpretability in scientific machine learning

Conor Rowan, Alireza Doostan

Though neural networks trained on large datasets have been successfully used to describe and predict many physical phenomena, there is a sense among scientists that, unlike traditional scientific models comprising simple mathematical expressions, their findings cannot be integrated into the body of scientific knowledge. Critics of machine learning's inability to produce human-understandable relationships have converged on the concept of "interpretability" as its point of departure from more traditional forms of science. As the growing interest in interpretability has shown, researchers in the physical sciences seek not just predictive models, but also to uncover the fundamental principles that govern a system of interest. However, clarity around a definition of interpretability and the precise role that it plays in science is lacking in the literature. In this work, we argue that researchers in equation discovery and symbolic regression tend to conflate the concept of sparsity with interpretability. We review key papers on interpretable machine learning from outside the scientific community and argue that, though the definitions and methods they propose can inform questions of interpretability for scientific machine learning (SciML), they are inadequate for this new purpose. Noting these deficiencies, we propose an operational definition of interpretability for the physical sciences. Our notion of interpretability emphasizes understanding of the mechanism over mathematical sparsity. Innocuous though it may seem, this emphasis on mechanism shows that sparsity is often unnecessary. It also questions the possibility of interpretable scientific discovery when prior knowledge is lacking. We believe a precise and philosophically informed definition of interpretability in SciML will help focus research efforts toward the most significant obstacles to realizing a data-driven scientific future.

CE Mar 17
Integration of local and global surrogates for failure probability estimation

Audrey Gaymann, Juan M. Cardenas, Sung Min Jo et al.

This paper presents the development of an algorithm, termed the Global-Local Hybrid Surrogate (GLHS), designed to efficiently compute the probability of rare failure events in complex systems. The primary goal is to enhance the accuracy of reliability analysis while minimizing computational cost, particularly for high-dimensional problems where traditional methods, such as Monte Carlo simulations, become prohibitively expensive. The proposed GLHS builds upon the foundational work of Li et al. by integrating an adaptive strategy based on the General Domain Adaptive Strategy (Adcock et al.). The algorithm aims to approximate the failure domain of a given system, defined as the region in the input domain where the system transitions from safe to failure modes, described by a limit state surface. This failure domain is not explicitly known and must be learned iteratively during the analysis. The method employs a buffer zone, defined as the region surrounding the limit state surface. Within this buffer zone, Christoffel Adaptive Sampling is utilized to select new samples for constructing localized surrogate models, which are designed to refine the approximation in regions critical to failure probability estimation. The iterative process proceeds until convergence is reached. This results in a hybrid methodology that integrates a global surrogate to capture the overall trend with local surrogates that concentrate on critical regions near the limit state function. By adopting this strategy, the GLHS method balances computational efficiency with accuracy in estimating the failure probability.

LG Nov 4, 2025
In Situ Training of Implicit Neural Compressors for Scientific Simulations via Sketch-Based Regularization

Cooper Simpson, Stephen Becker, Alireza Doostan

Focusing on implicit neural representations, we present a novel in situ training protocol that employs limited memory buffers of full and sketched data samples, where the sketched data are leveraged to prevent catastrophic forgetting. The theoretical motivation for our use of sketching as a regularizer is presented via a simple Johnson-Lindenstrauss-informed result. While our methods may be of wider interest in the field of continual learning, we specifically target in situ neural compression using implicit neural representation-based hypernetworks. We evaluate our method on a variety of complex simulation data in two and three dimensions, over long time horizons, and across unstructured grids and non-Cartesian geometries. On these tasks, we show strong reconstruction performance at high compression rates. Most importantly, we demonstrate that sketching enables the presented in situ scheme to approximately match the performance of the equivalent offline method.
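
For readers unfamiliar with sketching, the following is a minimal sketch of a Gaussian Johnson-Lindenstrauss projection, which compresses a vector while approximately preserving its Euclidean norm. It is not the training protocol from the abstract; the function name `gaussian_sketch`, the dimensions, and the seeds are assumptions made for illustration.

```python
import math
import random

def gaussian_sketch(x, k, seed=0):
    """Apply a k-by-d Gaussian sketching matrix with N(0, 1/k) entries to x."""
    rng = random.Random(seed)
    scale = 1.0 / math.sqrt(k)
    # Each output entry is one random row of the matrix applied to x.
    return [scale * sum(rng.gauss(0.0, 1.0) * xi for xi in x) for _ in range(k)]

# Hypothetical data sample of dimension 500, sketched down to 200 entries.
data_rng = random.Random(1)
x = [data_rng.uniform(-1.0, 1.0) for _ in range(500)]
y = gaussian_sketch(x, k=200)

# Ratio of squared norms; it concentrates around 1 as k grows.
ratio = sum(v * v for v in y) / sum(v * v for v in x)
```

Reusing the same `seed` for every sample applies the same sketching matrix, which is what makes distances between sketched samples comparable.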

LG Dec 28, 2023
PINN surrogate of Li-ion battery models for parameter inference. Part II: Regularization and application of the pseudo-2D model

Malik Hassanaly, Peter J. Weddle, Ryan N. King et al.

Bayesian parameter inference is useful to improve Li-ion battery diagnostics and can help formulate battery aging models. However, it is computationally intensive and cannot be easily repeated for multiple cycles, multiple operating conditions, or multiple replicate cells. To reduce the computational cost of Bayesian calibration, numerical solvers for physics-based models can be replaced with faster surrogates. A physics-informed neural network (PINN) is developed as a surrogate for the pseudo-2D (P2D) battery model calibration. For the P2D surrogate, additional training regularization was needed as compared to the PINN single-particle model (SPM) developed in Part I. Both the PINN SPM and P2D surrogate models are exercised for parameter inference and compared to data obtained from a direct numerical solution of the governing equations. A parameter inference study highlights the ability to use these PINNs to calibrate scaling parameters for the cathode Li diffusion and the anode exchange current density. By realizing computational speed-ups of 2250x for the P2D model, as compared to using standard integrating methods, the PINN surrogates enable rapid state-of-health diagnostics. In the low-data availability scenario, the testing error was estimated at 2 mV for the SPM surrogate and 10 mV for the P2D surrogate, which could be mitigated with additional data.

NA Jun 4, 2025
Solving engineering eigenvalue problems with neural networks using the Rayleigh quotient

Conor Rowan, John Evans, Kurt Maute et al.

From characterizing the speed of a thermal system's response to computing natural modes of vibration, eigenvalue analysis is ubiquitous in engineering. In spite of this, eigenvalue problems have received relatively little treatment compared to standard forward and inverse problems in the physics-informed machine learning literature. In particular, neural network discretizations of solutions to eigenvalue problems have seen only a handful of studies. Owing to their nonlinearity, neural network discretizations prevent the conversion of the continuous eigenvalue differential equation into a standard discrete eigenvalue problem. In this setting, eigenvalue analysis requires more specialized techniques. Using a neural network discretization of the eigenfunction, we show that a variational form of the eigenvalue problem called the "Rayleigh quotient" in tandem with a Gram-Schmidt orthogonalization procedure is a particularly simple and robust approach to find the eigenvalues and their corresponding eigenfunctions. This method is shown to be useful for finding sets of harmonic functions on irregular domains, parametric and nonlinear eigenproblems, and high-dimensional eigenanalysis. We also discuss the utility of harmonic functions as a spectral basis for approximating solutions to partial differential equations. Through various examples from engineering mechanics, the combination of the Rayleigh quotient objective, Gram-Schmidt procedure, and the neural network discretization of the eigenfunction is shown to offer unique advantages for handling continuous eigenvalue problems.
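
The combination of a Rayleigh quotient objective with Gram-Schmidt orthogonalization can be illustrated in finite dimensions. The sketch below is an assumption-laden stand-in: a plain vector replaces the neural network discretization, a small symmetric matrix `A` replaces the differential operator, and the step size and iteration count are arbitrary choices for this example.

```python
import math

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def matvec(A, v):
    return [dot(row, v) for row in A]

def normalize(v):
    n = math.sqrt(dot(v, v))
    return [x / n for x in v]

def gram_schmidt(v, basis):
    """Remove from v its components along the orthonormal vectors in basis."""
    for b in basis:
        c = dot(v, b)
        v = [x - c * y for x, y in zip(v, b)]
    return v

def smallest_eigenpairs(A, k, steps=2000, lr=0.1):
    """Find the k smallest eigenpairs of symmetric A, one at a time."""
    n = len(A)
    values, vectors = [], []
    for j in range(k):
        # Deterministic start vector, orthogonalized against earlier modes.
        start = [float((i + 1) ** (j + 1)) for i in range(n)]
        v = normalize(gram_schmidt(start, vectors))
        for _ in range(steps):
            Av = matvec(A, v)
            r = dot(v, Av)  # Rayleigh quotient of the unit vector v
            # Descent step along -(A v - r v), proportional to the quotient's gradient.
            v = [x - lr * (y - r * x) for x, y in zip(v, Av)]
            # Re-orthogonalize so the iterate cannot drift back to earlier modes.
            v = normalize(gram_schmidt(v, vectors))
        values.append(dot(v, matvec(A, v)))
        vectors.append(v)
    return values, vectors

# Hypothetical test operator: a 3-by-3 second-difference matrix.
A = [[2.0, -1.0, 0.0], [-1.0, 2.0, -1.0], [0.0, -1.0, 2.0]]
values, vectors = smallest_eigenpairs(A, k=2)
```

For this matrix the two smallest eigenvalues are 2 - sqrt(2) and 2, so `values` can be checked against those numbers. With a neural network in place of `v`, the same quotient would be minimized over network parameters instead of vector entries.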

CE Jan 31, 2025
Physically Interpretable Representation and Controlled Generation for Turbulence Data

Tiffany Fan, Murray Cutforth, Marta D'Elia et al.

Computational Fluid Dynamics (CFD) plays a pivotal role in fluid mechanics, enabling precise simulations of fluid behavior through partial differential equations (PDEs). However, traditional CFD methods are resource-intensive, particularly for high-fidelity simulations of complex flows, which are further complicated by high dimensionality, inherent stochasticity, and limited data availability. This paper addresses these challenges by proposing a data-driven approach that leverages a Gaussian Mixture Variational Autoencoder (GMVAE) to encode high-dimensional scientific data into low-dimensional, physically meaningful representations. The GMVAE learns a structured latent space where data can be categorized based on physical properties such as the Reynolds number while maintaining global physical consistency. To assess the interpretability of the learned representations, we introduce a novel metric based on graph spectral theory, quantifying the smoothness of physical quantities along the latent manifold. We validate our approach using 2D Navier-Stokes simulations of flow past a cylinder over a range of Reynolds numbers. Our results demonstrate that the GMVAE provides improved clustering, meaningful latent structure, and robust generative capabilities compared to baseline dimensionality reduction methods. This framework offers a promising direction for data-driven turbulence modeling and broader applications in computational fluid dynamics and engineering systems.

LG Nov 26, 2025
Physically Interpretable Representation Learning with Gaussian Mixture Variational AutoEncoder (GM-VAE)

Tiffany Fan, Murray Cutforth, Marta D'Elia et al.

Extracting compact, physically interpretable representations from high-dimensional scientific data is a persistent challenge due to the complex, nonlinear structures inherent in physical systems. We propose a Gaussian Mixture Variational Autoencoder (GM-VAE) framework designed to address this by integrating an Expectation-Maximization (EM)-inspired training scheme with a novel spectral interpretability metric. Unlike conventional VAEs that jointly optimize reconstruction and clustering (often leading to training instability), our method utilizes a block-coordinate descent strategy, alternating between expectation and maximization steps. This approach stabilizes training and naturally aligns latent clusters with distinct physical regimes. To objectively evaluate the learned representations, we introduce a quantitative metric based on graph-Laplacian smoothness, which measures the coherence of physical quantities across the latent manifold. We demonstrate the efficacy of this framework on datasets of increasing complexity: surface reaction ODEs, Navier-Stokes wake flows, and experimental laser-induced combustion Schlieren images. The results show that our GM-VAE yields smooth, physically consistent manifolds and accurate regime clustering, offering a robust data-driven tool for interpreting turbulent and reactive flow systems.

CE May 8, 2025
Physics-informed solution reconstruction in elasticity and heat transfer using the explicit constraint force method

Conor Rowan, Kurt Maute, Alireza Doostan

One use case of "physics-informed neural networks" (PINNs) is solution reconstruction, which aims to estimate the full-field state of a physical system from sparse measurements. Parameterized governing equations of the system are used in tandem with the measurements to regularize the regression problem. However, in real-world solution reconstruction problems, the parameterized governing equation may be inconsistent with the physical phenomena that give rise to the measurement data. We show that due to assuming consistency between the true and parameterized physics, PINNs-based approaches may fail to satisfy three basic criteria of interpretability, robustness, and data consistency. As we argue, these criteria ensure that (i) the quality of the reconstruction can be assessed, (ii) the reconstruction does not depend strongly on the choice of physics loss, and (iii) that in certain situations, the physics parameters can be uniquely recovered. In the context of elasticity and heat transfer, we demonstrate how standard formulations of the physics loss and techniques for constraining the solution to respect the measurement data lead to different "constraint forces" -- which we define as additional source terms arising from the constraints -- and that these constraint forces can significantly influence the reconstructed solution. To avoid the potentially substantial influence of the choice of physics loss and method of constraint enforcement on the reconstructed solution, we propose the "explicit constraint force method" (ECFM) to gain control of the source term introduced by the constraint. We then show that by satisfying the criteria of interpretability, robustness, and data consistency, this approach leads to more predictable and customizable reconstructions from noisy measurement data, even when the parameterization of the missing physics is inconsistent with the measured system.

LGApr 30, 2025
Stochastic Subspace Descent Accelerated via Bi-fidelity Line Search

Nuojin Cheng, Alireza Doostan, Stephen Becker

Efficient optimization remains a fundamental challenge across numerous scientific and engineering domains, especially when objective function and gradient evaluations are computationally expensive. While zeroth-order optimization methods offer effective approaches when gradients are inaccessible, their practical performance can be limited by the high cost associated with function queries. This work introduces the bi-fidelity stochastic subspace descent (BF-SSD) algorithm, a novel zeroth-order optimization method designed to reduce this computational burden. BF-SSD leverages a bi-fidelity framework, constructing a surrogate model from a combination of computationally inexpensive low-fidelity (LF) and accurate high-fidelity (HF) function evaluations. This surrogate model facilitates an efficient backtracking line search for step size selection, for which we provide theoretical convergence guarantees under standard assumptions. We perform a comprehensive empirical evaluation of BF-SSD across four distinct problems: a synthetic optimization benchmark, dual-form kernel ridge regression, black-box adversarial attacks on machine learning models, and transformer-based black-box language model fine-tuning. Numerical results demonstrate that BF-SSD consistently achieves superior optimization performance while requiring significantly fewer HF function evaluations compared to relevant baseline methods. This study highlights the efficacy of integrating bi-fidelity strategies within zeroth-order optimization, positioning BF-SSD as a promising and computationally efficient approach for tackling large-scale, high-dimensional problems encountered in various real-world applications.

MLMay 25, 2023
Bi-fidelity Variational Auto-encoder for Uncertainty Quantification

Nuojin Cheng, Osman Asif Malik, Subhayan De et al.

Quantifying the uncertainty of quantities of interest (QoIs) from physical systems is a primary objective in model validation. However, achieving this goal entails balancing the need for computational efficiency with the requirement for numerical accuracy. To address this trade-off, we propose a novel bi-fidelity formulation of variational auto-encoders (BF-VAE) designed to estimate the uncertainty associated with a QoI from low-fidelity (LF) and high-fidelity (HF) samples of the QoI. This model allows for the approximation of the statistics of the HF QoI by leveraging information derived from its LF counterpart. Specifically, we design a bi-fidelity auto-regressive model in the latent space that is integrated within the VAE's probabilistic encoder-decoder structure. An effective algorithm is proposed to maximize the variational lower bound of the HF log-likelihood in the presence of limited HF data, resulting in the synthesis of HF realizations with a reduced computational cost. Additionally, we introduce the concept of the bi-fidelity information bottleneck (BF-IB) to provide an information-theoretic interpretation of the proposed BF-VAE model. Our numerical results demonstrate that BF-VAE leads to considerably improved accuracy, as compared to a VAE trained using only HF data, when limited HF data is available.

CVFeb 2, 2022
Automated processing of X-ray computed tomography images via panoptic segmentation for modeling woven composite textiles

Aaron Allred, Lauren J. Abbott, Alireza Doostan et al.

A new, machine learning-based approach for automatically generating 3D digital geometries of woven composite textiles is proposed to overcome the limitations of existing analytical descriptions and segmentation methods. In this approach, panoptic segmentation is leveraged to produce instance segmented semantic masks from X-ray computed tomography (CT) images. This effort represents the first deep learning based automated process for segmenting unique yarn instances in a woven composite textile. Furthermore, it improves on existing methods by providing instance-level segmentation on low contrast CT datasets. Frame-to-frame instance tracking is accomplished via an intersection-over-union (IoU) approach adopted from video panoptic segmentation for assembling a 3D geometric model. A corrective recognition algorithm is developed to improve the recognition quality (RQ). The panoptic quality (PQ) metric is adopted to provide a new universal evaluation metric for reconstructed woven composite textiles. It is found that the panoptic segmentation network generalizes well to new CT images that are similar to the training set but does not extrapolate well to CT images of differing geometry, texture, and contrast. The utility of this approach is demonstrated by capturing yarn flow directions, contact regions between individual yarns, and the spatially varying cross-sectional areas of the yarns.
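
The frame-to-frame tracking step mentioned in this abstract can be illustrated with a small sketch. This is not the authors' implementation: the function names `mask_iou` and `match_instances`, the greedy matching rule, and the 0.5 threshold are all assumptions chosen for illustration. It only shows how an intersection-over-union score between boolean instance masks in consecutive CT slices could be used to carry an instance label forward.

```python
import numpy as np

def mask_iou(a, b):
    """Intersection-over-union of two boolean masks of equal shape."""
    a = np.asarray(a, dtype=bool)
    b = np.asarray(b, dtype=bool)
    union = np.logical_or(a, b).sum()
    if union == 0:
        return 0.0  # two empty masks: define IoU as 0
    return float(np.logical_and(a, b).sum() / union)

def match_instances(prev_masks, curr_masks, threshold=0.5):
    """Greedy matching (an assumed rule, for illustration only).

    For each mask in the current frame, pick the not-yet-used mask in the
    previous frame with the highest IoU above `threshold`. Returns a dict
    mapping current index -> previous index, or None for a new instance.
    """
    matches = {}
    used = set()
    for j, curr in enumerate(curr_masks):
        best_id, best_iou = None, threshold
        for i, prev in enumerate(prev_masks):
            if i in used:
                continue
            iou = mask_iou(prev, curr)
            if iou > best_iou:
                best_id, best_iou = i, iou
        matches[j] = best_id
        if best_id is not None:
            used.add(best_id)
    return matches
```

A greedy rule is the simplest choice; an optimal one-to-one assignment over the IoU matrix would be a natural alternative when instances overlap heavily.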

MLJan 31, 2022
GenMod: A generative modeling approach for spectral representation of PDEs with random inputs

Jacqueline Wentz, Alireza Doostan

We propose a method for quantifying uncertainty in high-dimensional PDE systems with random parameters, where the number of solution evaluations is small. Parametric PDE solutions are often approximated using a spectral decomposition based on polynomial chaos expansions. For the class of systems we consider (i.e., high dimensional with limited solution evaluations) the coefficients are given by an underdetermined linear system in a regression formulation. This implies additional assumptions, such as sparsity of the coefficient vector, are needed to approximate the solution. Here, we present an approach where we assume the coefficients are close to the range of a generative model that maps from a low to a high dimensional space of coefficients. Our approach is inspired by recent work examining how generative models can be used for compressed sensing in systems with random Gaussian measurement matrices. Using results from PDE theory on coefficient decay rates, we construct an explicit generative model that predicts the polynomial chaos coefficient magnitudes. The algorithm we developed to find the coefficients, which we call GenMod, is composed of two main steps. First, we predict the coefficient signs using Orthogonal Matching Pursuit. Then, we assume the coefficients are within a sparse deviation from the range of a sign-adjusted generative model. This allows us to find the coefficients by solving a nonconvex optimization problem, over the input space of the generative model and the space of sparse vectors. We obtain theoretical recovery results for a Lipschitz continuous generative model and for a more specific generative model, based on coefficient decay rate bounds. We examine three high-dimensional problems and show that, for all three examples, the generative model approach outperforms sparsity promoting methods at small sample sizes.

MLJan 29, 2022
A Priori Denoising Strategies for Sparse Identification of Nonlinear Dynamical Systems: A Comparative Study

Alexandre Cortiella, Kwang-Chun Park, Alireza Doostan

In recent years, identification of nonlinear dynamical systems from data has become increasingly popular. Sparse regression approaches, such as Sparse Identification of Nonlinear Dynamics (SINDy), fostered the development of novel governing equation identification algorithms assuming the state variables are known a priori and the governing equations lend themselves to sparse, linear expansions in a (nonlinear) basis of the state variables. In the context of the identification of governing equations of nonlinear dynamical systems, one faces the problem of identifiability of model parameters when state measurements are corrupted by noise. Measurement noise affects the stability of the recovery process yielding incorrect sparsity patterns and inaccurate estimation of coefficients of the governing equations. In this work, we investigate and compare the performance of several local and global smoothing techniques to a priori denoise the state measurements and numerically estimate the state time-derivatives to improve the accuracy and robustness of two sparse regression methods to recover governing equations: Sequentially Thresholded Least Squares (STLS) and Weighted Basis Pursuit Denoising (WBPDN) algorithms. We empirically show that, in general, global methods, which use the entire measurement data set, outperform local methods, which employ a neighboring data subset around a local point. We additionally compare Generalized Cross Validation (GCV) and Pareto curve criteria as model selection techniques to automatically estimate near optimal tuning parameters, and conclude that Pareto curves yield better results. The performance of the denoising strategies and sparse regression methods is empirically evaluated through well-known benchmark problems of nonlinear dynamical systems.
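
As a rough illustration of the Sequentially Thresholded Least Squares (STLS) step named in this abstract, the following is a minimal sketch of the generic algorithm. The function name `stls`, the threshold of 0.1, and the fixed iteration count are assumptions for illustration; the denoising and model selection procedures compared in the paper are not reproduced here.

```python
import numpy as np

def stls(Theta, dXdt, threshold=0.1, n_iter=10):
    """Sequentially thresholded least squares (generic sketch).

    Theta : (n_samples, n_basis) library of candidate basis functions
    dXdt  : (n_samples, n_states) state time-derivatives
    Returns Xi with shape (n_basis, n_states), so that dXdt ~ Theta @ Xi.
    """
    Xi = np.linalg.lstsq(Theta, dXdt, rcond=None)[0]
    for _ in range(n_iter):
        small = np.abs(Xi) < threshold
        Xi[small] = 0.0  # prune coefficients below the threshold
        for k in range(dXdt.shape[1]):
            big = ~small[:, k]
            if big.any():
                # refit only the surviving basis functions for state k
                Xi[big, k] = np.linalg.lstsq(
                    Theta[:, big], dXdt[:, k], rcond=None
                )[0]
    return Xi
```

For example, with noise-free samples of dx/dt = -2x and the library [1, x, x^2], the routine keeps only the coefficient on x. With noisy data the result depends strongly on the threshold and on how the derivatives are estimated, which is the setting the abstract addresses.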

MLMay 27, 2021
Neural Network Training Using $\ell_1$-Regularization and Bi-fidelity Data

Subhayan De, Alireza Doostan

With the capability of accurately representing a functional relationship between the inputs of a physical system's model and output quantities of interest, neural networks have become popular for surrogate modeling in scientific applications. However, as these networks are over-parameterized, their training often requires a large amount of data. To prevent overfitting and improve generalization error, regularization based on, e.g., $\ell_1$- and $\ell_2$-norms of the parameters is applied. Similarly, multiple connections of the network may be pruned to increase sparsity in the network parameters. In this paper, we explore the effects of sparsity promoting $\ell_1$-regularization on training neural networks when only a small training dataset from a high-fidelity model is available. As opposed to standard $\ell_1$-regularization that is known to be inadequate, we consider two variants of $\ell_1$-regularization informed by the parameters of an identical network trained using data from lower-fidelity models of the problem at hand. These bi-fidelity strategies are generalizations of transfer learning of neural networks that uses the parameters learned from a large low-fidelity dataset to efficiently train networks for a small high-fidelity dataset. We also compare the bi-fidelity strategies with two $\ell_1$-regularization methods that only use the high-fidelity dataset. Three numerical examples for propagating uncertainty through physical systems are used to show that the proposed bi-fidelity $\ell_1$-regularization strategies produce errors that are one order of magnitude smaller than those of networks trained only using datasets from the high-fidelity models.

SPMar 30, 2021
Prediction of Ultrasonic Guided Wave Propagation in Solid-fluid and their Interface under Uncertainty using Machine Learning

Subhayan De, Bhuiyan Shameem Mahmood Ebna Hai, Alireza Doostan et al.

Structural health monitoring (SHM) systems use the non-destructive testing principle for damage identification. As part of SHM, the propagation of ultrasonic guided waves (UGWs) is tracked and analyzed for the changes in the associated wave pattern. These changes help identify the location of structural damage, if any. We advance existing research by accounting for uncertainty in the material and geometric properties of a structure. The physics model used in this study comprises a monolithically coupled system of acoustic and elastic wave equations, known as the wave propagation in fluid-solid and their interface (WpFSI) problem. As the UGWs propagate in the solid, fluid, and their interface, the wave signal displacement measurements are contrasted against the benchmark pattern. For the numerical solution, we develop an efficient algorithm that successfully addresses the inherent complexity of solving the multiphysics problem under uncertainty. We present a procedure that uses Gaussian process regression and a convolutional neural network for predicting the UGW propagation in a solid-fluid and their interface under uncertainty. First, a set of training images for different realizations of the uncertain parameters of the inclusion inside the structure is generated using a monolithically coupled system of acoustic and elastic wave equations. Next, Gaussian processes trained with these images are used for predicting the propagated wave, with convolutional neural networks for further enhancement to produce high-quality images of the wave patterns for new realizations of the uncertainty. The results indicate that the proposed approach provides an accurate prediction for the WpFSI problem in the presence of uncertainty.

MLMay 27, 2020
Sparse Identification of Nonlinear Dynamical Systems via Reweighted $\ell_1$-regularized Least Squares

Alexandre Cortiella, Kwang-Chun Park, Alireza Doostan

This work proposes an iterative sparse-regularized regression method to recover governing equations of nonlinear dynamical systems from noisy state measurements. The method is inspired by the Sparse Identification of Nonlinear Dynamics (SINDy) approach of [Brunton et al., PNAS, 113 (15) (2016) 3932-3937], which relies on two main assumptions: the state variables are known a priori and the governing equations lend themselves to sparse, linear expansions in a (nonlinear) basis of the state variables. The aim of this work is to improve the accuracy and robustness of SINDy in the presence of state measurement noise. To this end, a reweighted $\ell_1$-regularized least squares solver is developed, wherein the regularization parameter is selected from the corner point of a Pareto curve. The idea behind using weighted $\ell_1$-norm for regularization -- instead of the standard $\ell_1$-norm -- is to better promote sparsity in the recovery of the governing equations and, in turn, mitigate the effect of noise in the state variables. We also present a method to recover single physical constraints from state measurements. Through several examples of well-known nonlinear dynamical systems, we demonstrate empirically the accuracy and robustness of the reweighted $\ell_1$-regularized least squares strategy with respect to state measurement noise, thus illustrating its viability for a wide range of potential applications.

MLFeb 11, 2020
On transfer learning of neural networks using bi-fidelity data for uncertainty propagation

Subhayan De, Jolene Britton, Matthew Reynolds et al.

Due to their high degree of expressiveness, neural networks have recently been used as surrogate models for mapping inputs of an engineering system to outputs of interest. Once trained, neural networks are computationally inexpensive to evaluate and remove the need for repeated evaluations of computationally expensive models in uncertainty quantification applications. However, given the highly parameterized construction of neural networks, especially deep neural networks, accurate training often requires large amounts of simulation data that may not be available in the case of computationally expensive systems. In this paper, to alleviate this issue for uncertainty propagation, we explore the application of transfer learning techniques using training data generated from both high- and low-fidelity models. We explore two strategies for coupling these two datasets during the training procedure, namely, the standard transfer learning and the bi-fidelity weighted learning. In the former approach, a neural network model mapping the inputs to the outputs of interest is trained based on the low-fidelity data. The high-fidelity data is then used to adapt the parameters of the upper layer(s) of the low-fidelity network, or train a simpler neural network to map the output of the low-fidelity network to that of the high-fidelity model. In the latter approach, the entire low-fidelity network parameters are updated using data generated via a Gaussian process model trained with a small high-fidelity dataset. The parameter updates are performed via a variant of stochastic gradient descent with learning rates given by the Gaussian process model. Using three numerical examples, we illustrate the utility of these bi-fidelity transfer learning methods where we focus on accuracy improvement achieved by transfer learning over standard training approaches.

NAApr 13, 2019
Physics-Informed Tolerance Allocation: A Surrogate-Based Framework for the Control of Geometric Variation on System Performance

Joseph Benzaken, Alireza Doostan, John A. Evans

In this paper, we present a novel tolerance allocation algorithm for the assessment and control of geometric variation on system performance that is applicable to any system of partial differential equations. In particular, we parameterize the geometric domain of the system in terms of design parameters and subsequently measure the effect of design parameter variation on system performance. A surrogate model via a tensor representation is constructed to map the design parameter variation to the system performance. A set of optimization problems over this surrogate model restricted to nested hyperrectangles represents the effect of prescribing design tolerances, where the maximizer of this restricted function depicts the worst-case member, i.e. the worst-case design. Moreover, the loci of these tolerance hyperrectangles with maximizers attaining, but not surpassing, the performance constraint represents the boundary to the feasible region of allocatable tolerances. Every tolerance in this domain is measured through a user-specified, weighted norm which is informed by design considerations such as cost and manufacturability. The boundary of the feasible set is elucidated as an immersed manifold of codimension one, over which a suite of optimization routines exist and are employed to efficiently determine an optimal feasible tolerance with respect to the specified measure. Examples of this algorithm are presented with applications to a plate with a hole described by two design parameters, a plate with a hole described by six design parameters, and an L-Bracket described by seventeen design parameters.

NASep 12, 2017
Parametric/Stochastic Model Reduction: Low-Rank Representation, Non-Intrusive Bi-Fidelity Approximation, and Convergence Analysis

Jerrad Hampton, Hillary Fairbanks, Akil Narayan et al.

For practical model-based demands, such as design space exploration and uncertainty quantification (UQ), a high-fidelity model that produces accurate outputs often has high computational cost, while a low-fidelity model with less accurate outputs has low computational cost. It is often possible to construct a bi-fidelity model having accuracy comparable with the high-fidelity model and computational cost comparable with the low-fidelity model. This work presents the construction and analysis of a non-intrusive (i.e., sample-based) bi-fidelity model that relies on the low-rank structure of the map between model parameters/uncertain inputs and the solution of interest, if it exists. Specifically, we derive a novel, pragmatic estimate for the error committed by this bi-fidelity model. We show that this error bound can be used to determine if a given pair of low- and high-fidelity models will lead to an accurate bi-fidelity approximation. The cost of this error bound is relatively small and depends on the solution rank. The value of this error estimate is demonstrated using two example problems in the context of UQ, involving linear and non-linear partial differential equations.
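
To make the idea of a sample-based, low-rank bi-fidelity model concrete, here is a minimal sketch of one common construction; it is not claimed to be the exact procedure or error estimate of the paper. It assumes the low-fidelity snapshots are stored as columns of a matrix `L` (one column per parameter sample), picks a few representative columns with a greedy pivoted Gram-Schmidt, and reuses the resulting interpolation coefficients on the high-fidelity snapshots at the same samples. The names `select_columns` and `bifidelity_approx` are hypothetical.

```python
import numpy as np

def select_columns(L, r):
    """Greedily pick r columns of L via column-pivoted Gram-Schmidt."""
    R = np.array(L, dtype=float)  # working copy holding the residual
    idx = []
    for _ in range(r):
        norms = np.linalg.norm(R, axis=0)
        j = int(np.argmax(norms))   # column with the largest residual norm
        idx.append(j)
        q = R[:, j] / norms[j]
        R -= np.outer(q, q @ R)     # remove the component along q
    return idx

def bifidelity_approx(L, H_selected, idx):
    """Approximate all high-fidelity columns from the selected ones.

    L          : (m_low, n) low-fidelity snapshots for all n samples
    H_selected : (m_high, r) high-fidelity snapshots at samples `idx`
    The coefficients C satisfy L ~ L[:, idx] @ C in the least-squares
    sense and are then applied to the high-fidelity columns.
    """
    C = np.linalg.lstsq(L[:, idx], L, rcond=None)[0]
    return H_selected @ C
```

Only `r` high-fidelity solves are needed in this sketch. The approximation is exact when both models share the same rank-`r` dependence on the parameters, and degrades otherwise, which is the situation an error estimate of the kind described in the abstract is meant to detect.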