NAJun 21, 2022
Derivative-Informed Neural Operator: An Efficient Framework for High-Dimensional Parametric Derivative LearningThomas O'Leary-Roseberry, Peng Chen, Umberto Villa et al.
We propose derivative-informed neural operators (DINOs), a general family of neural networks to approximate operators as infinite-dimensional mappings from input function spaces to output function spaces or quantities of interest. After discretizations both inputs and outputs are high-dimensional. We aim to approximate not only the operators with improved accuracy but also their derivatives (Jacobians) with respect to the input function-valued parameter to empower derivative-based algorithms in many applications, e.g., Bayesian inverse problems, optimization under parameter uncertainty, and optimal experimental design. The major difficulties include the computational cost of generating derivative training data and the high dimensionality of the problem leading to large training cost. To address these challenges, we exploit the intrinsic low-dimensionality of the derivatives and develop algorithms for compressing derivative information and efficiently imposing it in neural operator training yielding derivative-informed neural operators. We demonstrate that these advances can significantly reduce the costs of both data generation and training for large classes of problems (e.g., nonlinear steady state parametric PDE maps), making the costs marginal or comparable to the costs without using derivatives, and in particular independent of the discretization dimension of the input and output functions. Moreover, we show that the proposed DINO achieves significantly higher accuracy than neural operators trained without derivative information, for both function approximation and derivative approximation (e.g., Gauss-Newton Hessian), especially when the training data are limited.
NAMar 24, 2017
A Multilevel, Hierarchical Sampling Technique for Spatially Correlated Random FieldsSarah Osborn, Panayot Vassilevski, Umberto Villa
We propose an alternative method to generate samples of a spatially correlated random field with applications to large-scale problems for forward propagation of uncertainty. A classical approach for generating these samples is the Karhunen-Loève (KL) decomposition. However, the KL expansion requires solving a dense eigenvalue problem and is therefore computationally infeasible for large-scale problems. Sampling methods based on stochastic partial differential equations provide a highly scalable way to sample Gaussian fields, but the resulting parametrization is mesh dependent. We propose a multilevel decomposition of the stochastic field to allow for scalable, hierarchical sampling based on solving a mixed finite element formulation of a stochastic reaction-diffusion equation with a random, white noise source function. Numerical experiments are presented to demonstrate the scalability of the sampling method as well as numerical results of multilevel Monte Carlo simulations for a subsurface porous media flow application using the proposed sampling method.
SPApr 2, 2023
Ideal Observer Computation by Use of Markov-Chain Monte Carlo with Generative Adversarial NetworksWeimin Zhou, Umberto Villa, Mark A. Anastasio
Medical imaging systems are often evaluated and optimized via objective, or task-specific, measures of image quality (IQ) that quantify the performance of an observer on a specific clinically-relevant task. The performance of the Bayesian Ideal Observer (IO) sets an upper limit among all observers, numerical or human, and has been advocated for use as a figure-of-merit (FOM) for evaluating and optimizing medical imaging systems. However, the IO test statistic corresponds to the likelihood ratio that is intractable to compute in the majority of cases. A sampling-based method that employs Markov-Chain Monte Carlo (MCMC) techniques was previously proposed to estimate the IO performance. However, current applications of MCMC methods for IO approximation have been limited to a small number of situations where the considered distribution of to-be-imaged objects can be described by a relatively simple stochastic object model (SOM). As such, there remains an important need to extend the domain of applicability of MCMC methods to address a large variety of scenarios where IO-based assessments are needed but the associated SOMs have not been available. In this study, a novel MCMC method that employs a generative adversarial network (GAN)-based SOM, referred to as MCMC-GAN, is described and evaluated. The MCMC-GAN method was quantitatively validated by use of test-cases for which reference solutions were available. The results demonstrate that the MCMC-GAN method can extend the domain of applicability of MCMC methods for conducting IO analyses of medical imaging systems.
NADec 19, 2017
Scalable hierarchical PDE sampler for generating spatially correlated random fields using non-matching meshesSarah Osborn, Patrick Zulian, Thomas Benson et al.
This work describes a domain embedding technique between two non-matching meshes used for generating realizations of spatially correlated random fields with applications to large-scale sampling-based uncertainty quantification. The goal is to apply the multilevel Monte Carlo (MLMC) method for the quantification of output uncertainties of PDEs with random input coefficients on general, unstructured computational domains. We propose a highly scalable, hierarchical sampling method to generate realizations of a Gaussian random field on a given unstructured mesh by solving a reaction-diffusion PDE with a stochastic right-hand side. The stochastic PDE is discretized using the mixed finite element method on an embedded domain with a structured mesh, and then the solution is projected onto the unstructured mesh. This work describes implementation details on how to efficiently transfer data from the structured and unstructured meshes at coarse levels, assuming this can be done efficiently on the finest level. We investigate the efficiency and parallel scalability of the technique for the scalable generation of Gaussian random fields in three dimensions. An application of the MLMC method is presented for quantifying uncertainties of subsurface flow problems. We demonstrate the scalability of the sampling method with non-matching mesh embedding, coupled with a parallel forward model problem solver, for large-scale 3D MLMC simulations with up to $1.9\cdot 10^9$ unknowns.
48.8LGApr 16
Survey of Deep Learning and Physics-Based Approaches in Computational Wave ImagingYouzuo Lin, Shihang Feng, James Theiler et al.
Computational wave imaging (CWI) extracts hidden structure and physical properties of a volume of material by analyzing wave signals that traverse that volume. Applications include seismic exploration of the Earth's subsurface, acoustic imaging and non-destructive testing in material science, and ultrasound computed tomography in medicine. Current approaches for solving CWI problems can be divided into two categories: those rooted in traditional physics, and those based on deep learning. Physics-based methods stand out for their ability to provide high-resolution and quantitatively accurate estimates of acoustic properties within the medium. However, they can be computationally intensive and are susceptible to ill-posedness and nonconvexity typical of CWI problems. Machine learning-based computational methods have recently emerged, offering a different perspective to address these challenges. Diverse scientific communities have independently pursued the integration of deep learning in CWI. This review discusses how contemporary scientific machine-learning (ML) techniques, and deep neural networks in particular, have been developed to enhance and integrate with traditional physics-based methods for solving CWI problems. We present a structured framework that consolidates existing research spanning multiple domains, including computational imaging, wave physics, and data science. This study concludes with important lessons learned from existing ML-based methods and identifies technical hurdles and emerging trends through a systematic analysis of the extensive literature on this topic.
MED-PHJan 23Code
Benchmarking Deep Learning-Based Reconstruction Methods for Photoacoustic Computed Tomography with Clinically Relevant Synthetic DatasetsPanpan Chen, Seonyeong Park, Gangwon Jeong et al.
Deep learning (DL)-based image reconstruction methods for photoacoustic computed tomography (PACT) have developed rapidly in recent years. However, most existing methods have not employed standardized datasets, and their evaluations rely on traditional image quality (IQ) metrics that may lack clinical relevance. The absence of a standardized framework for clinically meaningful IQ assessment hinders fair comparison and raises concerns about the reproducibility and reliability of reported advancements in PACT. A benchmarking framework is proposed that provides open-source, anatomically plausible synthetic datasets and evaluation strategies for DL-based acoustic inversion methods in PACT. The datasets each include over 11,000 two-dimensional (2D) stochastic breast objects with clinically relevant lesions and paired measurements at varying modeling complexity. The evaluation strategies incorporate both traditional and task-based IQ measures to assess fidelity and clinical utility. A preliminary benchmarking study is conducted to demonstrate the framework's utility by comparing DL-based and physics-based reconstruction methods. The benchmarking study demonstrated that the proposed framework enabled comprehensive, quantitative comparisons of reconstruction performance and revealed important limitations in certain DL-based methods. Although they performed well according to traditional IQ measures, they often failed to accurately recover lesions. This highlights the inadequacy of traditional metrics and motivates the need for task-based assessments. The proposed benchmarking framework enables systematic comparisons of DL-based acoustic inversion methods for 2D PACT. By integrating clinically relevant synthetic datasets with rigorous evaluation protocols, it enables reproducible, objective assessments and facilitates method development and system optimization in PACT.
NADec 5, 2017
An UQ-ready finite element solver for a two-dimensional RANS model of free plane jetsUmberto Villa, Alexandre Noll Marques
Numerical solution of the system of partial differential equations arising from the Reynolds-Averaged Navier-Stokes (RANS) equations with $k-ε$ turbulence model presents several challenges due to the advection dominated nature of the problem and the presence of highly nonlinear reaction terms. State-of-the-art software for the numerical solution of the RANS equations address these challenges by introducing non-differentiable perturbations in the model to ensure numerical stability. However, this approach leads to difficulties in the formulation of the higher-order forward/adjoint problems, which are needed for scalable Hessian-based uncertainty quantification (UQ) methods. In this note, we present the construction of a UQ-ready flow solver, i.e., one that is smoothly differentiable and provides not only forward solver capabilities but also adjoint and higher derivatives capabilities.
LGJan 28
A Learning-based Framework for Spatial Impulse Response Compensation in 3D Photoacoustic Computed TomographyKaiyi Yang, Seonyeong Park, Gangwon Jeong et al.
Photoacoustic computed tomography (PACT) is a promising imaging modality that combines the advantages of optical contrast with ultrasound detection. Utilizing ultrasound transducers with larger surface areas can improve detection sensitivity. However, when computationally efficient analytic reconstruction methods that neglect the spatial impulse responses (SIRs) of the transducer are employed, the spatial resolution of the reconstructed images will be compromised. Although optimization-based reconstruction methods can explicitly account for SIR effects, their computational cost is generally high, particularly in three-dimensional (3D) applications. To address the need for accurate but rapid 3D PACT image reconstruction, this study presents a framework for establishing a learned SIR compensation method that operates in the data domain. The learned compensation method maps SIR-corrupted PACT measurement data to compensated data that would have been recorded by idealized point-like transducers. Subsequently, the compensated data can be used with a computationally efficient reconstruction method that neglects SIR effects. Two variants of the learned compensation model are investigated that employ a U-Net model and a specifically designed, physics-inspired model, referred to as Deconv-Net. A fast and analytical training data generation procedure is also a component of the presented framework. The framework is rigorously validated in virtual imaging studies, demonstrating resolution improvement and robustness to noise variations, object complexity, and sound speed heterogeneity. When applied to in-vivo breast imaging data, the learned compensation models revealed fine structures that had been obscured by SIR-induced artifacts. To our knowledge, this is the first demonstration of learned SIR compensation in 3D PACT imaging.
OCMar 4, 2025
Benchmarking Diffusion Annealing-Based Bayesian Inverse Problem SolversEvan Scope Crafts, Umberto Villa
In recent years, the ascendance of diffusion modeling as a state-of-the-art generative modeling approach has spurred significant interest in their use as priors in Bayesian inverse problems. However, it is unclear how to optimally integrate a diffusion model trained on the prior distribution with a given likelihood function to obtain posterior samples. While algorithms developed for this purpose can produce high-quality, diverse point estimates of the unknown parameters of interest, they are often tested on problems where the prior distribution is analytically unknown, making it difficult to assess their performance in providing rigorous uncertainty quantification. Motivated by this challenge, this work introduces three benchmark problems for evaluating the performance of diffusion model based samplers. The benchmark problems, which are inspired by problems in image inpainting, x-ray tomography, and phase retrieval, have a posterior density that is analytically known. In this setting, approximate ground-truth posterior samples can be obtained, enabling principled evaluation of the performance of posterior sampling algorithms. This work also introduces a general framework for diffusion model based posterior sampling, Bayesian Inverse Problem Solvers through Diffusion Annealing (BIPSDA). This framework unifies several recently proposed diffusion-model-based posterior sampling algorithms and contains novel algorithms that can be realized through flexible combinations of design choices. We tested the performance of a set of BIPSDA algorithms, including previously proposed state-of-the-art approaches, on the proposed benchmark problems. The results provide insight into the strengths and limitations of existing diffusion-model based posterior samplers, while the benchmark problems provide a testing ground for future algorithmic developments.
MED-PHOct 3, 2025
Application of a Virtual Imaging Framework for Investigating a Deep Learning-Based Reconstruction Method for 3D Quantitative Photoacoustic Computed TomographyRefik Mert Cam, Seonyeong Park, Umberto Villa et al.
Quantitative photoacoustic computed tomography (qPACT) is a promising imaging modality for estimating physiological parameters such as blood oxygen saturation. However, developing robust qPACT reconstruction methods remains challenging due to computational demands, modeling difficulties, and experimental uncertainties. Learning-based methods have been proposed to address these issues but remain largely unvalidated. Virtual imaging (VI) studies are essential for validating such methods early in development, before proceeding to less-controlled phantom or in vivo studies. Effective VI studies must employ ensembles of stochastically generated numerical phantoms that accurately reflect relevant anatomy and physiology. Yet, most prior VI studies for qPACT relied on overly simplified phantoms. In this work, a realistic VI testbed is employed for the first time to assess a representative 3D learning-based qPACT reconstruction method for breast imaging. The method is evaluated across subject variability and physical factors such as measurement noise and acoustic aberrations, offering insights into its strengths and limitations.
MED-PHDec 2, 2024
Learning a Filtered Backprojection Reconstruction Method for Photoacoustic Computed Tomography with Hemispherical Measurement GeometriesPanpan Chen, Seonyeong Park, Refik Mert Cam et al.
In certain three-dimensional (3D) applications of photoacoustic computed tomography (PACT), including \textit{in vivo} breast imaging, hemispherical measurement apertures that enclose the object within their convex hull are employed for data acquisition. Data acquired with such measurement geometries are referred to as \textit{half-scan} data, as only half of a complete spherical measurement aperture is employed. Although previous studies have demonstrated that half-scan data can uniquely and stably reconstruct the sought-after object, no closed-form reconstruction formula for use with half-scan data has been reported. To address this, a semi-analytic reconstruction method in the form of filtered backprojection (FBP), referred to as the half-scan FBP method, is developed in this work. Because the explicit form of the filtering operation in the half-scan FBP method is not currently known, a learning-based method is proposed to approximate it. The proposed method is systematically investigated by use of virtual imaging studies of 3D breast PACT that employ ensembles of numerical breast phantoms and a physics-based model of the data acquisition process. The method is subsequently applied to experimental data acquired in an \textit{in vivo} breast PACT study. The results confirm that the half-scan FBP method can accurately reconstruct 3D images from half-scan data. Importantly, because the sought-after inverse mapping is well-posed, the reconstruction method remains accurate even when applied to data that differ considerably from those employed to learn the filtering operation.
IVFeb 10, 2022
Mining the manifolds of deep generative models for multiple data-consistent solutions of ill-posed tomographic imaging problemsSayantan Bhadra, Umberto Villa, Mark A. Anastasio
Tomographic imaging is in general an ill-posed inverse problem. Typically, a single regularized image estimate of the sought-after object is obtained from tomographic measurements. However, there may be multiple objects that are all consistent with the same measurement data. The ability to generate such alternate solutions is important because it may enable new assessments of imaging systems. In principle, this can be achieved by means of posterior sampling methods. In recent years, deep neural networks have been employed for posterior sampling with promising results. However, such methods are not yet for use with large-scale tomographic imaging applications. On the other hand, empirical sampling methods may be computationally feasible for large-scale imaging systems and enable uncertainty quantification for practical applications. Empirical sampling involves solving a regularized inverse problem within a stochastic optimization framework to obtain alternate data-consistent solutions. In this work, a new empirical sampling method is proposed that computes multiple solutions of a tomographic inverse problem that are consistent with the same acquired measurement data. The method operates by repeatedly solving an optimization problem in the latent space of a style-based generative adversarial network (StyleGAN), and was inspired by the Photo Upsampling via Latent Space Exploration (PULSE) method that was developed for super-resolution tasks. The proposed method is demonstrated and analyzed via numerical studies that involve two stylized tomographic imaging modalities. These studies establish the ability of the method to perform efficient empirical sampling and uncertainty quantification.
NANov 30, 2020
Derivative-Informed Projected Neural Networks for High-Dimensional Parametric Maps Governed by PDEsThomas O'Leary-Roseberry, Umberto Villa, Peng Chen et al.
Many-query problems, arising from uncertainty quantification, Bayesian inversion, Bayesian optimal experimental design, and optimization under uncertainty-require numerous evaluations of a parameter-to-output map. These evaluations become prohibitive if this parametric map is high-dimensional and involves expensive solution of partial differential equations (PDEs). To tackle this challenge, we propose to construct surrogates for high-dimensional PDE-governed parametric maps in the form of projected neural networks that parsimoniously capture the geometry and intrinsic low-dimensionality of these maps. Specifically, we compute Jacobians of these PDE-based maps, and project the high-dimensional parameters onto a low-dimensional derivative-informed active subspace; we also project the possibly high-dimensional outputs onto their principal subspace. This exploits the fact that many high-dimensional PDE-governed parametric maps can be well-approximated in low-dimensional parameter and output subspace. We use the projection basis vectors in the active subspace as well as the principal output subspace to construct the weights for the first and last layers of the neural network, respectively. This frees us to train the weights in only the low-dimensional layers of the neural network. The architecture of the resulting neural network captures to first order, the low-dimensional structure and geometry of the parametric map. We demonstrate that the proposed projected neural network achieves greater generalization accuracy than a full neural network, especially in the limited training data regime afforded by expensive PDE-based parametric maps. Moreover, we show that the number of degrees of freedom of the inner layers of the projected network is independent of the parameter and output dimensions, and high accuracy can be achieved with weight dimension independent of the discretization dimension.
COMay 7, 2019
Multifidelity probability estimation via fusion of estimatorsBoris Kramer, Alexandre Noll Marques, Benjamin Peherstorfer et al.
This paper develops a multifidelity method that enables estimation of failure probabilities for expensive-to-evaluate models via information fusion and importance sampling. The presented general fusion method combines multiple probability estimators with the goal of variance reduction. We use low-fidelity models to derive biasing densities for importance sampling and then fuse the importance sampling estimators such that the fused multifidelity estimator is unbiased and has mean-squared error lower than or equal to that of any of the importance sampling estimators alone. By fusing all available estimators, the method circumvents the challenging problem of selecting the best biasing density and using only that density for sampling. A rigorous analysis shows that the fused estimator is optimal in the sense that it has minimal variance amongst all possible combinations of the estimators. The asymptotic behavior of the proposed method is demonstrated on a convection-diffusion-reaction partial differential equation model for which $10^5$ samples can be afforded. To illustrate the proposed method at scale, we consider a model of a free plane jet and quantify how uncertainties at the flow inlet propagate to a quantity of interest related to turbulent mixing. Compared to an importance sampling estimator that uses the high-fidelity model alone, our multifidelity estimator reduces the required CPU time by 65\% while achieving a similar coefficient of variation.
NAAug 2, 2017
A data scalable augmented Lagrangian KKT preconditioner for large scale inverse problemsNick Alger, Umberto Villa, Tan Bui-Thanh et al.
Current state of the art preconditioners for the reduced Hessian and the Karush-Kuhn-Tucker (KKT) operator for large scale inverse problems are typically based on approximating the reduced Hessian with the regularization operator. However, the quality of this approximation degrades with increasingly informative observations or data. Thus the best case scenario from a scientific standpoint (fully informative data) is the worse case scenario from a computational perspective. In this paper we present an augmented Lagrangian-type preconditioner based on a block diagonal approximation of the augmented upper left block of the KKT operator. The preconditioner requires solvers for two linear subproblems that arise in the augmented KKT operator, which we expect to be much easier to precondition than the reduced Hessian. Analysis of the spectrum of the preconditioned KKT operator indicates that the preconditioner is effective when the regularization is chosen appropriately. In particular, it is effective when the regularization does not over-penalize highly informed parameter modes and does not under-penalize uninformed modes. Finally, we present a numerical study for a large data/low noise Poisson source inversion problem, demonstrating the effectiveness of the preconditioner. In this example, three MINRES iterations on the KKT system with our preconditioner results in a reconstruction with better accuracy than 50 iterations of CG on the reduced Hessian system with regularization preconditioning.
NAJun 20, 2017
Hessian-based adaptive sparse quadrature for infinite-dimensional Bayesian inverse problemsPeng Chen, Umberto Villa, Omar Ghattas
In this work we propose and analyze a Hessian-based adaptive sparse quadrature to compute infinite-dimensional integrals with respect to the posterior distribution in the context of Bayesian inverse problems with Gaussian prior. Due to the concentration of the posterior distribution in the domain of the prior distribution, a prior-based parametrization and sparse quadrature may fail to capture the posterior distribution and lead to erroneous evaluation results. By using a parametrization based on the Hessian of the negative log-posterior, the adaptive sparse quadrature can effectively allocate the quadrature points according to the posterior distribution. A dimension-independent convergence rate of the proposed method is established under certain assumptions on the Gaussian prior and the integrands. Dimension-independent and faster convergence than $O(N^{-1/2})$ is demonstrated for a linear as well as a nonlinear inverse problem whose posterior distribution can be effectively approximated by a Gaussian distribution at the MAP point.