Nicholas Zabaras

h-index52

17papers

2,808citations

Novelty61%

AI Score37

Ranked #94,228 of 194,257 authors (top 49%)#1,311 in ML (top 39%)

17 Papers

3.1LGOct 24, 2021

Deep Learning for Simultaneous Inference of Hydraulic and Transport Properties

Zitong Zhou, Nicholas Zabaras, Daniel M. Tartakovsky

Identifying the heterogeneous conductivity field and reconstructing the contaminant release history are key aspects of subsurface remediation. Achieving these two goals with limited and noisy hydraulic head and concentration measurements is challenging. The obstacles include solving an inverse problem for high-dimensional parameters, and the high-computational cost needed for the repeated forward modeling. We use a convolutional adversarial autoencoder (CAAE) for the parameterization of the heterogeneous non-Gaussian conductivity field with a low-dimensional latent representation. Additionally, we trained a three-dimensional dense convolutional encoder-decoder (DenseED) network to serve as the forward surrogate for the flow and transport processes. Combining the CAAE and DenseED forward surrogate models, the ensemble smoother with multiple data assimilation (ESMDA) algorithm is used to sample from the Bayesian posterior distribution of the unknown parameters, forming a CAAE-DenseED-ESMDA inversion framework. We applied this CAAE-DenseED-ESMDA inversion framework in a three-dimensional contaminant source and conductivity field identification problem. A comparison of the inversion results from CAAE-ESMDA with physical flow and transport simulator and CAAE-DenseED-ESMDA is provided, showing that accurate reconstruction results were achieved with a much higher computational efficiency.

6.6SPAug 17, 2021

Inverse Aerodynamic Design of Gas Turbine Blades using Probabilistic Machine Learning

Sayan Ghosh, Govinda A. Padmanabha, Cheng Peng et al.

One of the critical components in Industrial Gas Turbines (IGT) is the turbine blade. Design of turbine blades needs to consider multiple aspects like aerodynamic efficiency, durability, safety and manufacturing, which make the design process sequential and iterative.The sequential nature of these iterations forces a long design cycle time, ranging from several months to years. Due to the reactionary nature of these iterations, little effort has been made to accumulate data in a manner that allows for deep exploration and understanding of the total design space. This is exemplified in the process of designing the individual components of the IGT resulting in a potential unrealized efficiency. To overcome the aforementioned challenges, we demonstrate a probabilistic inverse design machine learning framework (PMI), to carry out an explicit inverse design. PMI calculates the design explicitly without excessive costly iteration and overcomes the challenges associated with ill-posed inverse problems. In this work, the framework will be demonstrated on inverse aerodynamic design of three-dimensional turbine blades.

12.5MLFeb 4, 2021

Bayesian multiscale deep generative model for the solution of high-dimensional inverse problems

Yingzhi Xia, Nicholas Zabaras

Estimation of spatially-varying parameters for computationally expensive forward models governed by partial differential equations is addressed. A novel multiscale Bayesian inference approach is introduced based on deep probabilistic generative models. Such generative models provide a flexible representation by inferring on each scale a low-dimensional latent encoding while allowing hierarchical parameter generation from coarse- to fine-scales. Combining the multiscale generative model with Markov Chain Monte Carlo (MCMC), inference across scales is achieved enabling us to efficiently obtain posterior parameter samples at various scales. The estimation of coarse-scale parameters using a low-dimensional latent embedding captures global and notable parameter features using an inexpensive but inaccurate solver. MCMC sampling of the fine-scale parameters is enabled by utilizing the posterior information in the immediate coarser-scale. In this way, the global features are identified in the coarse-scale with inference of low-dimensional variables and inexpensive forward computation, and the local features are refined and corrected in the fine-scale. The developed method is demonstrated with two types of permeability estimation for flow in heterogeneous media. One is a Gaussian random field (GRF) with uncertain length scales, and the other is channelized permeability with the two regions defined by different GRFs. The obtained results indicate that the method allows high-dimensional parameter estimation while exhibiting stability, efficiency and accuracy.

25.7LGOct 4, 2020Code

Transformers for Modeling Physical Systems

Nicholas Geneva, Nicholas Zabaras

Transformers are widely used in natural language processing due to their ability to model longer-term dependencies in text. Although these models achieve state-of-the-art performance for many language related tasks, their applicability outside of the natural language processing field has been minimal. In this work, we propose the use of transformer models for the prediction of dynamical systems representative of physical phenomena. The use of Koopman based embeddings provide a unique and powerful method for projecting any dynamical system into a vector representation which can then be predicted by a transformer. The proposed model is able to accurately predict various dynamical systems and outperform classical methods that are commonly used in the scientific machine learning literature.

4.3CHEM-PHSep 29, 2020Code

Physics-Constrained Predictive Molecular Latent Space Discovery with Graph Scattering Variational Autoencoder

Navid Shervani-Tabar, Nicholas Zabaras

Recent advances in artificial intelligence have propelled the development of innovative computational materials modeling and design techniques. Generative deep learning models have been used for molecular representation, discovery, and design. In this work, we assess the predictive capabilities of a molecular generative model developed based on variational inference and graph theory in the small data regime. Physical constraints that encourage energetically stable molecules are proposed. The encoding network is based on the scattering transform with adaptive spectral filters to allow for better generalization of the model. The decoding network is a one-shot graph generative model that conditions atom types on molecular topology. A Bayesian formalism is considered to capture uncertainties in the predictive estimates of molecular properties. The model's performance is evaluated by generating molecules with desired target properties.

15.8COMP-PHJul 31, 2020Code

Solving inverse problems using conditional invertible neural networks

Govinda Anantha Padmanabha, Nicholas Zabaras

Inverse modeling for computing a high-dimensional spatially-varying property field from indirect sparse and noisy observations is a challenging problem. This is due to the complex physical system of interest often expressed in the form of multiscale PDEs, the high-dimensionality of the spatial property of interest, and the incomplete and noisy nature of observations. To address these challenges, we develop a model that maps the given observations to the unknown input field in the form of a surrogate model. This inverse surrogate model will then allow us to estimate the unknown input field for any given sparse and noisy output observations. Here, the inverse mapping is limited to a broad prior distribution of the input field with which the surrogate model is trained. In this work, we construct a two- and three-dimensional inverse surrogate models consisting of an invertible and a conditional neural network trained in an end-to-end fashion with limited training data. The invertible network is developed using a flow-based generative model. The developed inverse surrogate model is then applied for an inversion task of a multiphase flow problem where given the pressure and saturation observations the aim is to recover a high-dimensional non-Gaussian permeability field where the two facies consist of heterogeneous permeability and varying length-scales. For both the two- and three-dimensional surrogate models, the predicted sample realizations of the non-Gaussian permeability field are diverse with the predictive mean being close to the ground truth even when the model is trained with limited data.

2.3LGFeb 24, 2020

Embedded-physics machine learning for coarse-graining and collective variable discovery without data

Markus Schöberl, Nicholas Zabaras, Phaedon-Stelios Koutsourelakis

We present a novel learning framework that consistently embeds underlying physics while bypassing a significant drawback of most modern, data-driven coarse-grained approaches in the context of molecular dynamics (MD), i.e., the availability of big data. The generation of a sufficiently large training dataset poses a computationally demanding task, while complete coverage of the atomistic configuration space is not guaranteed. As a result, the explorative capabilities of data-driven coarse-grained models are limited and may yield biased "predictive" tools. We propose a novel objective based on reverse Kullback-Leibler divergence that fully incorporates the available physics in the form of the atomistic force field. Rather than separating model learning from the data-generation procedure - the latter relies on simulating atomistic motions governed by force fields - we query the atomistic force field at sample configurations proposed by the predictive coarse-grained model. Thus, learning relies on the evaluation of the force field but does not require any MD simulation. The resulting generative coarse-grained model serves as an efficient surrogate model for predicting atomistic configurations and estimating relevant observables. Beyond obtaining a predictive coarse-grained model, we demonstrate that in the discovered lower-dimensional representation, the collective variables (CVs) are related to physicochemical properties, which are essential for gaining understanding of unexplored complex systems. We demonstrate the algorithmic advances in terms of predictive ability and the physical meaning of the revealed CVs for a bimodal potential energy function and the alanine dipeptide.

13.4COMP-PHJun 26, 2019Code

Integration of adversarial autoencoders with residual dense convolutional networks for estimation of non-Gaussian hydraulic conductivities

Shaoxing Mo, Nicholas Zabaras, Xiaoqing Shi et al.

Inverse modeling for the estimation of non-Gaussian hydraulic conductivity fields in subsurface flow and solute transport models remains a challenging problem. This is mainly due to the non-Gaussian property, the non-linear physics, and the fact that many repeated evaluations of the forward model are often required. In this study, we develop a convolutional adversarial autoencoder (CAAE) to parameterize non-Gaussian conductivity fields with heterogeneous conductivity within each facies using a low-dimensional latent representation. In addition, a deep residual dense convolutional network (DRDCN) is proposed for surrogate modeling of forward models with high-dimensional and highly-complex mappings. The two networks are both based on a multilevel residual learning architecture called residual-in-residual dense block. The multilevel residual learning strategy and the dense connection structure ease the training of deep networks, enabling us to efficiently build deeper networks that have an essentially increased capacity for approximating mappings of very high-complexity. The CCAE and DRDCN networks are incorporated into an iterative ensemble smoother to formulate an inversion framework. The numerical experiments performed using 2-D and 3-D solute transport models illustrate the performance of the integrated method. The obtained results indicate that the CAAE is a robust parameterization method for non-Gaussian conductivity fields with different heterogeneity patterns. The DRDCN is able to obtain accurate approximations of the forward models with high-dimensional and highly-complex mappings using relatively limited training data. The CAAE and DRDCN methods together significantly reduce the computation time required to achieve accurate inversion results.

31.4COMP-PHJun 13, 2019Code

Modeling the Dynamics of PDE Systems with Physics-Constrained Deep Auto-Regressive Networks

Nicholas Geneva, Nicholas Zabaras

In recent years, deep learning has proven to be a viable methodology for surrogate modeling and uncertainty quantification for a vast number of physical systems. However, in their traditional form, such models can require a large amount of training data. This is of particular importance for various engineering and scientific applications where data may be extremely expensive to obtain. To overcome this shortcoming, physics-constrained deep learning provides a promising methodology as it only utilizes the governing equations. In this work, we propose a novel auto-regressive dense encoder-decoder convolutional neural network to solve and model non-linear dynamical systems without training data at a computational cost that is potentially magnitudes lower than standard numerical solvers. This model includes a Bayesian framework that allows for uncertainty quantification of the predicted quantities of interest at each time-step. We rigorously test this model on several non-linear transient partial differential equation systems including the turbulence of the Kuramoto-Sivashinsky equation, multi-shock formation and interaction with 1D Burgers' equation and 2D wave dynamics with coupled Burgers' equations. For each system, the predictive results and uncertainty are presented and discussed together with comparisons to the results obtained from traditional numerical analysis methods.

39.4COMP-PHJan 18, 2019Code

Physics-Constrained Deep Learning for High-dimensional Surrogate Modeling and Uncertainty Quantification without Labeled Data

Yinhao Zhu, Nicholas Zabaras, Phaedon-Stelios Koutsourelakis et al.

Surrogate modeling and uncertainty quantification tasks for PDE systems are most often considered as supervised learning problems where input and output data pairs are used for training. The construction of such emulators is by definition a small data problem which poses challenges to deep learning approaches that have been developed to operate in the big data regime. Even in cases where such models have been shown to have good predictive capability in high dimensions, they fail to address constraints in the data implied by the PDE model. This paper provides a methodology that incorporates the governing equations of the physical model in the loss/likelihood functions. The resulting physics-constrained, deep learning models are trained without any labeled data (e.g. employing only input data) and provide comparable predictive responses with data-driven models while obeying the constraints of the problem at hand. This work employs a convolutional encoder-decoder neural network approach as well as a conditional flow-based generative model for the solution of PDEs, surrogate model construction, and uncertainty quantification tasks. The methodology is posed as a minimization problem of the reverse Kullback-Leibler (KL) divergence between the model predictive density and the reference conditional density, where the later is defined as the Boltzmann-Gibbs distribution at a given inverse temperature with the underlying potential relating to the PDE system of interest. The generalization capability of these models to out-of-distribution input is considered. Quantification and interpretation of the predictive uncertainty is provided for a number of problems.

7.3MLSep 18, 2018Code

Predictive Collective Variable Discovery with Deep Bayesian Models

Markus Schöberl, Nicholas Zabaras, Phaedon-Stelios Koutsourelakis

Extending spatio-temporal scale limitations of models for complex atomistic systems considered in biochemistry and materials science necessitates the development of enhanced sampling methods. The potential acceleration in exploring the configurational space by enhanced sampling methods depends on the choice of collective variables (CVs). In this work, we formulate the discovery of CVs as a Bayesian inference problem and consider the CVs as hidden generators of the full-atomistic trajectory. The ability to generate samples of the fine-scale atomistic configurations using limited training data allows us to compute estimates of observables as well as our probabilistic confidence on them. The methodology is based on emerging methodological advances in machine learning and variational inference. The discovered CVs are related to physicochemical properties which are essential for understanding mechanisms especially in unexplored complex systems. We provide a quantitative assessment of the CVs in terms of their predictive ability for alanine dipeptide (ALA-2) and ALA-15 peptide.

7.8MLJul 11, 2018Code

Structured Bayesian Gaussian process latent variable model: applications to data-driven dimensionality reduction and high-dimensional inversion

Steven Atkinson, Nicholas Zabaras

We introduce a methodology for nonlinear inverse problems using a variational Bayesian approach where the unknown quantity is a spatial field. A structured Bayesian Gaussian process latent variable model is used both to construct a low-dimensional generative model of the sample-based stochastic prior as well as a surrogate for the forward evaluation. Its Bayesian formulation captures epistemic uncertainty introduced by the limited number of input and output examples, automatically selects an appropriate dimensionality for the learned latent representation of the data, and rigorously propagates the uncertainty of the data-driven dimensionality reduction of the stochastic space through the forward model surrogate. The structured Gaussian process model explicitly leverages spatial information for an informative generative prior to improve sample efficiency while achieving computational tractability through Kronecker product decompositions of the relevant kernel matrices. Importantly, the Bayesian inversion is carried out by solving a variational optimization problem, replacing traditional computationally-expensive Monte Carlo sampling. The methodology is demonstrated on an elliptic PDE and is shown to return well-calibrated posteriors and is tractable with latent spaces with over 100 dimensions.

15.8FLU-DYNJul 8, 2018Code

Quantifying model form uncertainty in Reynolds-averaged turbulence models with Bayesian deep neural networks

Nicholas Geneva, Nicholas Zabaras

Data-driven methods for improving turbulence modeling in Reynolds-Averaged Navier-Stokes (RANS) simulations have gained significant interest in the computational fluid dynamics community. Modern machine learning algorithms have opened up a new area of black-box turbulence models allowing for the tuning of RANS simulations to increase their predictive accuracy. While several data-driven turbulence models have been reported, the quantification of the uncertainties introduced has mostly been neglected. Uncertainty quantification for such data-driven models is essential since their predictive capability rapidly declines as they are tested for flow physics that deviate from that in the training data. In this work, we propose a novel data-driven framework that not only improves RANS predictions but also provides probabilistic bounds for fluid quantities such as velocity and pressure. The uncertainties capture both model form uncertainty as well as epistemic uncertainty induced by the limited training data. An invariant Bayesian deep neural network is used to predict the anisotropic tensor component of the Reynolds stress. This model is trained using Stein variational gradient decent algorithm. The computed uncertainty on the Reynolds stress is propagated to the quantities of interest by vanilla Monte Carlo simulation. Results are presented for two test cases that differ geometrically from the training flows at several different Reynolds numbers. The prediction enhancement of the data-driven model is discussed as well as the associated probabilistic bounds for flow properties of interest. Ultimately this framework allows for a quantitative measurement of model confidence and uncertainty quantification for flows in which no high-fidelity observations or prior knowledge is available.

5.5MLMay 22, 2018

Structured Bayesian Gaussian process latent variable model

Steven Atkinson, Nicholas Zabaras

We introduce a Bayesian Gaussian process latent variable model that explicitly captures spatial correlations in data using a parameterized spatial kernel and leveraging structure-exploiting algebra on the model covariance matrices for computational tractability. Inference is made tractable through a collapsed variational bound with similar computational complexity to that of the traditional Bayesian GP-LVM. Inference over partially-observed test cases is achieved by optimizing a "partially-collapsed" bound. Modeling high-dimensional time series systems is enabled through use of a dynamical GP latent variable prior. Examples imputing missing data on images and super-resolution imputation of missing video frames demonstrate the model.

31.4COMP-PHJan 21, 2018

Bayesian Deep Convolutional Encoder-Decoder Networks for Surrogate Modeling and Uncertainty Quantification

Yinhao Zhu, Nicholas Zabaras

We are interested in the development of surrogate models for uncertainty quantification and propagation in problems governed by stochastic PDEs using a deep convolutional encoder-decoder network in a similar fashion to approaches considered in deep learning for image-to-image regression tasks. Since normal neural networks are data intensive and cannot provide predictive uncertainty, we propose a Bayesian approach to convolutional neural nets. A recently introduced variational gradient descent algorithm based on Stein's method is scaled to deep convolutional networks to perform approximate Bayesian inference on millions of uncertain network parameters. This approach achieves state of the art performance in terms of predictive accuracy and uncertainty quantification in comparison to other approaches in Bayesian neural networks as well as techniques that include Gaussian processes and ensemble methods even when the training data size is relatively small. To evaluate the performance of this approach, we consider standard uncertainty quantification benchmark problems including flow in heterogeneous media defined in terms of limited data-driven permeability realizations. The performance of the surrogate model developed is very good even though there is no underlying structure shared between the input (permeability) and output (flow/pressure) fields as is often the case in the image-to-image regression models used in computer vision problems. Studies are performed with an underlying stochastic input dimensionality up to $4,225$ where most other uncertainty quantification methods fail. Uncertainty propagation tasks are considered and the predictive output Bayesian statistics are compared to those obtained with Monte Carlo estimates.

7.8MLMay 26, 2016

Predictive Coarse-Graining

Markus Schöberl, Nicholas Zabaras, Phaedon-Stelios Koutsourelakis

We propose a data-driven, coarse-graining formulation in the context of equilibrium statistical mechanics. In contrast to existing techniques which are based on a fine-to-coarse map, we adopt the opposite strategy by prescribing a probabilistic coarse-to-fine map. This corresponds to a directed probabilistic model where the coarse variables play the role of latent generators of the fine scale (all-atom) data. From an information-theoretic perspective, the framework proposed provides an improvement upon the relative entropy method and is capable of quantifying the uncertainty due to the information loss that unavoidably takes place during the CG process. Furthermore, it can be readily extended to a fully Bayesian model where various sources of uncertainties are reflected in the posterior of the model parameters. The latter can be used to produce not only point estimates of fine-scale reconstructions or macroscopic observables, but more importantly, predictive posterior distributions on these quantities. Predictive posterior distributions reflect the confidence of the model as a function of the amount of data and the level of coarse-graining. The issues of model complexity and model selection are seamlessly addressed by employing a hierarchical prior that favors the discovery of sparse solutions, revealing the most prominent features in the coarse-grained model. A flexible and parallelizable Monte Carlo - Expectation-Maximization (MC-EM) scheme is proposed for carrying out inference and learning tasks. A comparative assessment of the proposed methodology is presented for a lattice spin system and the SPC/E water model.

2.7MLOct 21, 2014Code

Variational Reformulation of Bayesian Inverse Problems

Panagiotis Tsilifis, Ilias Bilionis, Ioannis Katsounaros et al.

The classical approach to inverse problems is based on the optimization of a misfit function. Despite its computational appeal, such an approach suffers from many shortcomings, e.g., non-uniqueness of solutions, modeling prior knowledge, etc. The Bayesian formalism to inverse problems avoids most of the difficulties encountered by the optimization approach, albeit at an increased computational cost. In this work, we use information theoretic arguments to cast the Bayesian inference problem in terms of an optimization problem. The resulting scheme combines the theoretical soundness of fully Bayesian inference with the computational efficiency of a simple optimization.