LG Sep 29, 2023
Multi-Resolution Active Learning of Fourier Neural Operators
Shibo Li, Xin Yu, Wei Xing et al.
Fourier Neural Operator (FNO) is a popular operator learning framework. It not only achieves state-of-the-art performance in many tasks but is also efficient in training and prediction. However, collecting training data for the FNO can be a costly bottleneck in practice, because it often demands expensive physical simulations. To overcome this problem, we propose Multi-Resolution Active learning of FNO (MRA-FNO), which can dynamically select the input functions and resolutions to lower the data cost as much as possible while optimizing the learning efficiency. Specifically, we propose a probabilistic multi-resolution FNO and use ensemble Monte-Carlo to develop an effective posterior inference algorithm. To conduct active learning, we maximize a utility-cost ratio as the acquisition function to acquire new examples and resolutions at each step. We use moment matching and the matrix determinant lemma to enable tractable, efficient utility computation. Furthermore, we develop a cost annealing framework to avoid over-penalizing high-resolution queries at the early stage. The over-penalization is severe when the cost difference between the resolutions is significant, which often leaves active learning stuck at low-resolution queries with inferior performance. Our method overcomes this problem and applies to general multi-fidelity active learning and optimization problems. We have shown the advantage of our method in several benchmark operator learning tasks. The code is available at https://github.com/shib0li/MRA-FNO.
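As a small illustration of the matrix determinant lemma named in this abstract, the sketch below checks the identity det(A + U V^T) = det(I + V^T A^{-1} U) det(A) numerically. The matrices, sizes, and variable names are assumptions chosen for illustration; this is not the MRA-FNO utility computation itself.

```python
import numpy as np

rng = np.random.default_rng(0)
n, k = 50, 3                                  # hypothetical sizes: large n, low-rank update k

A = np.diag(rng.uniform(1.0, 2.0, size=n))    # easy-to-invert base matrix (diagonal)
U = rng.standard_normal((n, k))
V = rng.standard_normal((n, k))

# Direct evaluation: log|det| of the full n x n matrix.
sign_direct, logdet_direct = np.linalg.slogdet(A + U @ V.T)

# Lemma: det(A + U V^T) = det(I_k + V^T A^{-1} U) * det(A),
# so only a k x k determinant is needed once A^{-1} U is available.
A_inv_U = U / np.diag(A)[:, None]             # A is diagonal in this toy setup
sign_small, logdet_small = np.linalg.slogdet(np.eye(k) + V.T @ A_inv_U)
logdet_lemma = logdet_small + np.sum(np.log(np.diag(A)))
```

The two log-determinants agree to rounding error, while the second route only factorizes a k x k matrix.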
NA May 1, 2017
Effectively Subsampled Quadratures For Least Squares Polynomial Approximations
Pranay Seshadri, Akil Narayan, Sankaran Mahadevan
This paper proposes a new deterministic sampling strategy for constructing polynomial chaos approximations for expensive physics simulation models. The proposed approach, effectively subsampled quadratures, involves sparsely subsampling an existing tensor grid using QR column pivoting. For polynomial interpolation using hyperbolic or total order sets, we then solve a square least squares problem. For polynomial approximation, we use a column pruning heuristic that removes columns based on the highest total orders and then solves the tall least squares problem. While we provide bounds on the condition number of such tall submatrices, it is difficult to ascertain how column pruning affects solution accuracy, as this is problem specific. We conclude with numerical experiments on an analytical function and a model piston problem that show the efficacy of our approach compared with randomized subsampling. We also show an example where this method fails.
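The subsampling idea in this abstract can be sketched as follows, under several assumptions: a 5 x 5 Gauss-Legendre tensor grid, a total-order Legendre basis of degree 2, and a hand-written greedy column-pivoting routine (`pivoted_columns`, a hypothetical helper that stands in for a library QR with column pivoting). It illustrates the square interpolation case only, not the authors' implementation.

```python
import numpy as np
from numpy.polynomial.legendre import leggauss, legvander

def pivoted_columns(M, k):
    """Pick k columns of M by greedy column pivoting (pivoted Gram-Schmidt)."""
    R = np.array(M, dtype=float)
    chosen = []
    for _ in range(k):
        norms = np.sum(R**2, axis=0)
        j = int(np.argmax(norms))           # pivot: largest remaining column norm
        chosen.append(j)
        q = R[:, j] / np.sqrt(norms[j])
        R -= np.outer(q, q @ R)             # remove the chosen direction from all columns
    return chosen

# Tensor grid of 5 x 5 Gauss-Legendre points on [-1, 1]^2.
x1d, w1d = leggauss(5)
X, Y = np.meshgrid(x1d, x1d, indexing="ij")
pts = np.column_stack([X.ravel(), Y.ravel()])
wts = np.outer(w1d, w1d).ravel()

# Total-order (degree <= 2) Legendre basis: 6 functions in 2D.
multi_indices = [(0, 0), (1, 0), (0, 1), (2, 0), (1, 1), (0, 2)]
Px = legvander(pts[:, 0], 2)
Py = legvander(pts[:, 1], 2)
A = np.column_stack([Px[:, i] * Py[:, j] for i, j in multi_indices])
A_w = np.sqrt(wts)[:, None] * A             # weight rows by sqrt of quadrature weights

# Column pivoting on A_w^T selects as many grid points as basis functions.
idx = pivoted_columns(A_w.T, len(multi_indices))

# Square problem on the selected points; f lies in the basis span,
# f = 1*P0P0 + 2*P1(x)P0 + 3*P1(x)P1(y), so its coefficients are known.
f = lambda p: 1.0 + 2.0 * p[:, 0] + 3.0 * p[:, 0] * p[:, 1]
coeffs = np.linalg.solve(A_w[idx], np.sqrt(wts[idx]) * f(pts[idx]))
```

Only 6 of the 25 grid points are evaluated, and because the test function lies in the span of the basis, the solve recovers its coefficients.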
NA Mar 22, 2019
Polynomial chaos expansions for dependent random variables
John Jakeman, Fabian Franzelin, Akil Narayan et al.
Polynomial chaos expansions (PCE) are well-suited to quantifying uncertainty in models parameterized by independent random variables. The assumption of independence leads to simple strategies for evaluating PCE coefficients. In contrast, the application of PCE to models of dependent variables is much more challenging. Three approaches can be used. The first approach uses mapping methods where measure transformations, such as the Nataf and Rosenblatt transformation, can be used to map dependent random variables to independent ones; however we show that this can significantly degrade performance since the Jacobian of the map must be approximated. A second strategy is the class of dominating support methods which build PCE using independent random variables whose distributional support dominates the support of the true dependent joint density; we provide evidence that this approach appears to produce approximations with suboptimal accuracy. A third approach, the novel method proposed here, uses Gram-Schmidt orthogonalization (GSO) to numerically compute orthonormal polynomials for the dependent random variables. This approach has been used successfully when solving differential equations using the intrusive stochastic Galerkin method, and in this paper we use GSO to build PCE using a non-intrusive stochastic collocation method. The stochastic collocation method treats the model as a black box and builds approximations of model output from a set of samples. Building PCE from samples can introduce ill-conditioning which does not plague stochastic Galerkin methods. To mitigate this ill-conditioning we generate weighted Leja sequences, which are nested sample sets, to build accurate polynomial interpolants. We show that our proposed approach produces PCE which are orders of magnitude more accurate than PCE constructed using mapping or dominating support methods.
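A minimal sketch of the Gram-Schmidt idea in this abstract, assuming a correlated bivariate Gaussian as the dependent input density and using a QR factorization of a monomial matrix as the numerical orthogonalization. The sample size, degree, and variable names are illustrative assumptions, and the Leja-sequence sampling step is not shown.

```python
import numpy as np

rng = np.random.default_rng(1)

# Dependent inputs: a correlated bivariate Gaussian (an assumed example density).
cov = np.array([[1.0, 0.8], [0.8, 1.0]])
Z = rng.multivariate_normal(np.zeros(2), cov, size=5000)
x, y = Z[:, 0], Z[:, 1]

# Monomials up to total degree 2, ordered by degree.
V = np.column_stack([np.ones_like(x), x, y, x**2, x * y, y**2])

# QR orthogonalizes the columns in order, as Gram-Schmidt does.
Q, R = np.linalg.qr(V)
Phi = np.sqrt(len(x)) * Q          # orthonormal polynomial values at the samples

# Since V = Q R, the new basis at other points is sqrt(N) * V_new @ inv(R).
gram = Phi.T @ Phi / len(x)        # empirical Gram matrix, should be the identity
```

The resulting polynomials are orthonormal with respect to the empirical measure of the dependent samples, with no mapping to independent variables.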
NA Apr 17, 2018
Numerical Integration in Multiple Dimensions with Designed Quadrature
Vahid Keshavarzzadeh, Robert M. Kirby, Akil Narayan
We present a systematic computational framework for generating positive quadrature rules in multiple dimensions on general geometries. A direct moment-matching formulation that enforces exact integration on polynomial subspaces yields nonlinear conditions and geometric constraints on nodes and weights. We use penalty methods to address the geometric constraints, and subsequently solve a quadratic minimization problem via the Gauss-Newton method. Our analysis provides guidance on requisite sizes of quadrature rules for a given polynomial subspace, and furnishes useful user-end stability bounds on error in the quadrature rule in the case when the polynomial moment conditions are violated by a small amount due to, e.g., finite precision limitations or stagnation of the optimization procedure. We present several numerical examples investigating optimal low-degree quadrature rules, Lebesgue constants, and 100-dimensional quadrature. Our capstone examples compare our quadrature approach to popular alternatives, such as sparse grids and quasi-Monte Carlo methods, for problems in linear elasticity and topology optimization.
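To make the moment-matching conditions in this abstract concrete, the sketch below evaluates the moment residuals of a known positive rule in one dimension. The 3-point Gauss-Legendre rule and the helper name `moment_residuals` are assumptions for illustration; the paper's optimizer and multidimensional geometries are not reproduced.

```python
import numpy as np
from numpy.polynomial.legendre import leggauss

def moment_residuals(nodes, weights, max_degree):
    """Residuals sum_i w_i x_i^k - int_{-1}^{1} x^k dx for k = 0..max_degree."""
    k = np.arange(max_degree + 1)
    exact = np.where(k % 2 == 0, 2.0 / (k + 1), 0.0)
    approx = (weights[None, :] * nodes[None, :] ** k[:, None]).sum(axis=1)
    return approx - exact

nodes, weights = leggauss(3)        # positive weights, exact up to degree 5
res = moment_residuals(nodes, weights, 6)
```

The residuals vanish for degrees 0 through 5 and are nonzero at degree 6. A designed-quadrature solver would drive a vector of residuals like this toward zero while keeping the weights positive.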
NA Aug 30, 2018
Parametric Topology Optimization with Multi-Resolution Finite Element Models
Vahid Keshavarzzadeh, Robert M. Kirby, Akil Narayan
We present a methodical procedure for topology optimization under uncertainty with multi-resolution finite element models. We use our framework in a bi-fidelity setting where a coarse and a fine mesh corresponding to low- and high-resolution models are available. The inexpensive low-resolution model is used to explore the parameter space and approximate the parameterized high-resolution model and its sensitivity, where parameters are considered in both structural load and stiffness. We provide error bounds for bi-fidelity finite element (FE) approximations and their sensitivities and conduct numerical studies to verify these theoretical estimates. We demonstrate our approach on benchmark compliance minimization problems, where we show a significant reduction in computational cost for expensive problems such as topology optimization under manufacturing variability while generating designs almost identical to those obtained with a single-resolution mesh. We also compute the parametric von Mises stress for the generated designs via our bi-fidelity FE approximation and compare it with standard Monte Carlo simulations. The implementation of our algorithm, which extends the well-known 88-line topology optimization code in MATLAB, is provided.
NA Apr 19, 2022
Proximal Implicit ODE Solvers for Accelerating Learning Neural ODEs
Justin Baker, Hedi Xia, Yiwei Wang et al.
Learning neural ODEs often requires solving very stiff ODE systems, primarily using explicit adaptive step size ODE solvers. These solvers are computationally expensive, requiring the use of tiny step sizes for numerical stability and accuracy guarantees. This paper considers learning neural ODEs using implicit ODE solvers of different orders leveraging proximal operators. The proximal implicit solver consists of inner-outer iterations: the inner iterations approximate each implicit update step using a fast optimization algorithm, and the outer iterations solve the ODE system over time. The proximal implicit ODE solver guarantees superiority over explicit solvers in numerical stability and computational efficiency. We validate the advantages of proximal implicit solvers over existing popular neural ODE solvers on various challenging benchmark tasks, including learning continuous-depth graph neural networks and continuous normalizing flows.
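The inner-outer structure described in this abstract can be sketched on a stiff scalar test problem. This is a simplified stand-in: the inner solve uses Newton's method rather than the proximal optimization of the paper, and the test equation, step size, and function names are assumptions for illustration.

```python
def f(y, lam=50.0):
    """Stiff linear test problem y' = -lam * y (an assumed toy example)."""
    return -lam * y

def df(y, lam=50.0):
    """Derivative of f with respect to y."""
    return -lam

def explicit_euler(y0, h, steps):
    y = y0
    for _ in range(steps):
        y = y + h * f(y)
    return y

def implicit_euler(y0, h, steps, inner_iters=20, tol=1e-12):
    """Backward Euler; each step solves z - y - h*f(z) = 0 by Newton's method."""
    y = y0
    for _ in range(steps):                # outer loop: march in time
        z = y
        for _ in range(inner_iters):      # inner loop: approximate the implicit update
            g = z - y - h * f(z)
            z -= g / (1.0 - h * df(z))
            if abs(g) < tol:
                break
        y = z
    return y

h, steps = 0.1, 20                        # h * lam = 5, outside explicit Euler's stability region
y_explicit = explicit_euler(1.0, h, steps)
y_implicit = implicit_euler(1.0, h, steps)
```

With this step size the explicit iterate grows without bound while the implicit one decays, which is the stability gap that motivates implicit solvers for stiff systems.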
NA Jan 29, 2016
A Christoffel function weighted least squares algorithm for collocation approximations
Akil Narayan, John D. Jakeman, Tao Zhou
We propose, theoretically investigate, and numerically validate an algorithm for the Monte Carlo solution of least-squares polynomial approximation problems in a collocation framework. Our method is motivated by generalized Polynomial Chaos approximation in uncertainty quantification where a polynomial approximation is formed from a combination of orthogonal polynomials. A standard Monte Carlo approach would draw samples according to the density of orthogonality. Our proposed algorithm samples with respect to the equilibrium measure of the parametric domain, and subsequently solves a weighted least-squares problem, with weights given by evaluations of the Christoffel function. We present theoretical analysis to motivate the algorithm, and numerical results that show our method is superior to standard Monte Carlo methods in many situations of interest.
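A one-dimensional sketch of the sampling and weighting steps described in this abstract, assuming a uniform density on [-1, 1] (Legendre basis), whose equilibrium measure is the arcsine distribution. The basis size, sample count, and helper name `orthonormal_legendre` are illustrative assumptions.

```python
import numpy as np
from numpy.polynomial.legendre import legvander

rng = np.random.default_rng(2)
N, M = 6, 40                               # basis size and sample count (assumed)

def orthonormal_legendre(x, N):
    """Legendre polynomials orthonormal w.r.t. the uniform density 1/2 on [-1, 1]."""
    return legvander(x, N - 1) * np.sqrt(2 * np.arange(N) + 1)

# Sample from the equilibrium (arcsine) measure of [-1, 1], not the uniform density.
x = np.cos(np.pi * rng.uniform(size=M))

Phi = orthonormal_legendre(x, N)
# Weights from the Christoffel function: w(x) = N / sum_j phi_j(x)^2.
w = N / np.sum(Phi**2, axis=1)

# Target inside the polynomial space so recovery can be checked exactly.
true_coeffs = np.array([1.0, 0.0, 0.5, 0.0, 0.0, -0.25])
f_vals = Phi @ true_coeffs

# Weighted least squares: minimize sum_m w_m * (f(x_m) - sum_j c_j phi_j(x_m))^2.
sw = np.sqrt(w)
coeffs, *_ = np.linalg.lstsq(sw[:, None] * Phi, sw * f_vals, rcond=None)
```

The weights are largest in the interior of the interval, compensating for the arcsine samples clustering near the endpoints.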
NA Feb 22, 2016
A generalized sampling and preconditioning scheme for sparse approximation of polynomial chaos expansions
John D. Jakeman, Akil Narayan, Tao Zhou
In this paper we propose an algorithm for recovering sparse orthogonal polynomials using stochastic collocation. Our approach is motivated by the desire to use generalized polynomial chaos expansions (PCE) to quantify uncertainty in models subject to uncertain input parameters. The standard sampling approach for recovering sparse polynomials is to use Monte Carlo (MC) sampling of the density of orthogonality. However MC methods result in poor function recovery when the polynomial degree is high. Here we propose a general algorithm that can be applied to any admissible weight function on a bounded domain and a wide class of exponential weight functions defined on unbounded domains. Our proposed algorithm samples with respect to the weighted equilibrium measure of the parametric domain, and subsequently solves a preconditioned $\ell^1$-minimization problem, where the weights of the diagonal preconditioning matrix are given by evaluations of the Christoffel function. We present theoretical analysis to motivate the algorithm, and numerical results that show our method is superior to standard Monte Carlo methods in many situations of interest. Numerical examples are also provided that demonstrate that our proposed Christoffel Sparse Approximation algorithm leads to comparable or improved accuracy even when compared with Legendre and Hermite specific algorithms.
NA Aug 30, 2018
Compressed sensing with sparse corruptions: Fault-tolerant sparse collocation approximations
Ben Adcock, Anyi Bao, John D. Jakeman et al.
The recovery of approximately sparse or compressible coefficients in a Polynomial Chaos Expansion is a common goal in modern parametric uncertainty quantification (UQ). However, relatively little effort in UQ has been directed toward theoretical and computational strategies for addressing the sparse corruptions problem, where a small number of measurements are highly corrupted. Such a situation has become pertinent today since modern computational frameworks are sufficiently complex with many interdependent components that may introduce hardware and software failures, some of which can be difficult to detect and result in a highly polluted simulation result. In this paper we present a novel compressive sampling-based theoretical analysis for a regularized $\ell^1$ minimization algorithm that aims to recover sparse expansion coefficients in the presence of measurement corruptions. Our recovery results are uniform, and prescribe algorithmic regularization parameters in terms of a user-defined a priori estimate on the ratio of measurements that are believed to be corrupted. We also propose an iteratively reweighted optimization algorithm that automatically refines the value of the regularization parameter, and empirically produces superior results. Our numerical results test our framework on several medium-to-high dimensional examples of solutions to parameterized differential equations, and demonstrate the effectiveness of our approach.
NA Feb 24, 2018
A gradient enhanced $\ell_1$-minimization for sparse approximation of polynomial chaos expansions
Ling Guo, Akil Narayan, Tao Zhou
We investigate a gradient-enhanced $\ell_1$-minimization for constructing sparse polynomial chaos expansions. In addition to function evaluations, measurements of the function gradient are also included to accelerate the identification of expansion coefficients. By designing appropriate preconditioners to the measurement matrix, we show gradient-enhanced $\ell_1$ minimization leads to stable and accurate coefficient recovery. The framework for designing preconditioners is quite general and it applies to the recovery of functions whose domain is bounded or unbounded. Comparisons between the gradient-enhanced approach and the standard $\ell_1$-minimization are also presented, and numerical examples suggest that the inclusion of derivative information can guarantee sparse recovery at a reduced computational cost.
NA Aug 10, 2018
Convergence Acceleration for Time Dependent Parametric Multifidelity Models
Vahid Keshavarzzadeh, Robert M. Kirby, Akil Narayan
We present a numerical method for convergence acceleration for multifidelity models of parameterized ordinary differential equations. The hierarchy of models is defined as trajectories computed using different timesteps in a time integration scheme. Our first contribution is in novel analysis of the multifidelity procedure, providing a convergence estimate. Our second contribution is development of a three-step algorithm that uses multifidelity surrogates to accelerate convergence: step one uses a multifidelity procedure at three levels to obtain accurate predictions using inexpensive (large timestep) models. Step two uses high-order splines to construct continuous trajectories over time. Finally, step three combines spline predictions at three levels to infer an order of convergence and compute a sequence transformation prediction (in particular we use Richardson extrapolation) that achieves superior error. We demonstrate our procedure on linear and nonlinear systems of parameterized ordinary differential equations.
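Step three of the algorithm in this abstract (infer an order of convergence from three levels, then apply Richardson extrapolation) can be sketched on a scalar ODE. The test equation, the forward Euler integrator, and the step counts are assumptions for illustration; the multifidelity surrogate and spline steps are not shown.

```python
import math

def euler_final(n_steps, T=1.0):
    """Forward Euler for y' = -y, y(0) = 1, using n_steps uniform steps up to time T."""
    h = T / n_steps
    y = 1.0
    for _ in range(n_steps):
        y += h * (-y)
    return y

# Three levels with timesteps h, h/2, h/4.
y1, y2, y3 = euler_final(10), euler_final(20), euler_final(40)

# Infer the observed order of convergence from successive differences.
p = math.log2((y2 - y1) / (y3 - y2))

# Richardson extrapolation using the two finest levels and the inferred order.
y_extrap = y3 + (y3 - y2) / (2.0**p - 1.0)

exact = math.exp(-1.0)
err_fine = abs(y3 - exact)
err_extrap = abs(y_extrap - exact)
```

The inferred order is close to 1, as expected for forward Euler, and the extrapolated value is markedly more accurate than the finest-level trajectory.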
LG Jul 8, 2022
Nonparametric Embeddings of Sparse High-Order Interaction Events
Zheng Wang, Yiming Xu, Conor Tillinghast et al.
High-order interaction events are common in real-world applications. Learning embeddings that encode the complex relationships of the participants from these events is of great importance in knowledge mining and predictive tasks. Despite the success of existing approaches, e.g., Poisson tensor factorization, they ignore the sparse structure underlying the data, namely, that the interactions that occur are far fewer than the possible interactions among all the participants. In this paper, we propose Nonparametric Embeddings of Sparse High-order interaction events (NESH). We hybridize a sparse hypergraph (tensor) process and a matrix Gaussian process to capture both the asymptotic structural sparsity within the interactions and nonlinear temporal relationships between the participants. We prove strong asymptotic bounds (including both a lower and an upper bound) of the sparsity ratio, which reveal the asymptotic properties of the sampled structure. We use batch-normalization, stick-breaking construction, and sparse variational GP approximations to develop an efficient, scalable model inference algorithm. We demonstrate the advantage of our approach in several real-world applications.
NA Jul 13, 2016
Stochastic collocation methods via $L_1$ minimization using randomized quadratures
Ling Guo, Akil Narayan, Tao Zhou et al.
In this work, we discuss the problem of approximating a multivariate function via an $\ell_1$ minimization method, using a randomly chosen sub-grid of the corresponding tensor grid of Gaussian points. The independent variables of the function are assumed to be random variables, and thus the framework provides a non-intrusive way to construct generalized polynomial chaos expansions, stemming from the motivating application of Uncertainty Quantification (UQ). We provide theoretical analysis on the validity of the approach. The framework includes both bounded measures, such as the uniform and the Chebyshev measures, and unbounded measures, which include the Gaussian measure. Several numerical examples are given to confirm the theoretical results.
NA Aug 10, 2018
Generation of Nested Quadrature Rules for Generic Weight Functions via Numerical Optimization: Application to Sparse Grids
Vahid Keshavarzzadeh, Robert M. Kirby, Akil Narayan
We present a numerical framework for computing nested quadrature rules for various weight functions. The well-known Kronrod method extends the Gauss-Legendre quadrature by adding new optimal nodes to the existing Gauss nodes for integration of higher order polynomials. Our numerical method generalizes the Kronrod rule for any continuous probability density function on the real line with finite moments. We develop a bi-level optimization scheme to solve moment-matching conditions for the two levels of main and nested rules and use a penalty method to enforce the constraints on the limits of the nodes and weights. We demonstrate our nested quadrature rule for probability measures on finite/infinite and symmetric/asymmetric supports. We generate Gauss-Kronrod-Patterson rules by slightly modifying our algorithm and present results associated with Chebyshev polynomials which are not reported elsewhere. We finally show the application of our nested rules in the construction of sparse grids, where we validate the accuracy and efficiency of such nested quadrature-based sparse grids on parameterized boundary and initial value problems in multiple dimensions.
NA Apr 27, 2017
Computation of Induced Orthogonal Polynomial Distributions
Akil Narayan
We provide a robust and general algorithm for computing distribution functions associated to induced orthogonal polynomial measures. We leverage several tools for orthogonal polynomials to provide a spectrally-accurate method for a broad class of measures, which is stable for polynomial degrees up to at least degree 1000. Paired with other standard tools such as a numerical root-finding algorithm and inverse transform sampling, this provides a methodology for generating random samples from an induced orthogonal polynomial measure. Generating samples from this measure is one ingredient in optimal numerical methods for certain types of multivariate polynomial approximation. For example, sampling from induced distributions for weighted discrete least-squares approximation has recently been shown to yield convergence guarantees with a minimal number of samples. We also provide publicly-available code that implements the algorithms in this paper for sampling from induced distributions.
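The sampling pipeline in this abstract (a distribution function, a root-finder, and inverse transform sampling) can be sketched generically. The arcsine CDF below is an assumed stand-in with a closed form; it is not an induced orthogonal polynomial distribution, and the helper names are hypothetical.

```python
import math
import random

def arcsine_cdf(x):
    """CDF of the arcsine (Chebyshev) distribution on [-1, 1]; an assumed stand-in."""
    return 0.5 + math.asin(x) / math.pi

def invert_cdf(cdf, u, lo=-1.0, hi=1.0, tol=1e-12):
    """Solve cdf(x) = u on [lo, hi] by bisection (cdf must be nondecreasing)."""
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if cdf(mid) < u:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

def sample(cdf, n, seed=0):
    """Inverse transform sampling: push uniform draws through the inverse CDF."""
    rng = random.Random(seed)
    return [invert_cdf(cdf, rng.random()) for _ in range(n)]

samples = sample(arcsine_cdf, 2000)
```

Any routine that evaluates the distribution function accurately can be passed in place of `arcsine_cdf`; the root-finding and sampling steps stay the same.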
NA Nov 1, 2017
Generation and application of multivariate polynomial quadrature rules
John D. Jakeman, Akil Narayan
The search for multivariate quadrature rules of minimal size with a specified polynomial accuracy has been the topic of many years of research. Finding such a rule allows accurate integration of moments, which play a central role in many aspects of scientific computing with complex models. The contribution of this paper is twofold. First, we provide novel mathematical analysis of the polynomial quadrature problem that provides a lower bound for the minimal possible number of nodes in a polynomial rule with specified accuracy. We give concrete but simplistic multivariate examples where a minimal quadrature rule can be designed that achieves this lower bound, along with situations that showcase when it is not possible to achieve this lower bound. Our second main contribution comes in the formulation of an algorithm that is able to efficiently generate multivariate quadrature rules with positive weights on non-tensorial domains. Our tests show success of this procedure in up to 20 dimensions. We test our method on applications to dimension reduction and chemical kinetics problems, including comparisons against popular alternatives such as sparse grids, Monte Carlo and quasi Monte Carlo sequences, and Stroud rules. The quadrature rules computed in this paper outperform these alternatives in almost all scenarios.
LG Oct 23, 2022
Meta Learning of Interface Conditions for Multi-Domain Physics-Informed Neural Networks
Shibo Li, Michael Penwarden, Yiming Xu et al.
Physics-informed neural networks (PINNs) are emerging as popular mesh-free solvers for partial differential equations (PDEs). Recent extensions decompose the domain, apply different PINNs to solve the problem in each subdomain, and stitch the subdomains at the interface. Thereby, they can further alleviate the problem complexity, reduce the computational cost, and allow parallelization. However, the performance of multi-domain PINNs is sensitive to the choice of the interface conditions. While quite a few conditions have been proposed, there is no guidance on how to select the conditions for a specific problem. To address this gap, we propose META Learning of Interface Conditions (METALIC), a simple, efficient yet powerful approach to dynamically determine appropriate interface conditions for solving a family of parametric PDEs. Specifically, we develop two contextual multi-arm bandit (MAB) models. The first one applies to the entire training course, and updates online a Gaussian process (GP) reward model that, given the PDE parameters and interface conditions, predicts the performance. We prove a sub-linear regret bound for both UCB and Thompson sampling, which in theory guarantees the effectiveness of our MAB. The second one partitions the training into two stages, a stochastic phase and a deterministic phase; we update a GP reward for each phase to enable different condition selections at the two stages to further bolster the flexibility and performance. We have shown the advantage of METALIC on four benchmark PDE families.
CV Oct 19, 2012
Approximating the Weil-Petersson Metric Geodesics on the Universal Teichmüller space by Singular Solutions
Sergey Kushnarev, Akil Narayan
We propose and investigate a numerical shooting method for computing geodesics in the Weil-Petersson ($WP$) metric on the universal Teichmüller space T(1). This space, or rather the coset subspace $\mathrm{PSL}_2(\mathbb{R})\backslash\mathrm{Diff}(S^1)$, has another realization as the space of smooth, simple closed planar curves modulo translations and scalings. This alternate identification of T(1) is a convenient metrization of the space of shapes and provides an immediate application for our algorithm in computer vision. The geodesic equation on T(1) with the $WP$ metric is EPDiff($S^1$), the Euler-Poincaré equation on the group of diffeomorphisms of the circle $S^1$, and admits a class of soliton-like solutions named teichons. Our method relies on approximating the geodesic with these teichon solutions, which have momenta given by a finite linear combination of delta functions. The geodesic equation for this simpler set of solutions is more tractable from the numerical point of view. With a robust numerical integration of this equation, we formulate a shooting method utilizing a cross-ratio matching term. Several examples of geodesics in the space of shapes are demonstrated.
NA Mar 16, 2017
Offline-Enhanced Reduced Basis Method through adaptive construction of the Surrogate Parameter Domain
Jiahua Jiang, Yanlai Chen, Akil Narayan
The Reduced Basis Method (RBM) is a popular certified model reduction approach for solving parametrized partial differential equations. One critical stage of the offline portion of the algorithm is a greedy algorithm, requiring maximization of an error estimate over parameter space. In practice this maximization is usually performed by replacing the parameter domain continuum with a discrete "training" set. When the dimension of parameter space is large, it is necessary to significantly increase the size of this training set in order to effectively search parameter space. Large training sets diminish the attractiveness of RBM algorithms since they proportionally increase the cost of the offline phase. In this work we propose novel strategies for offline RBM algorithms that mitigate the computational difficulty of maximizing error estimates over a training set. The main idea is to identify a subset of the training set, a "surrogate parameter domain" (SPD), on which to perform greedy algorithms. The SPDs we construct are much smaller in size than the full training set, yet our examples suggest that they are accurate enough to represent the solution manifold of interest at the current offline RBM iteration. We propose two algorithms to construct the SPD. The first, the Successive Maximization Method (SMM), is inspired by inverse transform sampling for non-standard univariate probability distributions. The second constructs an SPD by identifying pivots in the Cholesky decomposition of an approximate error correlation matrix. We demonstrate the algorithm through numerical experiments, showing that the algorithm is capable of accelerating offline RBM procedures without degrading accuracy, assuming that the solution manifold has low Kolmogorov width.
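The second SPD construction in this abstract relies on pivots of a Cholesky decomposition. A minimal sketch of pivot selection by partial pivoted Cholesky follows, assuming a synthetic low-rank matrix in place of the approximate error correlation matrix; the function name and sizes are hypothetical and this is not the authors' code.

```python
import numpy as np

def pivoted_cholesky(K, k):
    """Partial pivoted Cholesky of a symmetric PSD matrix K; returns (pivots, L)."""
    n = K.shape[0]
    d = np.diag(K).astype(float)           # diagonal of the current residual (a copy)
    L = np.zeros((n, k))
    pivots = []
    for m in range(k):
        i = int(np.argmax(d))              # pivot: largest remaining diagonal entry
        pivots.append(i)
        L[:, m] = (K[:, i] - L[:, :m] @ L[i, :m]) / np.sqrt(d[i])
        d -= L[:, m] ** 2
    return pivots, L

# Hypothetical correlation matrix over a training set of 200 parameters,
# built with rank 4 so that four pivots capture it.
rng = np.random.default_rng(3)
B = rng.standard_normal((200, 4))
K = B @ B.T

pivots, L = pivoted_cholesky(K, 4)
residual = np.linalg.norm(K - L @ L.T) / np.linalg.norm(K)
```

The returned pivot indices identify a small subset of training parameters, which is the role the surrogate parameter domain plays in the greedy search.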
LG Apr 8, 2022
Weight Matrix Dimensionality Reduction in Deep Learning via Kronecker Multi-layer Architectures
Jarom D. Hogue, Robert M. Kirby, Akil Narayan
Deep learning using neural networks is an effective technique for generating models of complex data. However, training such models can be expensive when networks have large model capacity resulting from a large number of layers and nodes. For training in such a computationally prohibitive regime, dimensionality reduction techniques ease the computational burden, and allow implementations of more robust networks. We propose a novel type of such dimensionality reduction via a new deep learning architecture based on fast matrix multiplication of a Kronecker product decomposition; in particular our network construction can be viewed as a Kronecker product-induced sparsification of an "extended" fully connected network. Analysis and practical examples show that this architecture allows a neural network to be trained and implemented with a significant reduction in computational time and resources, while achieving a similar error level compared to a traditional feedforward neural network.
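The fast multiplication that this abstract builds on can be illustrated with the standard identity (A ⊗ B) vec(X) = vec(A X B^T) for a row-major vec. The layer sizes and the helper name `kron_matvec` are assumptions for illustration; the paper's multi-layer architecture is not reproduced.

```python
import numpy as np

def kron_matvec(A, B, x):
    """Compute (A kron B) @ x without forming the Kronecker product."""
    m_a, n_a = A.shape
    m_b, n_b = B.shape
    X = x.reshape(n_a, n_b)                # row-major reshape of the input vector
    return (A @ X @ B.T).reshape(m_a * m_b)

rng = np.random.default_rng(4)
A = rng.standard_normal((8, 6))
B = rng.standard_normal((5, 7))
x = rng.standard_normal(6 * 7)

y_fast = kron_matvec(A, B, x)              # stores 8*6 + 5*7 = 83 weights
y_full = np.kron(A, B) @ x                 # stores (8*5) * (6*7) = 1680 weights
```

Both routes give the same output vector, but the factored form needs far fewer stored weights and two small matrix products instead of one large one.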
LG Nov 26, 2025
SUPN: Shallow Universal Polynomial Networks
Zachary Morrow, Michael Penwarden, Brian Chen et al.
Deep neural networks (DNNs) and Kolmogorov-Arnold networks (KANs) are popular methods for function approximation due to their flexibility and expressivity. However, they typically require a large number of trainable parameters to produce a suitable approximation. Beyond making the resulting network less transparent, overparameterization creates a large optimization space, likely producing local minima in training that have quite different generalization errors. In this case, network initialization can have an outsize impact on the model's out-of-sample accuracy. For these reasons, we propose shallow universal polynomial networks (SUPNs). These networks replace all but the last hidden layer with a single layer of polynomials with learnable coefficients, leveraging the strengths of DNNs and polynomials to achieve sufficient expressivity with far fewer parameters. We prove that SUPNs converge at the same rate as the best polynomial approximation of the same degree, and we derive explicit formulas for quasi-optimal SUPN parameters. We complement theory with an extensive suite of numerical experiments involving SUPNs, DNNs, KANs, and polynomial projection in one, two, and ten dimensions, consisting of over 13,000 trained models. On the target functions we numerically studied, for a given number of trainable parameters, the approximation error and variability are often lower for SUPNs than for DNNs and KANs by an order of magnitude. In our examples, SUPNs even outperform polynomial projection on non-smooth functions.
NA Mar 6, 2024
TGPT-PINN: Nonlinear model reduction with transformed GPT-PINNs
Yanlai Chen, Yajie Ji, Akil Narayan et al.
We introduce the Transformed Generative Pre-Trained Physics-Informed Neural Networks (TGPT-PINN) for accomplishing nonlinear model order reduction (MOR) of transport-dominated partial differential equations in an MOR-integrating PINNs framework. Building on the recent development of the GPT-PINN that is a network-of-networks design achieving snapshot-based model reduction, we design and test a novel paradigm for nonlinear model reduction that can effectively tackle problems with parameter-dependent discontinuities. Through incorporation of a shock-capturing loss function component as well as a parameter-dependent transform layer, the TGPT-PINN overcomes the limitations of linear model reduction in the transport-dominated regime. We demonstrate this new capability for nonlinear model reduction in the PINNs framework by several nontrivial parametric partial differential equations.
LG Oct 17, 2024
Arbitrarily-Conditioned Multi-Functional Diffusion for Multi-Physics Emulation
Da Long, Zhitong Xu, Guang Yang et al.
Modern physics simulation often involves multiple functions of interest, and traditional numerical approaches are known to be complex and computationally costly. While machine learning-based surrogate models can offer significant cost reductions, most focus on a single task, such as forward prediction, and typically lack uncertainty quantification -- an essential component in many applications. To overcome these limitations, we propose Arbitrarily-Conditioned Multi-Functional Diffusion (ACM-FD), a versatile probabilistic surrogate model for multi-physics emulation. ACM-FD can perform a wide range of tasks within a single framework, including forward prediction, various inverse problems, and simulating data for entire systems or subsets of quantities conditioned on others. Specifically, we extend the standard Denoising Diffusion Probabilistic Model (DDPM) for multi-functional generation by modeling noise as Gaussian processes (GP). We propose a random-mask based, zero-regularized denoising loss to achieve flexible and robust conditional generation. We induce a Kronecker product structure in the GP covariance matrix, substantially reducing the computational cost and enabling efficient training and sampling. We demonstrate the effectiveness of ACM-FD across several fundamental multi-physics systems.
ML Jul 3, 2025
Hybrid least squares for learning functions from highly noisy data
Ben Adcock, Bernhard Hientzsch, Akil Narayan et al.
Motivated by the need for efficient estimation of conditional expectations, we consider a least-squares function approximation problem with heavily polluted data. Existing methods that are powerful in the small noise regime are suboptimal when large noise is present. We propose a hybrid approach that combines Christoffel sampling with certain types of optimal experimental design to address this issue. We show that the proposed algorithm enjoys appropriate optimality properties for both sample point generation and noise mollification, leading to improved computational efficiency and sample complexity compared to existing methods. We also extend the algorithm to convex-constrained settings with similar theoretical guarantees. When the target function is defined as the expectation of a random field, we extend our approach to leverage adaptive random subspaces and establish results on the approximation capacity of the adaptive procedure. Our theoretical findings are supported by numerical studies on both synthetic data and on a more challenging stochastic simulation problem in computational finance.
LG Jun 30, 2024
Kernel Neural Operators (KNOs) for Scalable, Memory-efficient, Geometrically-flexible Operator Learning
Matthew Lowery, John Turnage, Zachary Morrow et al.
This paper introduces the Kernel Neural Operator (KNO), a provably convergent operator-learning architecture that utilizes compositions of deep kernel-based integral operators for function-space approximation of operators (maps from functions to functions). The KNO decouples the choice of kernel from the numerical integration scheme (quadrature), thereby naturally allowing for operator learning with explicitly-chosen trainable kernels on irregular geometries. On irregular domains, this allows the KNO to utilize domain-specific quadrature rules. To help ameliorate the curse of dimensionality, we also leverage an efficient dimension-wise factorization algorithm on regular domains. More importantly, the ability to explicitly specify kernels also allows the use of highly expressive, non-stationary, neural anisotropic kernels whose parameters are computed by training neural networks. Numerical results demonstrate that on existing benchmarks the training and test accuracy of KNOs is comparable to or higher than popular operator learning techniques while typically using an order of magnitude fewer trainable parameters, with the more expressive kernels proving important to attaining high accuracy. KNOs thus facilitate low-memory, geometrically-flexible, deep operator learning, while retaining the implementation simplicity and transparency of traditional kernel methods from both scientific computing and machine learning.
LG Feb 24, 2022
Learning POD of Complex Dynamics Using Heavy-ball Neural ODEs
Justin Baker, Elena Cherkaev, Akil Narayan et al.
Proper orthogonal decomposition (POD) allows reduced-order modeling of complex dynamical systems at a substantial level, while maintaining a high degree of accuracy in modeling the underlying dynamical systems. Advances in machine learning algorithms enable learning POD-based dynamics from data and making accurate and fast predictions of dynamical systems. In this paper, we leverage the recently proposed heavy-ball neural ODEs (HBNODEs) [Xia et al. NeurIPS, 2021] for learning data-driven reduced-order models (ROMs) in the POD context, in particular, for learning dynamics of time-varying coefficients generated by the POD analysis on training snapshots generated from solving full order models. HBNODE enjoys several practical advantages for learning POD-based ROMs with theoretical guarantees, including 1) HBNODE can learn long-term dependencies effectively from sequential observations and 2) HBNODE is computationally efficient in both training and testing. We compare HBNODE with other popular ROMs on several complex dynamical systems, including the von Kármán Street flow, the Kurganov-Petrova-Popov equation, and the one-dimensional Euler equations for fluids modeling.
COMP-PH Oct 26, 2021
A Metalearning Approach for Physics-Informed Neural Networks (PINNs): Application to Parameterized PDEs
Michael Penwarden, Shandian Zhe, Akil Narayan et al.
Physics-informed neural networks (PINNs) as a means of discretizing partial differential equations (PDEs) are garnering much attention in the Computational Science and Engineering (CS&E) world. At least two challenges exist for PINNs at present: an understanding of accuracy and convergence characteristics with respect to tunable parameters and identification of optimization strategies that make PINNs as efficient as other computational science tools. The cost of PINNs training remains a major challenge of Physics-informed Machine Learning (PiML) - and, in fact, machine learning (ML) in general. This paper is meant to move towards addressing the latter through the study of PINNs on new tasks, for which parameterized PDEs provide a good testbed application as tasks can be easily defined in this context. Following the ML world, we introduce metalearning of PINNs with application to parameterized PDEs. By introducing metalearning and transfer learning concepts, we can greatly accelerate the PINNs optimization process. We present a survey of model-agnostic metalearning, and then discuss our model-aware metalearning applied to PINNs as well as implementation considerations and algorithmic complexity. We then test our approach on various canonical forward parameterized PDEs that have been presented in the emerging PINNs literature.
LG Oct 16, 2021
Meta-Learning with Adjoint Methods
Shibo Li, Zheng Wang, Akil Narayan et al.
Model Agnostic Meta Learning (MAML) is widely used to find a good initialization for a family of tasks. Despite its success, a critical challenge in MAML is to calculate the gradient w.r.t. the initialization of a long training trajectory for the sampled tasks, because the computation graph can rapidly explode and the computational cost is very expensive. To address this problem, we propose Adjoint MAML (A-MAML). We view gradient descent in the inner optimization as the evolution of an Ordinary Differential Equation (ODE). To efficiently compute the gradient of the validation loss w.r.t. the initialization, we use the adjoint method to construct a companion, backward ODE. To obtain the gradient w.r.t. the initialization, we only need to run the standard ODE solver twice -- one is forward in time that evolves a long trajectory of gradient flow for the sampled task; the other is backward and solves the adjoint ODE. We need not create or expand any intermediate computational graphs, adopt aggressive approximations, or impose proximal regularizers in the training loss. Our approach is cheap, accurate, and adaptable to different trajectory lengths. We demonstrate the advantage of our approach in both synthetic and real-world meta-learning tasks.
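The view of inner-loop gradient descent as the evolution of an ODE can be illustrated on a toy problem. The sketch below is not the A-MAML code and does not implement the adjoint step; it assumes a hypothetical one-dimensional quadratic loss L(theta) = 0.5 * a * theta^2 (the names `a`, `theta0`, `gradient_descent`, `gradient_flow`, and `flow_gap` are invented for illustration). For this loss the gradient flow d(theta)/dt = -a * theta has the closed form theta(t) = theta0 * exp(-a * t), and gradient descent with step size h is its forward Euler discretization, so the two should agree more closely as h shrinks.

```python
import math

def gradient_descent(theta0, a, step, n_steps):
    """Run n_steps of gradient descent on L(theta) = 0.5 * a * theta**2."""
    theta = theta0
    for _ in range(n_steps):
        grad = a * theta              # dL/dtheta for the assumed quadratic loss
        theta = theta - step * grad   # one forward Euler step of the gradient flow
    return theta

def gradient_flow(theta0, a, t):
    """Exact solution of d(theta)/dt = -a * theta at time t."""
    return theta0 * math.exp(-a * t)

def flow_gap(theta0, a, t, n_steps):
    """Absolute gap between gradient descent and the gradient flow at time t."""
    step = t / n_steps
    discrete = gradient_descent(theta0, a, step, n_steps)
    return abs(discrete - gradient_flow(theta0, a, t))

# Same time horizon, two different step sizes.
coarse_gap = flow_gap(theta0=1.0, a=2.0, t=1.0, n_steps=10)
fine_gap = flow_gap(theta0=1.0, a=2.0, t=1.0, n_steps=1000)
```

With the finer step the discrete trajectory tracks the continuous flow much more closely, which is the correspondence that makes an ODE-based treatment of a long inner trajectory plausible.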
COMP-PH Jun 25, 2021
Multifidelity Modeling for Physics-Informed Neural Networks (PINNs)
Michael Penwarden, Shandian Zhe, Akil Narayan et al.
Multifidelity simulation methodologies are often used in an attempt to judiciously combine low-fidelity and high-fidelity simulation results in an accuracy-increasing, cost-saving way. Candidates for this approach are simulation methodologies for which there are fidelity differences connected with significant computational cost differences. Physics-informed Neural Networks (PINNs) are candidates for these types of approaches due to the significant difference in training times required when different fidelities (expressed in terms of architecture width and depth as well as optimization criteria) are employed. In this paper, we propose a particular multifidelity approach applied to PINNs that exploits low-rank structure. We demonstrate that width, depth, and optimization criteria can be used as parameters related to model fidelity, and show numerical justification of cost differences in training due to fidelity parameter choices. We test our multifidelity scheme on various canonical forward PDE models that have been presented in the emerging PINNs literature.
NA Mar 29, 2021
A bandit-learning approach to multifidelity approximation
Yiming Xu, Vahid Keshavarzzadeh, Robert M. Kirby et al.
Multifidelity approximation is an important technique in scientific computation and simulation. In this paper, we introduce a bandit-learning approach for leveraging data of varying fidelities to achieve precise estimates of the parameters of interest. Under a linear model assumption, we formulate a multifidelity approximation as a modified stochastic bandit, and analyze the loss for a class of policies that uniformly explore each model before exploiting. Utilizing the estimated conditional mean-squared error, we propose a consistent algorithm, adaptive Explore-Then-Commit (AETC), and establish a corresponding trajectory-wise optimality result. These results are then extended to the case of vector-valued responses, where we demonstrate that the algorithm is efficient without the need to worry about estimating high-dimensional parameters. The main advantage of our approach is that we require neither hierarchical model structure nor a priori knowledge of statistical information (e.g., correlations) about or between models. Instead, the AETC algorithm requires only knowledge of which model is a trusted high-fidelity model, along with (relative) computational cost estimates of querying each model. Numerical experiments are provided at the end to support our theoretical findings.
NA Apr 13, 2020
Analysis of The Ratio of $\ell_1$ and $\ell_2$ Norms in Compressed Sensing
Yiming Xu, Akil Narayan, Hoang Tran et al.
We first propose a novel criterion that guarantees that an $s$-sparse signal is the local minimizer of the $\ell_1/\ell_2$ objective; our criterion is interpretable and useful in practice. We also give the first uniform recovery condition using a geometric characterization of the null space of the measurement matrix, and show that this condition is easily satisfied for a class of random matrices. We also present analysis on the robustness of the procedure when noise pollutes data. Numerical experiments are provided that compare $\ell_1/\ell_2$ with some other popular non-convex methods in compressed sensing. Finally, we propose a novel initialization approach to accelerate the numerical optimization procedure. We call this initialization approach "support selection", and we demonstrate that it empirically improves the performance of existing $\ell_1/\ell_2$ algorithms.
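As a rough illustration of why the $\ell_1/\ell_2$ ratio serves as a sparsity-promoting objective: for any nonzero vector in $n$ dimensions the ratio lies between 1 and $\sqrt{n}$, and a vector with $s$ nonzero entries of equal magnitude has ratio exactly $\sqrt{s}$. The sketch below only evaluates the ratio; it is not the recovery algorithm or the support selection initialization from the paper, and the helper name `l1_over_l2` and the test vectors are assumptions made for illustration.

```python
import math

def l1_over_l2(x):
    """Ratio of the l1 norm to the l2 norm of a nonzero vector."""
    l1 = sum(abs(v) for v in x)
    l2 = math.sqrt(sum(v * v for v in x))
    if l2 == 0.0:
        raise ValueError("ratio is undefined for the zero vector")
    return l1 / l2

# Two vectors in dimension 8: one with 2 nonzeros, one with no zeros.
sparse = [0.0, 3.0, 0.0, 0.0, -3.0, 0.0, 0.0, 0.0]
dense = [1.0] * 8

sparse_ratio = l1_over_l2(sparse)   # equals sqrt(2) for two equal-magnitude entries
dense_ratio = l1_over_l2(dense)     # equals sqrt(8), the maximum in dimension 8
```

The ratio is invariant to rescaling of the vector, which is one reason it is sometimes preferred over the plain $\ell_1$ norm as a proxy for sparsity.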
NA Apr 19, 2019
Model reduction for fractional elliptic problems using Kato's formula
Huy Dinh, Harbir Antil, Yanlai Chen et al.
We propose a novel numerical algorithm utilizing model reduction for computing solutions to stationary partial differential equations involving the spectral fractional Laplacian. Our approach utilizes a known characterization of the solution in terms of an integral of solutions to classical elliptic problems. We reformulate this integral into an expression whose continuous and discrete formulations are stable; the discrete formulations are stable independent of all discretization parameters. We subsequently apply the reduced basis method to accomplish model order reduction for the integrand. Our choice of quadrature in discretization of the integral is a global Gaussian quadrature rule that we observe is more efficient than previously proposed quadrature rules. Finally, the model reduction approach enables one to compute solutions to multi-query fractional Laplace problems with an order of magnitude less cost than a traditional solver.
NA Sep 24, 2018
A robust error estimator and a residual-free error indicator for reduced basis methods
Yanlai Chen, Jiahua Jiang, Akil Narayan
The Reduced Basis Method (RBM) is a rigorous model reduction approach for solving parametrized partial differential equations. It identifies a low-dimensional subspace for approximation of the parametric solution manifold that is embedded in high-dimensional space. A reduced order model is subsequently constructed in this subspace. RBM relies on residual-based error indicators or a posteriori error bounds to guide construction of the reduced solution subspace, to serve as a stopping criterion, and to certify the resulting surrogate solutions. Unfortunately, it is well-known that the standard algorithm for residual norm computation suffers from premature stagnation at the level of the square root of machine precision. In this paper, we develop two alternatives to the standard offline phase of reduced basis algorithms. First, we design a robust strategy for computation of residual error indicators that allows RBM algorithms to enrich the solution subspace with accuracy beyond root machine precision. Second, we propose a new error indicator based on the Lebesgue function in interpolation theory. This error indicator does not require computation of residual norms, and instead only requires the ability to compute the RBM solution. This residual-free indicator is rigorous in that it bounds the error committed by the RBM approximation, but up to an uncomputable multiplicative constant. Because of this, the residual-free indicator is effective in choosing snapshots during the offline RBM phase, but cannot currently be used to certify error that the approximation commits. However, it circumvents the need for a posteriori analysis of numerical methods, and therefore can be effective on problems where such a rigorous estimate is hard to derive.
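The abstract does not spell out why residual norms stagnate near the square root of machine precision. One common explanation, assumed here, is that the squared norm is assembled from an expanded quadratic form whose large terms nearly cancel, so any true value below roughly machine epsilon times the size of those terms is lost. The sketch below is a generic floating-point demonstration of that cancellation, not the RBM residual computation itself; the vectors and function names are hypothetical.

```python
def dot(u, v):
    """Plain dot product of two equal-length lists."""
    return sum(ui * vi for ui, vi in zip(u, v))

def squared_distance_direct(a, b):
    """Subtract first, then square: small differences are preserved."""
    return sum((ai - bi) ** 2 for ai, bi in zip(a, b))

def squared_distance_expanded(a, b):
    """Same quantity via the expansion a.a - 2 a.b + b.b (prone to cancellation)."""
    return dot(a, a) - 2.0 * dot(a, b) + dot(b, b)

a = [1.0, 2.0, 3.0]
b = [ai + 1e-10 for ai in a]      # true squared distance is about 3e-20

direct = squared_distance_direct(a, b)
expanded = squared_distance_expanded(a, b)
```

In double precision the three terms of the expanded form are each close to 14, so their rounding errors are many orders of magnitude larger than the true value of about 3e-20; the expanded result carries no correct digits, while the direct form remains accurate.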
NA Aug 1, 2018
Certified reduced basis methods for fractional Laplace equations via extension
Harbir Antil, Yanlai Chen, Akil Narayan
Fractional Laplace equations are becoming important tools for mathematical modeling and prediction. Recent years have shown much progress in developing accurate and robust algorithms to numerically solve such problems, yet most solvers for fractional problems are computationally expensive. Practitioners are often interested in choosing the fractional exponent of the mathematical model to match experimental and/or observational data; this requires the computational solution to the fractional equation for several values of both the exponent and other parameters that enter the model, which is a computationally expensive many-query problem. To address this difficulty, we present a model order reduction strategy for fractional Laplace problems utilizing the reduced basis method (RBM). Our RBM algorithm for this fractional partial differential equation (PDE) allows us to accomplish significant acceleration compared to a traditional PDE solver while maintaining accuracy. Our numerical results demonstrate the accuracy and efficiency of our RBM algorithm on fractional Laplace problems in two spatial dimensions.
NA Sep 12, 2017
Parametric/Stochastic Model Reduction: Low-Rank Representation, Non-Intrusive Bi-Fidelity Approximation, and Convergence Analysis
Jerrad Hampton, Hillary Fairbanks, Akil Narayan et al.
For practical model-based demands, such as design space exploration and uncertainty quantification (UQ), a high-fidelity model that produces accurate outputs often has high computational cost, while a low-fidelity model with less accurate outputs has low computational cost. It is often possible to construct a bi-fidelity model having accuracy comparable with the high-fidelity model and computational cost comparable with the low-fidelity model. This work presents the construction and analysis of a non-intrusive (i.e., sample-based) bi-fidelity model that relies on the low-rank structure of the map between model parameters/uncertain inputs and the solution of interest, if it exists. Specifically, we derive a novel, pragmatic estimate for the error committed by this bi-fidelity model. We show that this error bound can be used to determine if a given pair of low- and high-fidelity models will lead to an accurate bi-fidelity approximation. The cost of this error bound is relatively small and depends on the solution rank. The value of this error estimate is demonstrated using two example problems in the context of UQ, involving linear and non-linear partial differential equations.
NA Aug 3, 2017
Weighted approximate Fekete points: Sampling for least-squares polynomial approximation
Ling Guo, Akil Narayan, Liang Yan et al.
We propose and analyze a weighted greedy scheme for computing deterministic sample configurations in multidimensional space for performing least-squares polynomial approximations on $L^2$ spaces weighted by a probability density function. Our procedure is a particular weighted version of the approximate Fekete points method, with the weight function chosen as the (inverse) Christoffel function. Our procedure has theoretical advantages: when linear systems with optimal condition number exist, the procedure finds them. In the one-dimensional setting with any density function, our greedy procedure almost always generates optimally-conditioned linear systems. Our method also has practical advantages: our procedure is impartial to compactness of the domain of approximation, and uses only pivoted linear algebraic routines. We show through numerous examples that our sampling design outperforms competing randomized and deterministic designs when the domain is both low and high dimensional.
NA Jul 20, 2017
Sequential data assimilation with multiple nonlinear models and applications to subsurface flow
Lun Yang, Akil Narayan, Peng Wang
Complex systems are often described with competing models. Such divergence of interpretation on the system may stem from model fidelity, mathematical simplicity, and more generally, our limited knowledge of the underlying processes. Meanwhile, available but limited observations of system state could further complicate one's prediction choices. Over the years, data assimilation techniques, such as the Kalman filter, have become essential tools for improved system estimation by incorporating both model forecasts and measurements; but their potential to mitigate the impacts of the aforementioned model-form uncertainty has yet to be developed. Based on an earlier study of the Multi-model Kalman filter, we propose a novel framework to assimilate multiple models with observation data for nonlinear systems, using the extended Kalman filter, the ensemble Kalman filter and the particle filter, respectively. Through numerical examples of subsurface flow, we demonstrate that the new assimilation framework provides an effective and improved forecast of system behaviour.
NA Sep 16, 2016
A goal-oriented RBM-Accelerated generalized polynomial chaos algorithm
Jiahua Jiang, Yanlai Chen, Akil Narayan
The non-intrusive generalized Polynomial Chaos (gPC) method is a popular computational approach for solving partial differential equations (PDEs) with random inputs. The main hurdle preventing its efficient direct application for high-dimensional input parameters is that the size of many parametric sampling meshes grows exponentially in the number of inputs (the "curse of dimensionality"). In this paper, we design a weighted version of the reduced basis method (RBM) for use in the non-intrusive gPC framework. We construct an RBM surrogate that can rigorously achieve a user-prescribed error tolerance, and ultimately is used to more efficiently compute a gPC approximation non-intrusively. The algorithm is capable of speeding up traditional non-intrusive gPC methods by orders of magnitude without degrading accuracy, assuming that the solution manifold has low Kolmogorov width. Numerical experiments on our test problems show that the relative efficiency improves as the parametric dimension increases, demonstrating the potential of the method in delaying the curse of dimensionality. Theoretical results as well as numerical evidence justify these findings.
NA Nov 18, 2009
Deterministic Numerical Schemes for the Boltzmann Equation
Akil Narayan, Andreas Klöckner
This article describes methods for the deterministic simulation of the collisional Boltzmann equation. It presumes that the transport and collision parts of the equation are to be simulated separately in the time domain. Time stepping schemes to achieve the splitting as well as numerical methods for each part of the operator are reviewed, with an emphasis on clearly exposing the challenges posed by the equation as well as their resolution by various schemes.