NA May 19, 2013
Two-step greedy algorithm for reduced order quadratures
Harbir Antil, Scott E. Field, Frank Herrmann et al.
We present an algorithm to generate application-specific, global reduced order quadratures (ROQ) for multiple fast evaluations of weighted inner products between parameterized functions. If a reduced basis (RB) or any other projection-based model reduction technique is applied, the dimensionality of integrands is reduced dramatically; however, the cost of approximating the integrands by projection still scales as the size of the original problem. In contrast, using discrete empirical interpolation (DEIM) points as ROQ nodes leads to a computational cost which depends linearly on the dimension of the reduced space. Generation of a reduced basis via a greedy procedure requires a training set, which for products of functions can be very large. Since this direct approach can be impractical in many applications, we propose instead a two-step greedy targeted towards approximation of such products. We present numerical experiments demonstrating the accuracy and the efficiency of the two-step approach. The presented ROQ are expected to display very fast convergence whenever there is regularity with respect to parameter variation. We find that for the particular application here considered, one driven by gravitational wave physics, the two-step approach speeds up the offline computations to build the ROQ by more than two orders of magnitude. Furthermore, the resulting ROQ rule is found to converge exponentially with the number of nodes, and a factor of ~50 savings, without loss of accuracy, is observed in evaluations of inner products when ROQ are used as a downsampling strategy for equidistant samples using the trapezoidal rule. While the primary focus of this paper is on quadrature rules for inner products of parameterized functions, our method can be easily adapted to integrations of single parameterized functions, and some examples of this type are considered.
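The interpolation-based quadrature idea in this abstract can be written out as a short derivation. This is a sketch under stated assumptions, not the paper's exact notation: the symbols $e_i$ (reduced basis functions for the integrand), $x_j$ (DEIM nodes), $V$, $B_j$, $\omega_j$ and the weight $w$ are all names introduced here for illustration.

```latex
% Assumed setup: basis e_1,...,e_m, DEIM nodes x_1,...,x_m, weight w(x).
% Interpolation matrix and empirical interpolant of an integrand f:
V_{ji} = e_i(x_j), \qquad
I_m[f](x) = \sum_{j=1}^{m} B_j(x)\, f(x_j), \qquad
B_j(x) = \sum_{i=1}^{m} e_i(x)\, \bigl(V^{-1}\bigr)_{ij}.

% Integrating the interpolant gives a quadrature rule with m nodes:
\int f(x)\, w(x)\, dx \;\approx\; \int I_m[f](x)\, w(x)\, dx
  = \sum_{j=1}^{m} \omega_j\, f(x_j), \qquad
\omega_j = \int B_j(x)\, w(x)\, dx .
```

The weights $\omega_j$ are computed once offline; each online evaluation then needs only the $m$ nodal values $f(x_j)$, which is consistent with the linear dependence on the reduced dimension stated above. For inner products, $f$ would be a product of two parameterized functions, which is why the basis must approximate products.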
OC Apr 23, 2017
Controlling the Kelvin Force: Basic Strategies and Applications to Magnetic Drug Targeting
Harbir Antil, Ricardo H. Nochetto, Pablo Venegas
Motivated by problems arising in magnetic drug targeting, we propose to generate an almost constant Kelvin (magnetic) force in a target subdomain, moving along a prescribed trajectory. This is carried out by solving a minimization problem with a tracking type cost functional. The magnetic sources are assumed to be dipoles and the control variables are the magnetic field intensity, the source location and the magnetic field direction. The resulting magnetic field is shown to effectively steer the drug concentration, governed by a drift-diffusion PDE, from an initial to a desired location with limited spreading.
OC Dec 23, 2016
Optimizing the Kelvin force in a moving target subdomain
Harbir Antil, Ricardo H. Nochetto, Pablo Venegas
In order to generate a desired Kelvin (magnetic) force in a target subdomain moving along a prescribed trajectory, we propose a minimization problem with a tracking type cost functional. We use the so-called dipole approximation to realize the magnetic field, where the location and the direction of the magnetic sources are assumed to be fixed. The magnetic field intensity acts as the control and exhibits limiting pointwise constraints. We address two specific problems: the first one corresponds to a fixed final time whereas the second one deals with an unknown force to minimize the final time. We prove existence of solutions and deduce local uniqueness provided that a second order sufficient condition is valid. We use the classical backward Euler scheme for time discretization. For both problems we prove the $H^1$-weak convergence of this semi-discrete numerical scheme. This result is motivated by $Γ$-convergence and does not require second order sufficient condition. If the latter holds then we prove $H^1$-strong local convergence. We report computational results to assess the performance of the numerical methods. As an application, we study the control of magnetic nanoparticles as those used in magnetic drug delivery, where the optimized Kelvin force is used to transport the drug to a desired location.
CE Feb 13, 2019
Fractional Operators Applied to Geophysical Electromagnetics
Chester J. Weiss, Bart G. van Bloemen Waanders, Harbir Antil
A growing body of applied mathematics literature in recent years has focussed on the application of fractional calculus to problems of anomalous transport. In these analyses, the anomalous transport (of charge, tracers, fluid, etc.) is presumed attributable to long-range correlations of material properties within an inherently complex, and in some cases self-similar, conducting medium. Rather than considering an exquisitely discretized (and computationally intractable) representation of the medium, the complex and spatially correlated heterogeneity is represented through reformulation of the PDE governing the relevant transport physics such that its coefficients are, instead, smooth but paired with fractional-order space derivatives. Here we apply these concepts to the scalar Helmholtz equation and its use in electromagnetic interrogation of Earth's interior through the magnetotelluric method. We outline a practical algorithm for solving the Helmholtz equation using spectral methods coupled with finite element discretization. Execution of this algorithm for the magnetotelluric problem reveals several interesting features observable in field data: long-range correlation of the predicted electromagnetic fields; a power-law relationship between the squared impedance amplitude and squared wavenumber whose slope is a function of the fractional exponent within the governing Helmholtz equation; and a non-constant apparent resistivity spectrum whose variability arises solely from the fractional exponent. In geologic settings characterized by self-similarity (e.g. fracture systems; thick and richly-textured sedimentary sequences, etc.) we posit that diagnostics are useful for geologic characterization of features far below the typical resolution limit of electromagnetic methods in geophysics.
OC Nov 26, 2018
External optimal control of nonlocal PDEs
Harbir Antil, Ratna Khatri, Mahamadi Warma
Very recently M. Warma has shown that for nonlocal PDEs associated with the fractional Laplacian, the classical notion of controllability from the boundary does not make sense and therefore it must be replaced by a control that is localized outside the open set where the PDE is solved. Having learned from the above mentioned result, in this paper we introduce a new class of source identification and optimal control problems where the source/control is located outside the observation domain where the PDE is satisfied. The classical diffusion models lack this flexibility as they assume that the source/control is located either inside or on the boundary. This is essentially due to the locality property of the underlying operators. We use the nonlocality of the fractional operator to create a framework that now allows placing a source/control outside the observation domain. We consider the Dirichlet, Robin and Neumann source identification or optimal control problems. These problems require dealing with the nonlocal normal derivative (that we shall call interaction operator). We create a functional analytic framework and show well-posedness and derive the first order optimality conditions for these problems. We introduce a new approach to approximate, with convergence rate, the Dirichlet problem with nonzero exterior condition. The numerical examples confirm our theoretical findings and illustrate the practicality of our approach.
OC Jan 14, 2019
Optimal control of fractional semilinear PDEs
Harbir Antil, Mahamadi Warma
In this paper we consider the optimal control of semilinear fractional PDEs with both spectral and integral fractional diffusion operators of order $2s$ with $s \in (0,1)$. We first prove the boundedness of solutions to both semilinear fractional PDEs under minimal regularity assumptions on domain and data. We next introduce an optimal growth condition on the nonlinearity to show the Lipschitz continuity of the solution map for the semilinear elliptic equations with respect to the data. We further apply our ideas to show existence of solutions to optimal control problems with semilinear fractional equations as constraints. Under the standard assumptions on the nonlinearity (twice continuously differentiable) we derive the first and second order optimality conditions.
OC Mar 27, 2018
Sobolev spaces with non-Muckenhoupt weights, fractional elliptic operators, and applications
Harbir Antil, Carlos N. Rautenberg
We propose a new variational model in weighted Sobolev spaces with non-standard weights and applications to image processing. We show that these weights are, in general, not of Muckenhoupt type and therefore the classical analysis tools may not apply. For special cases of the weights, the resulting variational problem is known to be equivalent to the fractional Poisson problem. The trace space for the weighted Sobolev space is identified to be embedded in a weighted $L^2$ space. We propose a finite element scheme to solve the Euler-Lagrange equations, and for the image denoising application we propose an algorithm to identify the unknown weights. The approach is illustrated on several test problems and it yields better results when compared to the existing total variation techniques.
OC May 31, 2019
Optimal Control of Fractional Elliptic PDEs with State Constraints and Characterization of the dual of Fractional Order Sobolev Spaces
Harbir Antil, Deepanshu Verma, Mahamadi Warma
This paper introduces the notion of state constraints for optimal control problems governed by fractional elliptic PDEs of order $s \in (0,1)$. There are several mathematical tools that are developed during the process to study this problem, for instance, the characterization of the dual of the fractional order Sobolev spaces and well-posedness of fractional PDEs with measure-valued datum. These tools are widely applicable. We show well-posedness of the optimal control problem and derive the first order optimality conditions. Notice that the adjoint equation is a fractional PDE with measure as the right-hand-side datum. We use the characterization of the fractional order dual spaces to study the regularity of the state and adjoint equations. We emphasize that the classical case ($s=1$) was considered by E. Casas in \cite{ECasas_1986a} but almost none of the existing results are applicable to our fractional case.
NA Dec 28, 2016
Optimization with respect to order in a fractional diffusion model: analysis, approximation and algorithmic aspects
Harbir Antil, Enrique Otarola, Abner J. Salgado
We consider an identification problem, where the state $u$ is governed by a fractional elliptic equation and the unknown variable corresponds to the order $s \in (0,1)$ of the underlying operator. We study the existence of an optimal pair $(\bar s, \bar u)$ and provide sufficient conditions for its local uniqueness. We develop semi-discrete and fully discrete algorithms to approximate the solutions to our identification problem and provide a convergence analysis. We present numerical illustrations that confirm and extend our theory.
LG Mar 15, 2022
NINNs: Nudging Induced Neural Networks
Harbir Antil, Rainald Löhner, Randy Price
New algorithms called nudging induced neural networks (NINNs), to control and improve the accuracy of deep neural networks (DNNs), are introduced. The NINNs framework can be applied to almost all pre-existing DNNs, with forward propagation, with costs comparable to existing DNNs. NINNs work by adding a feedback control term to the forward propagation of the network. The feedback term nudges the neural network towards a desired quantity of interest. NINNs offer multiple advantages, for instance, they lead to higher accuracy when compared with existing data assimilation algorithms such as nudging. Rigorous convergence analysis is established for NINNs. The algorithmic and theoretical findings are illustrated on examples from data assimilation and chemically reacting flows.
NA Nov 30, 2022
Neural Network Representation of Time Integrators
Rainald Löhner, Harbir Antil
Deep neural network (DNN) architectures are constructed that are the exact equivalent of explicit Runge-Kutta schemes for numerical time integration. The network weights and biases are given, i.e., no training is needed. In this way, the only task left for physics-based integrators is the DNN approximation of the right-hand side. This allows one to clearly delineate the approximation estimates for right-hand side errors and time integration errors. The architecture required for the integration of a simple mass-damper-stiffness case is included as an example.
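To make the correspondence concrete, here is a minimal sketch of one explicit Runge-Kutta step written as a stack of layers whose weights are fixed by the Butcher tableau. It is an illustration under assumptions, not the architecture from the paper: the function names (`rhs`, `rk_layer_block`, `integrate`), the choice of the classical four-stage scheme, and the parameter values `m`, `c`, `k` are all hypothetical.

```python
import math

# Classical RK4 Butcher tableau. These fixed numbers play the role of
# network weights: nothing here is trained.
A = [[0.0, 0.0, 0.0, 0.0],
     [0.5, 0.0, 0.0, 0.0],
     [0.0, 0.5, 0.0, 0.0],
     [0.0, 0.0, 1.0, 0.0]]
B = [1.0 / 6.0, 1.0 / 3.0, 1.0 / 3.0, 1.0 / 6.0]


def rhs(u, m=1.0, c=0.0, k=1.0):
    """Right-hand side of m x'' + c x' + k x = 0 as a first-order system.

    The parameter values are assumptions chosen for illustration."""
    x, v = u
    return [v, -(c * v + k * x) / m]


def rk_layer_block(u, dt, f=rhs):
    """One time step viewed as a small feed-forward block.

    Each stage is a fixed linear combination of earlier stage outputs
    (plus a skip connection from u), followed by an evaluation of f,
    which is the only nonlinear component."""
    stages = []
    for i in range(len(B)):
        ui = [u[d] + dt * sum(A[i][j] * stages[j][d] for j in range(i))
              for d in range(len(u))]
        stages.append(f(ui))
    # Output layer: skip connection plus weighted sum of all stages.
    return [u[d] + dt * sum(B[i] * stages[i][d] for i in range(len(B)))
            for d in range(len(u))]


def integrate(u0, dt, n_steps):
    """Apply the same fixed-weight block n_steps times."""
    u = list(u0)
    for _ in range(n_steps):
        u = rk_layer_block(u, dt)
    return u
```

In this sketch, replacing `rhs` with a trained approximation would be the only learned part, so any error in the result splits into a right-hand-side error and a time integration error.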
OC May 4, 2016
Some applications of weighted norm inequalities to the error analysis of PDE constrained optimization problems
Harbir Antil, Enrique Otarola, Abner J. Salgado
The purpose of this work is to illustrate how the theory of Muckenhoupt weights, Muckenhoupt weighted Sobolev spaces and the corresponding weighted norm inequalities can be used in the analysis and discretization of PDE constrained optimization problems. We consider: a linear quadratic constrained optimization problem where the state solves a nonuniformly elliptic equation; a problem where the cost involves pointwise observations of the state and one where the state has singular sources, e.g. point masses. For all three examples we propose and analyze numerical schemes and provide error estimates in two and three dimensions. While some of these problems might have been considered before in the literature, our approach allows for a simpler, Hilbert space-based, analysis and discretization and further generalizations.
OC Dec 19, 2017
Fractional Elliptic Quasi-Variational Inequalities: Theory and Numerics
Harbir Antil, Carlos N. Rautenberg
This paper introduces an elliptic quasi-variational inequality (QVI) problem class with fractional diffusion of order $s \in (0,1)$, studies existence and uniqueness of solutions and develops a solution algorithm. As the fractional diffusion prohibits the use of standard tools to approximate the QVI, instead we realize it as a Dirichlet-to-Neumann map for a problem posed on a semi-infinite cylinder. We first study existence and uniqueness of solutions for this extended QVI and then transfer the results to the fractional QVI: This introduces a new paradigm in the field of fractional QVIs. Further, we truncate the semi-infinite cylinder and show that the solution to the truncated problem converges to the solution of the extended problem, under fairly mild assumptions, as the truncation parameter $τ$ tends to infinity. Since the constraint set changes with the solution, we develop an argument using Mosco convergence. We state an algorithm to solve the truncated problem and show its convergence in function space. Finally, we conclude with several illustrative numerical examples.
LG Aug 23, 2023
On-Manifold Projected Gradient Descent
Aaron Mahler, Tyrus Berry, Tom Stephens et al.
This work provides a computable, direct, and mathematically rigorous approximation to the differential geometry of class manifolds for high-dimensional data, along with nonlinear projections from input space onto these class manifolds. The tools are applied to the setting of neural network image classifiers, where we generate novel, on-manifold data samples, and implement a projected gradient descent algorithm for on-manifold adversarial training. The susceptibility of neural networks (NNs) to adversarial attack highlights the brittle nature of NN decision boundaries in input space. Introducing adversarial examples during training has been shown to reduce the susceptibility of NNs to adversarial attack; however, it has also been shown to reduce the accuracy of the classifier if the examples are not valid examples for that class. Realistic "on-manifold" examples have been previously generated from class manifolds in the latent space of an autoencoder. Our work explores these phenomena in a geometric and computational setting that is much closer to the raw, high-dimensional input space than can be provided by VAE or other black box dimensionality reductions. We employ conformally invariant diffusion maps (CIDM) to approximate class manifolds in diffusion coordinates, and develop the Nyström projection to project novel points onto class manifolds in this setting. On top of the manifold approximation, we leverage the spectral exterior calculus (SEC) to determine geometric quantities such as tangent vectors of the manifold. We use these tools to obtain adversarial examples that reside on a class manifold, yet fool a classifier. These misclassifications then become explainable in terms of human-understandable manipulations within the data, by expressing the on-manifold adversary in the semantic basis on the manifold.
DC May 16, 2018
A Note on QR-Based Model Reduction: Algorithm, Software, and Gravitational Wave Applications
Harbir Antil, Dangxing Chen, Scott E. Field
While the proper orthogonal decomposition (POD) is optimal under certain norms, it is also expensive to compute. For large matrix sizes, it is well known that the QR decomposition provides a tractable alternative. Under the assumption that it is a rank-revealing QR (RRQR), the approximation error incurred is similar to the POD error and, furthermore, we show the existence of an RRQR with exactly the same error estimate as POD. To numerically realize an RRQR decomposition, we discuss the (iterative) modified Gram-Schmidt with pivoting (MGS) and the reduced basis method employing a greedy strategy. We show that these two seemingly different approaches, from the linear algebra and approximation theory communities, are in fact equivalent. Finally, we describe an MPI/OpenMP parallel code that implements one of the QR-based model reduction algorithms we analyze. This code was developed with model reduction in mind, and includes functionality for tasks that go beyond what is required for standard QR decompositions. We document the code's scalability and show it to be capable of tackling large problems. In particular, we apply our code to a model reduction problem motivated by gravitational waves emitted from binary black hole mergers and demonstrate excellent weak scalability on the supercomputer Blue Waters up to 32,768 cores and for complex, dense matrices as large as 10,000-by-3,276,800 (about half a terabyte in size).
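The equivalence claimed in this abstract can be illustrated on a toy example. The sketch below is a serial, pure-Python illustration under assumptions, not the MPI/OpenMP code described above: the function names `pivoted_mgs` and `greedy_rb`, the tolerance `tol`, and the real-valued snapshot columns are hypothetical choices.

```python
import math


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def norm(a):
    return math.sqrt(dot(a, a))


def pivoted_mgs(columns, tol=1e-10):
    """Modified Gram-Schmidt with column pivoting.

    Residuals of all columns are updated in place after each new basis
    vector; the next pivot is the column with the largest residual norm."""
    res = [list(c) for c in columns]
    basis, pivots = [], []
    while len(basis) < len(columns):
        norms = [norm(r) for r in res]
        p = max(range(len(res)), key=norms.__getitem__)
        if norms[p] <= tol:
            break
        q = [x / norms[p] for x in res[p]]
        basis.append(q)
        pivots.append(p)
        for r in res:
            coef = dot(q, r)
            for d in range(len(r)):
                r[d] -= coef * q[d]
    return basis, pivots


def greedy_rb(columns, tol=1e-10):
    """Greedy reduced basis selection.

    At each step the projection error of every training column onto the
    current basis is recomputed, and the worst-approximated column is
    orthonormalized and added to the basis."""
    basis, pivots = [], []
    while len(basis) < len(columns):
        errs = []
        for c in columns:
            r = list(c)
            for q in basis:
                coef = dot(q, r)
                r = [x - coef * y for x, y in zip(r, q)]
            errs.append(r)
        norms = [norm(r) for r in errs]
        p = max(range(len(errs)), key=norms.__getitem__)
        if norms[p] <= tol:
            break
        basis.append([x / norms[p] for x in errs[p]])
        pivots.append(p)
    return basis, pivots


# Toy training set: four snapshots spanning a three-dimensional space.
snapshots = [[1.0, 0.0, 0.0],
             [1.0, 1.0, 0.0],
             [2.0, 2.0, 0.1],
             [0.0, 0.0, 3.0]]
mgs_basis, mgs_pivots = pivoted_mgs(snapshots)
rb_basis, rb_pivots = greedy_rb(snapshots)
```

On this toy set both routines select the same columns in the same order; the only difference is bookkeeping, since pivoted MGS stores and updates residuals while the greedy loop recomputes them.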
OC Jan 18, 2017
Optimal control of the coefficient for fractional and regional fractional $p$-Laplace equations: Approximation and convergence
Harbir Antil, Mahamadi Warma
In this paper we study optimal control problems with either fractional or regional fractional $p$-Laplace equation, of order $s$ and $p\in [2,\infty)$, as constraints over a bounded open set with Lipschitz continuous boundary. The control, which fulfills the pointwise box constraints, is given by the coefficient of the involved operator. To overcome the degeneracy of both fractional $p$-Laplacians, we introduce a regularization for both operators. We show existence and uniqueness of solution to the regularized state equations and existence of solution to the regularized optimal control problems. We also prove several auxiliary results for the regularized problems which are of independent interest. We conclude with the convergence of the regularized solutions.
OC Mar 29, 2016
An a posteriori error analysis for an optimal control problem involving the fractional Laplacian
Harbir Antil, Enrique Otarola
In a previous work, we introduced a discretization scheme for a constrained optimal control problem involving the fractional Laplacian. For such a control problem, we derived optimal a priori error estimates that demand the convexity of the domain and some compatibility conditions on the data. To relax such restrictions, in this paper, we introduce and analyze an efficient and, under certain assumptions, reliable a posteriori error estimator. We realize the fractional Laplacian as the Dirichlet-to-Neumann map for a nonuniformly elliptic problem posed on a semi-infinite cylinder in one more spatial dimension. This extra dimension further motivates the design of an a posteriori error indicator. The latter is defined as the sum of three contributions, which come from the discretization of the state and adjoint equations and the control variable. The indicator for the state and adjoint equations relies on an anisotropic error estimator in Muckenhoupt weighted Sobolev spaces. The analysis is valid in any dimension. On the basis of the devised a posteriori error estimator, we design a simple adaptive strategy that exhibits optimal experimental rates of convergence.
AP Dec 25, 2015
The Stokes problem with Navier slip boundary condition: Minimal fractional Sobolev regularity of the domain
Harbir Antil, Ricardo H. Nochetto, Patrick Sodre
We prove well-posedness in reflexive Sobolev spaces of weak solutions to the stationary Stokes problem with Navier slip boundary condition over bounded domains $Ω$ of $\mathbb{R}^n$ of class $W^{2-1/s}_s$, $s>n$. Since such domains are of class $C^{1,1-n/s}$, our result improves upon the recent one by Amrouche-Seloula, who assume $Ω$ to be of class $C^{1,1}$. We deal with the slip boundary condition directly via a new localization technique, which features domain, space and operator decompositions. To flatten the boundary of $Ω$ locally, we construct a novel $W^2_s$ diffeomorphism for $Ω$ of class $W^{2-1/s}_s$. The fractional regularity gain, from $2-1/s$ to $2$, guarantees that the Piola transform is of class $W^1_s$. This allows us to transform $W^1_r$ vector fields without changing their regularity, provided $r\le s$, and preserve the unit normal which is Hölder. It is in this sense that the boundary regularity $W^{2-1/s}_s$ seems to be minimal.
OC Apr 18, 2022
An Optimal Time Variable Learning Framework for Deep Neural Networks
Harbir Antil, Hugo Díaz, Evelyn Herberg
Feature propagation in Deep Neural Networks (DNNs) can be associated with nonlinear discrete dynamical systems. The novelty of this paper lies in letting the discretization parameter (time step-size) vary from layer to layer and learning it within an optimization framework. The proposed framework can be applied to any of the existing networks such as ResNet, DenseNet or Fractional-DNN. This framework is shown to help overcome the vanishing and exploding gradient issues. Stability of some of the existing continuous DNNs such as Fractional-DNN is also studied. The proposed approach is applied to an ill-posed 3D-Maxwell's equation.
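As a rough illustration of a layer-dependent step size, the sketch below shows a ResNet-style forward pass in which each layer has its own time step. It covers only forward propagation and assumes the common residual form $y_{l+1} = y_l + \tau_l\,\sigma(W_l y_l + b_l)$ with $\sigma = \tanh$; the names `forward`, `matvec`, `weights`, `biases` and `taus` are hypothetical, and the optimization that would learn the step sizes is not shown.

```python
import math


def matvec(W, y):
    """Dense matrix-vector product for a list-of-rows matrix."""
    return [sum(w * x for w, x in zip(row, y)) for row in W]


def forward(y0, weights, biases, taus):
    """Forward propagation with a separate step size per layer.

    Each layer is one explicit Euler step of a discrete dynamical system,
    with step size taus[l] treated as a per-layer parameter."""
    y = list(y0)
    for W, b, tau in zip(weights, biases, taus):
        z = [math.tanh(s + bi) for s, bi in zip(matvec(W, y), b)]
        y = [yi + tau * zi for yi, zi in zip(y, z)]
    return y


# Two layers with identity weights, zero biases, and different step sizes.
identity = [[1.0, 0.0], [0.0, 1.0]]
out = forward([1.0, -1.0],
              weights=[identity, identity],
              biases=[[0.0, 0.0], [0.0, 0.0]],
              taus=[0.5, 0.1])
```

Setting every entry of `taus` to the same constant recovers a standard fixed-step residual network in this sketch, and a step size of zero turns the corresponding layer into the identity.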
OPTICS May 22
Polarization-Induced Beam Bending: Mathematical Model, Discretization, and Algorithm
Harbir Antil, Rainald Löhner, Sarswati Shah
We study a reduced hydrodynamic formulation of paraxial vector beam propagation in which the beam intensity, optical phase, and spatially-dependent polarization are coupled through a nonlinear dispersive system. While prior analytical work derived a solution for the beam path valid for short propagation distances, a fully resolved numerical treatment of the model over long ranges has not previously been available. Here we present a conservative numerical scheme for the coupled system, combining a finite-volume discretization of the intensity equation with monotone Hamilton-Jacobi (H-J) solvers for the phase dynamics and upwind transport of polarization. The method preserves the nonnegativity of the intensity and remains stable under long-distance propagation. We perform large-scale simulations over propagation distances of tens of meters, while resolving millimeter-scale transverse structure. The numerical results reproduce the analytically predicted and experimentally observed quadratic beam bending at short distances and reveal systematic deviations beyond the asymptotic regime. These deviations arise from nonlinear phase accumulation and dispersive effects that are captured by the full model but neglected in the short-distance approximation.
CV Jul 5, 2023
GNEP Based Dynamic Segmentation and Motion Estimation for Neuromorphic Imaging
Harbir Antil, David Sayre
This paper explores the application of event-based cameras in the domains of image segmentation and motion estimation. These cameras offer a groundbreaking technology by capturing visual information as a continuous stream of asynchronous events, departing from the conventional frame-based image acquisition. We introduce a Generalized Nash Equilibrium based framework that leverages the temporal and spatial information derived from the event stream to carry out segmentation and velocity estimation. To establish the theoretical foundations, we derive an existence criterion and propose a multi-level optimization method for calculating the equilibrium. The efficacy of this approach is shown through a series of experiments.
OC Nov 5, 2025
Optimal Boundary Control of Diffusion on Graphs via Linear Programming
Harbir Antil, Rainald Löhner, Felipe Pérez
We propose a linear programming (LP) framework for steady-state diffusion and flux optimization on geometric networks. The state variable satisfies a discrete diffusion law on a weighted, oriented graph, where conductances are scaled by edge lengths to preserve geometric fidelity. Boundary potentials act as controls that drive interior fluxes according to a linear network Laplacian. The optimization problem enforces physically meaningful sign and flux-cap constraints at all boundary edges, derived directly from a gradient bound. This yields a finite-dimensional LP whose feasible set is polyhedral, and whose boundedness and solvability follow from simple geometric or algebraic conditions on the network data. We prove that in the absence of negative recession directions (a condition automatically satisfied in the presence of finite box bounds, flux caps, or sign restrictions) the LP admits a global minimizer. Several sufficient conditions guaranteeing boundedness of the feasible region are identified, covering both full-rank and rank-deficient flux maps. The analysis connects classical results such as the Minkowski-Weyl decomposition, Hoffman's bound, and the fundamental theorem of linear programming with modern network-based diffusion modeling. Two large-scale examples illustrate the framework: (i) a typical large stadium in a major modern city, which forms a single connected component with relatively uniform corridor widths, and (ii) a complex street network emanating from a large, historical city center, which forms a multi-component system.
CV Aug 28, 2024
Dynamic Reconstruction from Neuromorphic Data
Harbir Antil, Daniel Blauvelt, David Sayre
Unlike traditional cameras, which synchronously register pixel intensity, neuromorphic sensors only register 'changes' at pixels where a change is occurring asynchronously. This enables neuromorphic sensors to sample at a microsecond level and efficiently capture the dynamics. Since only sequences of asynchronous event changes are recorded rather than brightness intensities over time, many traditional image processing techniques cannot be directly applied. Furthermore, existing approaches, including the ones recently introduced by the authors, use traditional images combined with neuromorphic event data to carry out reconstructions. The aim of this work is to introduce an optimization based approach to reconstruct images and dynamics only from the neuromorphic event data without any additional knowledge of the events. Each pixel is modeled temporally. The experimental results on real data highlight the efficacy of the presented approach, paving the way for efficient and accurate processing of neuromorphic sensor data in real-world applications.
OC Apr 30
Structure-Preserving Optimal Control of Maxwell's Equations with Applications to Source Cloaking
Harbir Antil, Yaw Owusu-Agyemang, Rohit Khandelwal et al.
We develop a structure-preserving solution framework for the optimal control of the time-dependent Maxwell's equations. Building on a well-posedness theory for a weak form of the forward problem, we first analyze a forward solver that couples Nédélec and Raviart-Thomas finite elements with Crank-Nicolson time stepping. The solver preserves the de Rham structure, enforces a discrete Gauss law, exactly satisfies a per-time-step energy balance, and converges to the weak solution under low regularity assumptions on the problem data, which are dictated by the optimal control setting. To control the Maxwell system, we add the curl of a space-time current density as a source to Ampère's law. The curl form yields charge conservation without auxiliary constraints. We prove the well-posedness and continuity of the control-to-state map, derive the adjoint system and a gradient representation for a tracking-type objective functional, and formulate a discrete optimization scheme that inherits structure preservation from the forward solver. Our discrete stationarity conditions are consistent with their continuous counterparts, and the discrete optimal controls converge, with mesh and time refinements, to the continuous optima. We demonstrate the merits of our optimal control formulation and the theoretical developments by numerically solving a series of source-cloaking model problems.
LG Oct 1, 2025
Randomized Matrix Sketching for Neural Network Training and Gradient Monitoring
Harbir Antil, Deepanshu Verma
Neural network training relies on gradient computation through backpropagation, yet memory requirements for storing layer activations present significant scalability challenges. We present the first adaptation of control-theoretic matrix sketching to neural network layer activations, enabling memory-efficient gradient reconstruction in backpropagation. This work builds on recent matrix sketching frameworks for dynamic optimization problems, where similar state trajectory storage challenges motivate sketching techniques. Our approach sketches layer activations using three complementary sketch matrices maintained through exponential moving averages (EMA) with adaptive rank adjustment, automatically balancing memory efficiency against approximation quality. Empirical evaluation on MNIST, CIFAR-10, and physics-informed neural networks demonstrates a controllable accuracy-memory tradeoff. We demonstrate a gradient monitoring application on MNIST showing how sketched activations enable real-time gradient norm tracking with minimal memory overhead. These results establish that sketched activation storage provides a viable path toward memory-efficient neural network training and analysis.
OC Sep 9, 2025
OCTANE -- Optimal Control for Tensor-based Autoencoder Network Emergence: Explicit Case
Ratna Khatri, Anthony Kolshorn, Colin Olson et al.
This paper presents a novel, mathematically rigorous framework for autoencoder-type deep neural networks that combines optimal control theory and low-rank tensor methods to yield memory-efficient training and automated architecture discovery. The learning task is formulated as an optimization problem constrained by differential equations representing the encoder and decoder components of the network and the corresponding optimality conditions are derived via a Lagrangian approach. Efficient memory compression is enabled by approximating differential equation solutions on low-rank tensor manifolds using an adaptive explicit integration scheme. These concepts are combined to form OCTANE (Optimal Control for Tensor-based Autoencoder Network Emergence) -- a unified training framework that yields compact autoencoder architectures, reduces memory usage, and enables effective learning, even with limited training data. The framework's utility is illustrated with application to image denoising and deblurring tasks and recommendations regarding governing hyperparameters are provided.
LG · May 16, 2023
A Note on Dimensionality Reduction in Deep Neural Networks using Empirical Interpolation Method
Harbir Antil, Madhu Gupta, Randy Price
The empirical interpolation method (EIM) is a well-known technique to efficiently approximate parameterized functions. This paper proposes to use the EIM algorithm to efficiently reduce the dimension of the training data within supervised machine learning. This is termed DNN-EIM. Applications in data science (e.g., MNIST) and parameterized (and time-dependent) partial differential equations (PDEs) are considered. The proposed DNNs in the case of classification are trained in parallel for each class. This approach is sequential, i.e., new classes can be added without having to retrain the network. In the case of PDEs, a DNN is designed corresponding to each EIM point. Again, these networks can be trained in parallel, for each EIM point. In all cases, the parallel networks require fewer than ten times the number of training weights. Significant gains are observed in terms of training times, without sacrificing accuracy.
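To make the idea of "EIM points" concrete, here is a minimal sketch of greedy point selection in its discrete form (DEIM). It is not the paper's implementation: the function name `deim_points`, the test family `1 / (1 + mu * x**2)`, and the choice of an SVD basis with 8 modes are all assumptions made for illustration.

```python
import numpy as np

def deim_points(U):
    """Greedy DEIM index selection for a basis U (n x m, independent columns)."""
    n, m = U.shape
    # First point: largest entry (in magnitude) of the first basis vector.
    idx = [int(np.argmax(np.abs(U[:, 0])))]
    for j in range(1, m):
        # Interpolate column j with the previous columns at the chosen rows.
        c = np.linalg.solve(U[idx, :j], U[idx, j])
        # The residual vanishes at the chosen rows; pick its largest entry next.
        r = U[:, j] - U[:, :j] @ c
        idx.append(int(np.argmax(np.abs(r))))
    return np.array(idx)

# Hypothetical snapshot matrix: one column per parameter value mu.
x = np.linspace(0.0, 1.0, 200)
mus = np.linspace(1.0, 10.0, 50)
S = np.stack([1.0 / (1.0 + mu * x**2) for mu in mus], axis=1)  # shape (200, 50)

m = 8  # assumed reduced dimension
U = np.linalg.svd(S, full_matrices=False)[0][:, :m]
idx = deim_points(U)

# Reconstruct an unseen member of the family from its values at the m points only.
f = 1.0 / (1.0 + 3.3 * x**2)
f_hat = U @ np.linalg.solve(U[idx, :], f[idx])
print("selected rows:", idx)
print("max interpolation error:", np.max(np.abs(f - f_hat)))
```

In this sketch only the `m` sampled entries `f[idx]` are needed to rebuild the full vector, which is the sense in which the selected points reduce the dimension of the data.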
LG · Apr 1, 2021
Novel DNNs for Stiff ODEs with Applications to Chemically Reacting Flows
Thomas S. Brown, Harbir Antil, Rainald Löhner et al.
Chemically reacting flows are common in engineering, such as hypersonic flow, combustion, explosions, manufacturing processes and environmental assessments. For combustion, the number of reactions can be significant (over 100) and due to the very large CPU requirements of chemical reactions (over 99%) a large number of flow and combustion problems are presently beyond the capabilities of even the largest supercomputers. Motivated by this, novel Deep Neural Networks (DNNs) are introduced to approximate stiff ODEs. Two approaches are compared, i.e., either learn the solution or the derivative of the solution to these ODEs. These DNNs are applied to multiple species and reactions common in chemically reacting flows. Experimental results show that it is helpful to account for the physical properties of species while designing DNNs. The proposed approach is shown to generalize well.
NA · Feb 8, 2021
Novel Deep neural networks for solving Bayesian statistical inverse
Harbir Antil, Howard C Elman, Akwum Onwunta et al.
We consider the simulation of Bayesian statistical inverse problems governed by large-scale linear and nonlinear partial differential equations (PDEs). Markov chain Monte Carlo (MCMC) algorithms are standard techniques to solve such problems. However, MCMC techniques are computationally challenging as they require several thousands of forward PDE solves. The goal of this paper is to introduce a fractional deep neural network based approach for the forward solves within an MCMC routine. Moreover, we discuss some approximation error estimates and illustrate the efficiency of our approach via several numerical examples.
OC · Apr 1, 2020
Fractional Deep Neural Network via Constrained Optimization
Harbir Antil, Ratna Khatri, Rainald Löhner et al.
This paper introduces a novel algorithmic framework for a deep neural network (DNN), which, in a mathematically rigorous manner, allows us to incorporate history (or memory) into the network -- it ensures all layers are connected to one another. This DNN, called Fractional-DNN, can be viewed as a time-discretization of a fractional-in-time nonlinear ordinary differential equation (ODE). The learning problem then is a minimization problem subject to that fractional ODE as a constraint. We emphasize that an analogy between the existing DNN and ODEs, with standard time derivative, is well-known by now. The focus of our work is the Fractional-DNN. Using the Lagrangian approach, we provide a derivation of the backward propagation and the design equations. We test our network on several datasets for classification problems. Fractional-DNN offers various advantages over the existing DNN. The key benefits are a significant improvement to the vanishing gradient issue due to the memory effect, and better handling of nonsmooth data due to the network's ability to approximate non-smooth functions.
IV · Jul 22, 2019
Bilevel Optimization, Deep Learning and Fractional Laplacian Regularization with Applications in Tomography
Harbir Antil, Zichao Di, Ratna Khatri
In this work we consider a generalized bilevel optimization framework for solving inverse problems. We introduce the fractional Laplacian as a regularizer to improve the reconstruction quality, and compare it with the total variation regularization. We emphasize that the key advantage of using the fractional Laplacian as a regularizer is that it leads to a linear operator, as opposed to the total variation regularization which results in a nonlinear degenerate operator. Inspired by residual neural networks, to learn the optimal strength of regularization and the exponent of the fractional Laplacian, we develop a dedicated bilevel optimization neural network with a variable depth for a general regularized inverse problem. We also draw some parallels between an activation function in a neural network and regularization. We illustrate how to incorporate various regularizer choices into our proposed network. As an example, we consider tomographic reconstruction as a model problem and show an improvement in reconstruction quality, especially for limited data, via fractional Laplacian regularization. We successfully learn the regularization strength and the fractional exponent via our proposed bilevel optimization neural network. We observe that the fractional Laplacian regularization outperforms total variation regularization. This is especially encouraging, and important, in the case of limited and noisy data.
CO · Jul 2, 2019
Adaptive particle-based approximations of the Gibbs posterior for inverse problems
Zilong Zou, Sayan Mukherjee, Harbir Antil et al.
In this work, we adopt a general framework based on the Gibbs posterior to update belief distributions for inverse problems governed by partial differential equations (PDEs). The Gibbs posterior formulation is a generalization of standard Bayesian inference that only relies on a loss function connecting the unknown parameters to the data. It is particularly useful when the true data generating mechanism (or noise distribution) is unknown or difficult to specify. The Gibbs posterior coincides with Bayesian updating when a true likelihood function is known and the loss function corresponds to the negative log-likelihood, yet provides subjective inference in more general settings. We employ a sequential Monte Carlo (SMC) approach to approximate the Gibbs posterior using particles. To manage the computational cost of propagating increasing numbers of particles through the loss function, we employ a recently developed local reduced basis method to build an efficient surrogate loss function that is used in the Gibbs update formula in place of the true loss. We derive error bounds for our approximation and propose an adaptive approach to construct the surrogate model in an efficient manner. We demonstrate the efficiency of our approach through several numerical examples.
NA · Apr 19, 2019
Model reduction for fractional elliptic problems using Kato's formula
Huy Dinh, Harbir Antil, Yanlai Chen et al.
We propose a novel numerical algorithm utilizing model reduction for computing solutions to stationary partial differential equations involving the spectral fractional Laplacian. Our approach utilizes a known characterization of the solution in terms of an integral of solutions to classical elliptic problems. We reformulate this integral into an expression whose continuous and discrete formulations are stable; the discrete formulations are stable independent of all discretization parameters. We subsequently apply the reduced basis method to accomplish model order reduction for the integrand. Our choice of quadrature in discretization of the integral is a global Gaussian quadrature rule that we observe is more efficient than previously proposed quadrature rules. Finally, the model reduction approach enables one to compute solutions to multi-query fractional Laplace problems with order of magnitude less cost than a traditional solver.
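For orientation, one standard integral representation of this kind (often associated with the names Balakrishnan and Kato) is given below. It is stated here as background under the assumption that $A$ is a positive self-adjoint operator such as $-\Delta$ with homogeneous boundary conditions; the exact reformulation used in the paper may differ.

$$
A^{-s} f \;=\; \frac{\sin(s\pi)}{\pi} \int_0^\infty \mu^{-s}\, (\mu I + A)^{-1} f \, d\mu, \qquad s \in (0,1).
$$

Each value of $\mu$ in the integrand requires the solution of a classical elliptic problem $(\mu I + A) w_\mu = f$, so a quadrature rule in $\mu$ turns one fractional solve into a family of parameterized standard solves, which is the setting where the reduced basis method applies. On a single eigenvalue $\lambda > 0$ of $A$ the identity reduces to $\int_0^\infty \mu^{-s} (\mu + \lambda)^{-1}\, d\mu = \lambda^{-s}\, \pi / \sin(s\pi)$.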
OC · Apr 11, 2019
External optimal control of fractional parabolic PDEs
Harbir Antil, Deepanshu Verma, Mahamadi Warma
In this paper we introduce a new notion of optimal control, or source identification in inverse, problems with fractional parabolic PDEs as constraints. This new notion allows a source/control placement outside the domain where the PDE is fulfilled. We tackle the Dirichlet, the Neumann and the Robin cases. For the fractional elliptic PDEs this has been recently investigated by the authors in \cite{HAntil_RKhatri_MWarma_2018a}. The need for these novel optimal control concepts stems from the fact that the classical PDE models only allow placing the source/control either on the boundary or in the interior where the PDE is satisfied. However, the nonlocal behavior of the fractional operator now allows placing the control in the exterior. We introduce the notions of weak and very-weak solutions to the parabolic Dirichlet problem. We present an approach on how to approximate the parabolic Dirichlet solutions by the parabolic Robin solutions (with convergence rates). A complete analysis for the Dirichlet and Robin optimal control problems has been discussed. The numerical examples confirm our theoretical findings and further illustrate the potential benefits of nonlocal models over the local ones.
NA · Aug 1, 2018
Certified reduced basis methods for fractional Laplace equations via extension
Harbir Antil, Yanlai Chen, Akil Narayan
Fractional Laplace equations are becoming important tools for mathematical modeling and prediction. Recent years have shown much progress in developing accurate and robust algorithms to numerically solve such problems, yet most solvers for fractional problems are computationally expensive. Practitioners are often interested in choosing the fractional exponent of the mathematical model to match experimental and/or observational data; this requires the computational solution to the fractional equation for several values of both the exponent and other parameters that enter the model, which is a computationally expensive many-query problem. To address this difficulty, we present a model order reduction strategy for fractional Laplace problems utilizing the reduced basis method (RBM). Our RBM algorithm for this fractional partial differential equation (PDE) allows us to accomplish significant acceleration compared to a traditional PDE solver while maintaining accuracy. Our numerical results demonstrate the accuracy and efficiency of our RBM algorithm on fractional Laplace problems in two spatial dimensions.
NA · Sep 11, 2017
Fractional Operators with Inhomogeneous Boundary Conditions: Analysis, Control, and Discretization
Harbir Antil, Johannes Pfefferer, Sergejs Rogovs
In this paper we introduce new characterizations of the spectral fractional Laplacian to incorporate nonhomogeneous Dirichlet and Neumann boundary conditions. The classical cases with homogeneous boundary conditions arise as a special case. We apply our definition to fractional elliptic equations of order $s \in (0,1)$ with nonzero Dirichlet and Neumann boundary conditions. Here the domain $\Omega$ is assumed to be a bounded, quasi-convex Lipschitz domain. To impose the nonzero boundary conditions, we construct fractional harmonic extensions of the boundary data. It is shown that solving for the fractional harmonic extension is equivalent to solving for the standard harmonic extension in the very-weak form. The latter result is of independent interest as well. The remaining fractional elliptic problem (with homogeneous boundary data) can be realized using the existing techniques. We introduce finite element discretizations and derive discretization error estimates in natural norms, which are confirmed by numerical experiments. We also apply our characterizations to Dirichlet and Neumann boundary optimal control problems with fractional elliptic equations as constraints.
NA · Aug 23, 2017
Spectral approximation of fractional PDEs in image processing and phase field modeling
Harbir Antil, Sören Bartels
Fractional differential operators provide an attractive mathematical tool to model effects with limited regularity properties. Particular examples are image processing and phase field models in which jumps across lower dimensional subsets and sharp transitions across interfaces are of interest. The numerical solution of corresponding model problems via a spectral method is analyzed. Its efficiency and features of the model problems are illustrated by numerical experiments.
NA · Aug 16, 2016
Galerkin v. least-squares Petrov--Galerkin projection in nonlinear model reduction
Kevin Carlberg, Matthew Barone, Harbir Antil
Least-squares Petrov--Galerkin (LSPG) model-reduction techniques such as the Gauss--Newton with Approximated Tensors (GNAT) method have shown promise, as they have generated stable, accurate solutions for large-scale turbulent, compressible flow problems where standard Galerkin techniques have failed. However, there has been limited comparative analysis of the two approaches. This is due in part to difficulties arising from the fact that Galerkin techniques perform optimal projection associated with residual minimization at the time-continuous level, while LSPG techniques do so at the time-discrete level. This work provides a detailed theoretical and computational comparison of the two techniques for two common classes of time integrators: linear multistep schemes and Runge--Kutta schemes. We present a number of new findings, including conditions under which the LSPG ROM has a time-continuous representation, conditions under which the two techniques are equivalent, and time-discrete error bounds for the two approaches. Perhaps most surprisingly, we demonstrate both theoretically and computationally that decreasing the time step does not necessarily decrease the error for the LSPG ROM; instead, the time step should be `matched' to the spectral content of the reduced basis. In numerical experiments carried out on a turbulent compressible-flow problem with over one million unknowns, we show that increasing the time step to an intermediate value decreases both the error and the simulation time of the LSPG reduced-order model by an order of magnitude.
AP · Jul 21, 2016
A note on semilinear fractional elliptic equation: analysis and discretization
Harbir Antil, Johannes Pfefferer, Mahamadi Warma
In this paper we study existence, regularity, and approximation of solutions to a fractional semilinear elliptic equation of order $s \in (0,1)$. We identify minimal conditions on the nonlinear term and the source which lead to existence of weak solutions and a uniform $L^\infty$-bound on the solutions. Next we realize the fractional Laplacian as a Dirichlet-to-Neumann map via the Caffarelli-Silvestre extension. We introduce a first-degree tensor product finite element space to approximate the truncated problem. We derive a priori error estimates and conclude with an illustrative numerical example.