NAFeb 17, 2014
A Flexible Krylov Solver for Shifted Systems with Application to Oscillatory Hydraulic TomographyArvind K. Saibaba, Tania Bakhos, Peter K. Kitanidis
We discuss efficient solutions to systems of shifted linear systems arising in computations for oscillatory hydraulic tomography (OHT). The reconstruction of hydrogeological parameters such as hydraulic conductivity and specific storage using limited discrete measurements of pressure (head) obtained from sequential oscillatory pumping tests, leads to a nonlinear inverse problem. We tackle this using the quasi-linear geostatistical approach \cite{kitanidis1995quasi}. This method requires repeated solution of the forward (and adjoint) problem for multiple frequencies, for which we use flexible preconditioned Krylov subspace solvers specifically designed for shifted systems based on ideas in \cite{gu2007flexible}. The solvers allow the preconditioner to change at each iteration. We analyze the convergence of the solver and perform an error analysis when an iterative solver is used for inverting the preconditioner matrices. Finally, we apply our algorithm to a challenging application taken from oscillatory hydraulic tomography to demonstrate the computational gains by using the resulting method.
NAMar 29, 2016
Multipreconditioned GMRES for Shifted SystemsTania Bakhos, Peter Kitanidis, Scott Ladenheim et al.
An implementation of GMRES with multiple preconditioners (MPGMRES) is proposed for solving shifted linear systems with shift-and-invert preconditioners. With this type of preconditioner, the Krylov subspace can be built without requiring the matrix-vector product with the shifted matrix. Furthermore, the multipreconditioned search space is shown to grow only linearly with the number of preconditioners. This allows for a more efficient implementation of the algorithm. The proposed implementation is tested on shifted systems that arise in computational hydrology and the evaluation of different matrix functions. The numerical results indicate the effectiveness of the proposed approach.
NAFeb 15, 2017
Randomized Matrix-free Trace and Log-Determinant EstimatorsArvind K. Saibaba, Alen Alexanderian, Ilse C. F. Ipsen
We present randomized algorithms for estimating the trace and deter- minant of Hermitian positive semi-definite matrices. The algorithms are based on subspace iteration, and access the matrix only through matrix vector products. We analyse the error due to randomization, for starting guesses whose elements are Gaussian or Rademacher random variables. The analysis is cleanly separated into a structural (deterministic) part followed by a probabilistic part. Our absolute bounds for the expectation and concentration of the estimators are non-asymptotic and informative even for matrices of low dimension. For the trace estimators, we also present asymptotic bounds on the number of samples (columns of the starting guess) required to achieve a user-specified relative error. Numerical experiments illustrate the performance of the estimators and the tightness of the bounds on low-dimensional matrices; and on a challenging application in uncertainty quantification arising from Bayesian optimal experimental design.
CEJun 11, 2018
Goal-Oriented Optimal Design of Experiments for Large-Scale Bayesian Linear Inverse ProblemsAhmed Attia, Alen Alexanderian, Arvind K. Saibaba
We develop a framework for goal-oriented optimal design of experiments (GOODE) for large-scale Bayesian linear inverse problems governed by PDEs. This framework differs from classical Bayesian optimal design of experiments (ODE) in the following sense: we seek experimental designs that minimize the posterior uncertainty in the experiment end-goal, e.g., a quantity of interest (QoI), rather than the estimated parameter itself. This is suitable for scenarios in which the solution of an inverse problem is an intermediate step and the estimated parameter is then used to compute a QoI. In such problems, a GOODE approach has two benefits: the designs can avoid wastage of experimental resources by a targeted collection of data, and the resulting design criteria are computationally easier to evaluate due to the often low-dimensionality of the QoIs. We present two modified design criteria, A-GOODE and D-GOODE, which are natural analogues of classical Bayesian A- and D-optimal criteria. We analyze the connections to other ODE criteria, and provide interpretations for the GOODE criteria by using tools from information theory. Then, we develop an efficient gradient-based optimization framework for solving the GOODE optimization problems. Additionally, we present comprehensive numerical experiments testing the various aspects of the presented approach. The driving application is the optimal placement of sensors to identify the source of contaminants in a diffusion and transport problem. We enforce sparsity of the sensor placements using an $\ell_1$-norm penalty approach, and propose a practical strategy for specifying the associated penalty parameter.
NANov 16, 2017
Efficient D-optimal design of experiments for infinite-dimensional Bayesian linear inverse problemsAlen Alexanderian, Arvind K. Saibaba
We develop a computational framework for D-optimal experimental design for PDE-based Bayesian linear inverse problems with infinite-dimensional parameters. We follow a formulation of the experimental design problem that remains valid in the infinite-dimensional limit. The optimal design is obtained by solving an optimization problem that involves repeated evaluation of the log-determinant of high-dimensional operators along with their derivatives. Forming and manipulating these operators is computationally prohibitive for large-scale problems. Our methods exploit the low-rank structure in the inverse problem in three different ways, yielding efficient algorithms. Our main approach is to use randomized estimators for computing the D-optimal criterion, its derivative, as well as the Kullback--Leibler divergence from posterior to prior. Two other alternatives are proposed based on a low-rank approximation of the prior-preconditioned data misfit Hessian, and a fixed low-rank approximation of the prior-preconditioned forward operator. Detailed error analysis is provided for each of the methods, and their effectiveness is demonstrated on a model sensor placement problem for initial state reconstruction in a time-dependent advection-diffusion equation in two space dimensions.
NAFeb 13, 2018
The Discrete Empirical Interpolation Method: Canonical Structure and Formulation in Weighted Inner Product SpacesZlatko Drmač, Arvind K. Saibaba
New contributions are offered to the theory and practice of the Discrete Empirical Interpolation Method (DEIM). These include a detailed characterization of the canonical structure; a substantial tightening of the error bound for the DEIM oblique projection, based on index selection via a strong rank revealing QR factorization; and an extension of the DEIM approximation to weighted inner products defined by a real symmetric positive-definite matrix $W$. The weighted DEIM ($W$-DEIM) can be deployed in the more general framework where the POD Galerkin projection is formulated in a discretization of a suitable energy inner product such that the Galerkin projection preserves important physical properties such as e.g. stability. Also, a special case of $W$-DEIM is introduced, which is DGEIM, a discrete version of the Generalized Empirical Interpolation Method that allows generalization of the interpolation via a dictionary of linear functionals.
NANov 11, 2018
Randomized subspace iteration: Analysis of canonical angles and unitarily invariant normsArvind K. Saibaba
This paper is concerned with the analysis of the randomized subspace iteration for the computation of low-rank approximations. We present three different kinds of bounds. First, we derive both bounds for the canonical angles between the exact and the approximate singular subspaces. Second, we derive bounds for the low-rank approximation in any unitarily invariant norm (including the Schatten-p norm). This generalizes the bounds for Spectral and Frobenius norms found in the literature. Third, we present bounds for the accuracy of the singular values. The bounds are structural in that they are applicable to any starting guess, be it random or deterministic, that satisfies some minimal assumptions. Specialized bounds are provided when a Gaussian random matrix is used as the starting guess. Numerical experiments demonstrate the effectiveness of the proposed bounds.
COJun 6, 2019
Efficient Marginalization-based MCMC Methods for Hierarchical Bayesian Inverse ProblemsArvind K. Saibaba, Johnathan Bardsley, D. Andrew Brown et al.
Hierarchical models in Bayesian inverse problems are characterized by an assumed prior probability distribution for the unknown state and measurement error precision, and hyper-priors for the prior parameters. Combining these probability models using Bayes' law often yields a posterior distribution that cannot be sampled from directly, even for a linear model with Gaussian measurement error and Gaussian prior. Gibbs sampling can be used to sample from the posterior, but problems arise when the dimension of the state is large. This is because the Gaussian sample required for each iteration can be prohibitively expensive to compute, and because the statistical efficiency of the Markov chain degrades as the dimension of the state increases. The latter problem can be mitigated using marginalization-based techniques, but these can be computationally prohibitive as well. In this paper, we combine the low-rank techniques of Brown, Saibaba, and Vallelian (2018) with the marginalization approach of Rue and Held (2005). We consider two variants of this approach: delayed acceptance and pseudo-marginalization. We provide a detailed analysis of the acceptance rates and computational costs associated with our proposed algorithms, and compare their performances on two numerical test cases---image deblurring and inverse heat equation.
NAApr 10
Many (most?) column subset selection criteria are NP hard for a few columnsIlse C. F. Ipsen, Arvind K. Saibaba
We consider a variety of criteria for selecting k representative columns from a real mxn matrix A, when sufficiently few columns are required, i.e., 1<= k<= min{rank(A), m/3}. The criteria include the following optimization problems: absolute volume and S-optimality maximization; norm, pseudo-inverse norm, and condition minimization number in the two-norm, Frobenius norm and Schatten p-norms for p>2; stable rank maximization; and the new criterion of relative volume maximization, which is inversely proportional to a power of the condition number. We show that these criteria are NP hard and many do not admit polynomial time approximation schemes (PTAS). To formulate the optimization problems as decision problems, we derive optimal values for the subset selection criteria, as well as expressions for partitioned pseudo-inverses. The results for minimization of the pseudo-inverse in the Frobenius norm are applicable to trace optimization in A-optimal design.
NAMay 21, 2019
Uncertainty quantification in large Bayesian linear inverse problems using Krylov subspace methodsArvind K. Saibaba, Julianne Chung, Katrina Petroske
For linear inverse problems with a large number of unknown parameters, uncertainty quantification remains a challenging task. In this work, we use Krylov subspace methods to approximate the posterior covariance matrix and describe efficient methods for exploring the posterior distribution. Assuming that Krylov methods (e.g., based on the generalized Golub-Kahan bidiagonalization) have been used to compute an estimate of the solution, we get an approximation of the posterior covariance matrix for `free.' We provide theoretical results that quantify the accuracy of the approximation and of the resulting posterior distribution. Then, we describe efficient methods that use the approximation to compute measures of uncertainty, including the Kullback-Liebler divergence. We present two methods that use preconditioned Lanczos methods to efficiently generate samples from the posterior distribution. Numerical examples from tomography demonstrate the effectiveness of the described approaches.
NAApr 1
Improved Analysis of Khatri-Rao Random Projections and ApplicationsArvind K. Saibaba, Bhisham Dev Verma, Grey Ballard
Randomization has emerged as a powerful set of tools for large-scale matrix and tensor decompositions. Randomized algorithms involve computing sketches with random matrices. A prevalent approach is to take the random matrix as a standard Gaussian random matrix, for which the theory is well developed. However, this approach has the drawback that the cost of generating and multiplying by the random matrix can be prohibitively expensive. Khatri-Rao random projections (KRPs), obtained by sketching with Khatri-Rao products of random matrices, offer a viable alternative and are much cheaper to generate. However, the theoretical guarantees of using KRPs are much more pessimistic compared to their accuracy observed in practice. We attempt to close this gap by obtaining improved analysis of the use of KRPs in matrix and tensor low-rank decompositions. We propose and analyze a new algorithm for low-rank approximations of block-structured matrices (e.g., block Hankel) using KRPs. We also show how to accelerate tensor computations in the Tucker format using KRPs and give theoretical guarantees of the resulting low-rank approximations. Numerical experiments on synthetic and real-world tensors show the computational benefits of the proposed methods.
NAMay 13
A Majorization-Minimization with Monte Carlo Approach for Hyperparameter EstimationElle Buser, Julianne Chung, Hugo Díaz et al.
We consider inverse problems with linear forward models and Gaussian priors, but with unknown hyperparameters that may arise from the model, the noise, or the specification of the prior. We model this using a hierarchical Bayes framework resulting in a posterior distribution that is non-Gaussian, in general, and challenging to sample from. Consequently, we use an empirical Bayes framework for estimating the maximum a posteriori estimate of the hyperpameters by considering the marginalized posterior distribution. However, the optimization problem is also computationally challenging due to the need for repeated evaluation of log determinants. To address this issue, we propose a Majorization-Minimization with Monte Carlo approach, which we call M$^{3}$C, for hyperparameter estimation. Specifically, we replace the challenging optimization problem with a sequence of simpler ones by utilizing a majorization function (or majorant) for the log-determinant term, combined with a Monte Carlo estimator to approximate the majorant. We provide theoretical results, showing that under certain assumptions, the M$^{3}$C iterates converge with high probability to a critical point of the original cost function. A variety of numerical examples are provided from seismic tomography, super-resolution imaging, and contaminant source identification.
MLAug 7, 2025
Stochastic Trace Optimization of Parameter Dependent Matrices Based on Statistical Learning TheoryArvind K. Saibaba, Ilse C. F. Ipsen
We consider matrices $\boldsymbol{A}(\boldsymbolθ)\in\mathbb{R}^{m\times m}$ that depend, possibly nonlinearly, on a parameter $\boldsymbolθ$ from a compact parameter space $Θ$. We present a Monte Carlo estimator for minimizing $\text{trace}(\boldsymbol{A}(\boldsymbolθ))$ over all $\boldsymbolθ\inΘ$, and determine the sampling amount so that the backward error of the estimator is bounded with high probability. We derive two types of bounds, based on epsilon nets and on generic chaining. Both types predict a small sampling amount for matrices $\boldsymbol{A}(\boldsymbolθ)$ with small offdiagonal mass, and parameter spaces $Θ$ of small ``size.'' Dependence on the matrix dimension~$m$ is only weak or not explicit. The bounds based on epsilon nets are easier to evaluate and come with fully specified constants. In contrast, the bounds based on chaining depend on the Talagrand functionals which are difficult to evaluate, except in very special cases. Comparisons between the two types of bounds are difficult, although the literature suggests that chaining bounds can be superior.
NAMay 25, 2017
Efficient generalized Golub-Kahan based methods for dynamic inverse problemsJulianne Chung, Arvind K. Saibaba, Matthew Brown et al.
We consider efficient methods for computing solutions to and estimating uncertainties in dynamic inverse problems, where the parameters of interest may change during the measurement procedure. Compared to static inverse problems, incorporating prior information in both space and time in a Bayesian framework can become computationally intensive, in part, due to the large number of unknown parameters. In these problems, explicit computation of the square root and/or inverse of the prior covariance matrix is not possible. In this work, we develop efficient, iterative, matrix-free methods based on the generalized Golub-Kahan bidiagonalization that allow automatic regularization parameter and variance estimation. We demonstrate that these methods can be more flexible than standard methods and develop efficient implementations that can exploit structure in the prior, as well as possible structure in the forward model. Numerical examples from photoacoustic tomography, deblurring, and passive seismic tomography demonstrate the range of applicability and effectiveness of the described approaches. Specifically, in passive seismic tomography, we demonstrate our approach on both synthetic and real data. To demonstrate the scalability of our algorithm, we solve a dynamic inverse problem with approximately $43,000$ measurements and $7.8$ million unknowns in under $40$ seconds on a standard desktop.
COOct 18, 2016
Going off the Grid: Iterative Model Selection for Biclustered Matrix CompletionEric Chi, Liuiyi Hu, Arvind K. Saibaba et al.
We consider the problem of performing matrix completion with side information on row-by-row and column-by-column similarities. We build upon recent proposals for matrix estimation with smoothness constraints with respect to row and column graphs. We present a novel iterative procedure for directly minimizing an information criterion in order to select an appropriate amount row and column smoothing, namely perform model selection. We also discuss how to exploit the special structure of the problem to scale up the estimation and model selection procedure via the Hutchinson estimator. We present simulation results and an application to predicting associations in imaging-genomics studies.