Ilse C. F. Ipsen

NA
h-index50
16papers
302citations
Novelty50%
AI Score50

16 Papers

NAFeb 15, 2017
Randomized Matrix-free Trace and Log-Determinant Estimators

Arvind K. Saibaba, Alen Alexanderian, Ilse C. F. Ipsen

We present randomized algorithms for estimating the trace and deter- minant of Hermitian positive semi-definite matrices. The algorithms are based on subspace iteration, and access the matrix only through matrix vector products. We analyse the error due to randomization, for starting guesses whose elements are Gaussian or Rademacher random variables. The analysis is cleanly separated into a structural (deterministic) part followed by a probabilistic part. Our absolute bounds for the expectation and concentration of the estimators are non-asymptotic and informative even for matrices of low dimension. For the trace estimators, we also present asymptotic bounds on the number of samples (columns of the starting guess) required to achieve a user-specified relative error. Numerical experiments illustrate the performance of the estimators and the tightness of the bounds on low-dimensional matrices; and on a challenging application in uncertainty quantification arising from Bayesian optimal experimental design.

NAJan 2, 2018
Low-Rank Matrix Approximations Do Not Need a Singular Value Gap

Petros Drineas, Ilse C. F. Ipsen

This is a systematic investigation into the sensitivity of low-rank approximations of real matrices. We show that the low-rank approximation errors, in the two-norm, Frobenius norm and more generally, any Schatten p-norm, are insensitive to additive rank-preserving perturbations in the projector basis; and to matrix perturbations that are additive or change the number of columns (including multiplicative perturbations). Thus, low-rank matrix approximations are always well-posed and do not require a singular value gap. In the presence of a singular value gap, connections are established between low-rank approximations and subspace angles.

NAMay 2, 2011
Determinant Approximations

Ilse C. F. Ipsen, Dean J. Lee

A sequence of approximations for the determinant and its logarithm of a complex matrixis derived, along with relative error bounds. The determinant approximations are derived from expansions of det(X)=exp(trace(log(X))), and they apply to non-Hermitian matrices. Examples illustrate that these determinant approximations are efficient for lattice simulations of finite temperature nuclear matter, and that they use significantly less space than Gaussian elimination. The first approximation in the sequence is a block diagonal approximation; it represents an extension of Fischer's and Hadamard's inequalities to non-Hermitian matrices. In the special case of Hermitian positive-definite matrices, block diagonal approximations can be competitive with sparse inverse approximations. At last, a different representation of sparse inverse approximations is given and it is shown that their accuracy increases as more matrix elements are included.

NAMar 5, 2014
The Effect of Coherence on Sampling from Matrices with Orthonormal Columns, and Preconditioned Least Squares Problems

Ilse C. F. Ipsen, Thomas Wentworth

Motivated by the least squares solver Blendenpik, we investigate three strategies for uniform sampling of rows from m x n matrices Q with orthonormal columns. The goal is to determine, with high probability, how many rows are required so that the sampled matrices have full rank and are well-conditioned with respect to inversion. Extensive numerical experiments illustrate that the three sampling strategies (without replacement, with replacement, and Bernoulli sampling) behave almost identically, for small to moderate amounts of sampling. In particular, sampled matrices of full rank tend to have two-norm condition numbers of at most 10. We derive a bound on the condition number of the sampled matrices in terms of the coherence μ of Q. This bound applies to all three different sampling strategies; it implies a, not necessarily tight, lower bound of O(m μ ln(n)) for the number of sampled rows; and it is realistic and informative even for matrices of small dimension and the stringent requirement of a 99 percent success probability. For uniform sampling with replacement we derive a potentially tighter condition number bound in terms of the leverage scores of Q. To obtain a more easily computable version of this bound, in terms of just the largest leverage scores, we first derive a general bound on the two-norm of diagonally scaled matrices. To facilitate the numerical experiments and test the tightness of the bounds, we present algorithms to generate matrices with user-specified coherence and leverage scores. These algorithms, the three sampling strategies, and a large variety of condition number bounds are implemented in the Matlab toolbox kappa_SQ_v3.

NAJan 2, 2018
A Probabilistic Subspace Bound with Application to Active Subspaces

John T. Holodnak, Ilse C. F. Ipsen, Ralph C. Smith

Given a real symmetric positive semi-definite matrix E, and an approximation S that is a sum of n independent matrix-valued random variables, we present bounds on the relative error in S due to randomization. The bounds do not depend on the matrix dimensions but only on the numerical rank (intrinsic dimension) of E. Our approach resembles the low-rank approximation of kernel matrices from random features, but our accuracy measures are more stringent. In the context of parameter selection based on active subspaces, where S is computed via Monte Carlo sampling, we present a bound on the number of samples so that with high probability the angle between the dominant subspaces of E and S is less than a user-specified tolerance. This is a substantial improvement over existing work, as it is a non-asymptotic and fully explicit bound on the sampling amount n, and it allows the user to tune the success probability. It also suggests that Monte Carlo sampling can be efficient in the presence of many parameters, as long as the underlying function f is sufficiently smooth.

NAApr 19, 2011
La Budde's Method for Computing Characteristic Polynomials

Rizwana Rehman, Ilse C. F. Ipsen

La Budde's method computes the characteristic polynomial of a real matrix A in two stages: first it applies orthogonal similarity transformations to reduce A to upper Hessenberg form H, and second it computes the characteristic polynomial of H from characteristic polynomials of leading principal submatrices of H. If A is symmetric, then H is symmetric tridiagonal, and La Budde's method simplifies to the Sturm sequence method. If A is diagonal then La Budde's method reduces to the Summation Algorithm, a Horner-like scheme used by the MATLAB function POLY to compute characteristic polynomials from eigenvalues. We present recursions to compute the individual coefficients of the characteristic polynomial in the second stage of La Budde's method, and derive running error bounds for symmetric and nonsymmetric matrices. We also show that La Budde's method can be more accurate than POLY, especially for indefinite and nonsymmetric matrices A. Unlike POLY, La Budde's method is not affected by illconditioning of eigenvalues, requires only real arithmetic, and allows the computation of individual coefficients.

4.7OPTICSApr 20
Two-Dimensional Tomography and Fourier Analysis

Andre Mas, Fatma Terzioglu, Ilse C. F. Ipsen

We highlight the important role of the Fourier transform in deriving inversion formulas for the integral transforms of tomographic imaging. We demonstrate this principle by deriving inversion formulas for the divergent beam transform and the V-line transform, the latter arising in contemporary models of single-scattering optical tomography.

10.2NAApr 10
Many (most?) column subset selection criteria are NP hard for a few columns

Ilse C. F. Ipsen, Arvind K. Saibaba

We consider a variety of criteria for selecting k representative columns from a real mxn matrix A, when sufficiently few columns are required, i.e., 1<= k<= min{rank(A), m/3}. The criteria include the following optimization problems: absolute volume and S-optimality maximization; norm, pseudo-inverse norm, and condition minimization number in the two-norm, Frobenius norm and Schatten p-norms for p>2; stable rank maximization; and the new criterion of relative volume maximization, which is inversely proportional to a power of the condition number. We show that these criteria are NP hard and many do not admit polynomial time approximation schemes (PTAS). To formulate the optimization problems as decision problems, we derive optimal values for the subset selection criteria, as well as expressions for partitioned pseudo-inverses. The results for minimization of the pseudo-inverse in the Frobenius norm are applicable to trace optimization in A-optimal design.

69.6NAMar 17
Perturbation Analysis for Preconditioned Normal Equations in Mixed Precision

James E. Garrison, Ilse C. F. Ipsen

For real matrices of full column-rank, we analyze the conditioning of several types of normal equations that are preconditioned by a randomized preconditioner computed in lower precision. These include symmetrically preconditioned normal equations, half-preconditioned normal equations, seminormal equations and not-normal equations. Our perturbation bounds are realistic and informative, and suggest that the conditioning depends only mildly on the quality of the preconditioner; however, it does depend on the size of the least squares residual -- even if the normal equations do not originate from a least squares problem. We illustrate that a randomized preconditioner can deliver a solution accuracy comparable to that of Matlab's mldivide command, is efficient in practice, and well-suited to GPU implementations. For the computation of the preconditioner, we propose an automatic selection of the precision, based on a fast condition number estimation in lower precision.

LGMar 18, 2024
Stochastic Rounding Implicitly Regularizes Tall-and-Thin Matrices

Gregory Dexter, Christos Boutsikas, Linkai Ma et al.

Motivated by the popularity of stochastic rounding in the context of machine learning and the training of large-scale deep neural network models, we consider stochastic nearness rounding of real matrices $\mathbf{A}$ with many more rows than columns. We provide novel theoretical evidence, supported by extensive experimental evaluation that, with high probability, the smallest singular value of a stochastically rounded matrix is well bounded away from zero -- regardless of how close $\mathbf{A}$ is to being rank deficient and even if $\mathbf{A}$ is rank-deficient. In other words, stochastic rounding \textit{implicitly regularizes} tall and skinny matrices $\mathbf{A}$ so that the rounded version has full column rank. Our proofs leverage powerful results in random matrix theory, and the idea that stochastic rounding errors do not concentrate in low-dimensional column spaces.

MLAug 7, 2025
Stochastic Trace Optimization of Parameter Dependent Matrices Based on Statistical Learning Theory

Arvind K. Saibaba, Ilse C. F. Ipsen

We consider matrices $\boldsymbol{A}(\boldsymbolθ)\in\mathbb{R}^{m\times m}$ that depend, possibly nonlinearly, on a parameter $\boldsymbolθ$ from a compact parameter space $Θ$. We present a Monte Carlo estimator for minimizing $\text{trace}(\boldsymbol{A}(\boldsymbolθ))$ over all $\boldsymbolθ\inΘ$, and determine the sampling amount so that the backward error of the estimator is bounded with high probability. We derive two types of bounds, based on epsilon nets and on generic chaining. Both types predict a small sampling amount for matrices $\boldsymbol{A}(\boldsymbolθ)$ with small offdiagonal mass, and parameter spaces $Θ$ of small ``size.'' Dependence on the matrix dimension~$m$ is only weak or not explicit. The bounds based on epsilon nets are easier to evaluate and come with fully specified constants. In contrast, the bounds based on chaining depend on the Talagrand functionals which are difficult to evaluate, except in very special cases. Comparisons between the two types of bounds are difficult, although the literature suggests that chaining bounds can be superior.

MEDec 23, 2020
Probabilistic Iterative Methods for Linear Systems

Jon Cockayne, Ilse C. F. Ipsen, Chris J. Oates et al.

This paper presents a probabilistic perspective on iterative methods for approximating the solution $\mathbf{x}_* \in \mathbb{R}^d$ of a nonsingular linear system $\mathbf{A} \mathbf{x}_* = \mathbf{b}$. In the approach a standard iterative method on $\mathbb{R}^d$ is lifted to act on the space of probability distributions $\mathcal{P}(\mathbb{R}^d)$. Classically, an iterative method produces a sequence $\mathbf{x}_m$ of approximations that converge to $\mathbf{x}_*$. The output of the iterative methods proposed in this paper is, instead, a sequence of probability distributions $μ_m \in \mathcal{P}(\mathbb{R}^d)$. The distributional output both provides a "best guess" for $\mathbf{x}_*$, for example as the mean of $μ_m$, and also probabilistic uncertainty quantification for the value of $\mathbf{x}_*$ when it has not been exactly determined. Theoretical analysis is provided in the prototypical case of a stationary linear iterative method. In this setting we characterise both the rate of contraction of $μ_m$ to an atomic measure on $\mathbf{x}_*$ and the nature of the uncertainty quantification being provided. We conclude with an empirical illustration that highlights the insight into solution uncertainty that can be provided by probabilistic iterative methods.

MLAug 17, 2018
A Projector-Based Approach to Quantifying Total and Excess Uncertainties for Sketched Linear Regression

Jocelyn T. Chi, Ilse C. F. Ipsen

Linear regression is a classic method of data analysis. In recent years, sketching -- a method of dimension reduction using random sampling, random projections, or both -- has gained popularity as an effective computational approximation when the number of observations greatly exceeds the number of variables. In this paper, we address the following question: How does sketching affect the statistical properties of the solution and key quantities derived from it? To answer this question, we present a projector-based approach to sketched linear regression that is exact and that requires minimal assumptions on the sketching matrix. Therefore, downstream analyses hold exactly and generally for all sketching schemes. Additionally, a projector-based approach enables derivation of key quantities from classic linear regression that account for the combined model- and algorithm-induced uncertainties. We demonstrate the usefulness of a projector-based approach in quantifying and enabling insight on excess uncertainties and bias-variance decompositions for sketched linear regression. Finally, we demonstrate how the insights from our projector-based analyses can be used to produce practical sketching diagnostics to aid the design of judicious sketching schemes.

COOct 17, 2018
Probabilistic Linear Solvers: A Unifying View

Simon Bartels, Jon Cockayne, Ilse C. F. Ipsen et al.

Several recent works have developed a new, probabilistic interpretation for numerical algorithms solving linear systems in which the solution is inferred in a Bayesian framework, either directly or by inferring the unknown action of the matrix inverse. These approaches have typically focused on replicating the behavior of the conjugate gradient method as a prototypical iterative method. In this work surprisingly general conditions for equivalence of these disparate methods are presented. We also describe connections between probabilistic linear solvers and projection methods for linear systems, providing a probabilistic interpretation of a far more general class of iterative methods. In particular, this provides such an interpretation of the generalised minimum residual method. A probabilistic view of preconditioning is also introduced. These developments unify the literature on probabilistic linear solvers, and provide foundational connections to the literature on iterative solvers for linear systems.

NAMay 23, 2015
Conditioning of Leverage Scores and Computation by QR Decomposition

John T. Holodnak, Ilse C. F. Ipsen, Thomas A. Wentworth

The leverage scores of a full-column rank matrix A are the squared row norms of any orthonormal basis for range(A). We show that corresponding leverage scores of two matrices A and A + ΔA are close in the relative sense, if they have large magnitude and if all principal angles between the column spaces of A and A + ΔA are small. We also show three classes of bounds that are based on perturbation results of QR decompositions. They demonstrate that relative differences between individual leverage scores strongly depend on the particular type of perturbation ΔA. The bounds imply that the relative accuracy of an individual leverage score depends on: its magnitude and the two-norm condition of A, if ΔA is a general perturbation; the two-norm condition number of A, if ΔA is a perturbation with the same norm-wise row-scaling as A; (to first order) neither condition number nor leverage score magnitude, if ΔA is a component-wise row-scaled perturbation. Numerical experiments confirm the qualitative and quantitative accuracy of our bounds.

NAOct 5, 2013
Randomized Approximation of the Gram Matrix: Exact Computation and Probabilistic Bounds

John T. Holodnak, Ilse C. F. Ipsen

Given a real matrix A with n columns, the problem is to approximate the Gram product AA^T by c << n weighted outer products of columns of A. Necessary and sufficient conditions for the exact computation of AA^T (in exact arithmetic) from c >= rank(A) columns depend on the right singular vector matrix of A. For a Monte-Carlo matrix multiplication algorithm by Drineas et al. that samples outer products, we present probabilistic bounds for the 2-norm relative error due to randomization. The bounds depend on the stable rank or the rank of A, but not on the matrix dimensions. Numerical experiments illustrate that the bounds are informative, even for stringent success probabilities and matrices of small dimension. We also derive bounds for the smallest singular value and the condition number of matrices obtained by sampling rows from orthonormal matrices.