Michiel E. Hochstenbach

11 papers

106 citations

Novelty 48%

AI Score 41

Ranked #91,625 of 201,326 authors (top 46%); #286 in NA (top 50%)

11 Papers

83.8 · NA · May 31

Transpose-free linear algebra

Diana Halikias, Michiel E. Hochstenbach, Alex Townsend

We study the limitations of matrix-free algorithms that access a matrix $A$ only through forward matrix-vector products (matvecs) $x \mapsto Ax$, without access to the transpose $A^\top$ or its action. This setting arises naturally in operator learning, inverse problems, and matrix-free PDE solvers, where adjoint evaluations may be unavailable or prohibitively expensive. We show that the lack of transpose access creates severe and sometimes insurmountable theoretical barriers. For Krylov methods, we prove that the sequence of projected operator norms produced by Arnoldi iteration can follow any prescribed nondecreasing curve, showing that forward matvecs alone provide essentially no reliable information about the spectral norm. For several core problems, including least squares, norm estimation, column subset selection, and local maximum volume, we establish non-identifiability results: distinct matrices can generate identical forward-query transcripts while having fundamentally different solutions. We also prove quantitative lower bounds on the number of forward matvecs required for approximation tasks. In particular, any algorithm that computes a near-optimal rank-$k$ approximation must use at least $n$ queries, and estimating the Frobenius norm to relative accuracy $\varepsilon$ requires $\Omega(\varepsilon^{-2})$ queries when $n$ is sufficiently large, matching the complexity of Hutchinson-type estimators up to constants. Although some problems remain solvable without transpose access, the transpose-free setting is fundamentally more limited in both identifiability and efficiency.
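As an illustration of the kind of Hutchinson-type estimator the abstract refers to, the following is a minimal sketch, not the authors' code. It relies on the identity $\mathbb{E}\,\|Az\|_2^2 = \|A\|_F^2$ for a random vector $z$ with independent $\pm 1$ entries, so only forward products $x \mapsto Ax$ are needed. The names `frobenius_estimate` and `make_matvec`, the choice of Rademacher probes, and the small test matrix are assumptions made for this example.

```python
import math
import random


def frobenius_estimate(matvec, n, num_queries, rng):
    """Estimate ||A||_F from forward products x -> A x only (no transpose)."""
    total = 0.0
    for _ in range(num_queries):
        # Rademacher probe: independent +1/-1 entries, so E[z z^T] = I.
        z = [rng.choice((-1.0, 1.0)) for _ in range(n)]
        y = matvec(z)
        total += sum(v * v for v in y)  # ||A z||^2
    # The sample mean of ||A z||^2 estimates ||A||_F^2.
    return math.sqrt(total / num_queries)


def make_matvec(rows):
    """Wrap a dense matrix (list of rows) as a black-box forward matvec."""
    def matvec(x):
        return [sum(a * b for a, b in zip(row, x)) for row in rows]
    return matvec


# Hypothetical 3x3 example matrix, used only to compare with the exact norm.
rows = [[1.0, 2.0, 0.0], [0.0, -1.0, 3.0], [4.0, 0.0, 1.0]]
exact = math.sqrt(sum(a * a for row in rows for a in row))
estimate = frobenius_estimate(make_matvec(rows), 3, 2000, random.Random(0))
print(exact, estimate)
```

The error of such an estimator decays like the inverse square root of the number of probes, which is consistent with the $\varepsilon^{-2}$ query count quoted in the abstract.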

NA · Feb 13, 2019

Fixing Nonconvergence of Algebraic Iterative Reconstruction with an Unmatched Backprojector

Yiqiu Dong, Per Christian Hansen, Michiel E. Hochstenbach et al.

We consider algebraic iterative reconstruction methods with applications in image reconstruction. In particular, we are concerned with methods based on an unmatched projector/backprojector pair; i.e., the backprojector is not the exact adjoint or transpose of the forward projector. Such situations are common in large-scale computed tomography, and we consider the common situation where the method does not converge due to the nonsymmetry of the iteration matrix. We propose a modified algorithm that incorporates a small shift parameter, and we give the conditions that guarantee convergence of this method to a fixed point of a slightly perturbed problem. We also give perturbation bounds for this fixed point. Moreover, we discuss how to use Krylov subspace methods to efficiently estimate the leftmost eigenvalue of a certain matrix to select a proper shift parameter. The modified algorithm is illustrated with test problems from computed tomography.

NA · Sep 22, 2008

Polynomial two-parameter eigenvalue problems and matrix pencil methods for stability of delay-differential equations

Elias Jarlebring, Michiel E. Hochstenbach

Several recent methods used to analyze asymptotic stability of delay-differential equations (DDEs) involve determining the eigenvalues of a matrix, a matrix pencil or a matrix polynomial constructed by Kronecker products. Despite some similarities between the different types of these so-called matrix pencil methods, the general ideas used as well as the proofs differ considerably. Moreover, the available theory hardly reveals the relations between the different methods. In this work, a different derivation of various matrix pencil methods is presented using a unifying framework of a new type of eigenvalue problem: the polynomial two-parameter eigenvalue problem, of which the quadratic two-parameter eigenvalue problem is a special case. This framework makes it possible to establish relations between various seemingly different methods and provides further insight into the theory of matrix pencil methods. We also recognize a few new matrix pencil variants to determine DDE stability. Finally, the recognition of these new types of eigenvalue problems opens a door to efficient computation of DDE stability.

NA · Mar 18, 2019

Subspace Methods for 3-Parameter Eigenvalue Problems

Michiel E. Hochstenbach, Karl Meerbergen, Emre Mengi et al.

We propose subspace methods for 3-parameter eigenvalue problems. Such problems arise when separation of variables is applied to separable boundary value problems; a particular example is the Helmholtz equation in ellipsoidal and paraboloidal coordinates. While several subspace methods for 2-parameter eigenvalue problems exist, their extension to the three-parameter setting seems to be challenging. An inherent difficulty is that, while for 2-parameter eigenvalue problems we can exploit a relation to Sylvester equations to obtain a fast Arnoldi type method, such a relation does not seem to exist when there are three or more parameters. Instead, we introduce a subspace iteration method with projections onto generalized Krylov subspaces that are constructed from scratch at every iteration using certain Ritz vectors as the initial vectors. Another possibility is a Jacobi--Davidson type method for three or more parameters, which we generalize from its 2-parameter counterpart. For both approaches, we introduce a selection criterion for deflation that is based on the angles between left and right eigenvectors. The Jacobi--Davidson approach is devised to locate eigenvalues close to a prescribed target, yet it often also performs well when eigenvalues are sought based on the proximity of one of the components to a prescribed target. The subspace iteration method is devised specifically for the latter task. The proposed approaches are suitable especially for problems where the computation of several eigenvalues is required with high accuracy. Matlab implementations of both methods have been made available in the package MultiParEig.

AC · Jul 17, 2016

Uniform determinantal representations

Ada Boralevi, Jasper van Doornmalen, Jan Draisma et al.

The problem of expressing a specific polynomial as the determinant of a square matrix of affine-linear forms arises from algebraic geometry, optimisation, complexity theory, and scientific computing. Motivated by recent developments in this last area, we introduce the notion of a uniform determinantal representation, not of a single polynomial but rather of all polynomials in a given number of variables and of a given maximal degree. We derive a lower bound on the size of the matrix, and present a construction achieving that lower bound up to a constant factor as the number of variables is fixed and the degree grows. This construction marks an improvement upon a recent construction due to Plestenjak-Hochstenbach, and we investigate the performance of new representations in their root-finding technique for bivariate systems. Furthermore, we relate uniform determinantal representations to vector spaces of singular matrices, and we conclude with a number of future research directions.

NA · May 17, 2017

Generalized Davidson and multidirectional-type methods for the generalized singular value decomposition

Ian N. Zwaan, Michiel E. Hochstenbach

We propose new iterative methods for computing nontrivial extremal generalized singular values and vectors. The first method is a generalized Davidson-type algorithm and the second method employs a multidirectional subspace expansion technique. Essential to the latter is a fast truncation step designed to remove a low-quality search direction and to ensure moderate growth of the search space. Both methods rely on thick restarts and may be combined with two different deflation approaches. We argue that the methods have monotonic and (asymptotic) linear convergence, derive and discuss locally optimal expansion vectors, and explain why the fast truncation step ideally removes search directions orthogonal to the desired generalized singular vector. Furthermore, we identify the relation between our generalized Davidson-type algorithm and the Jacobi--Davidson algorithm for the generalized singular value decomposition. Finally, we generalize several known convergence results for the Hermitian eigenvalue problem to the Hermitian positive definite generalized eigenvalue problem. Numerical experiments indicate that both methods are competitive.

ML · Jan 23, 2023

On the Convergence of the Gradient Descent Method with Stochastic Fixed-point Rounding Errors under the Polyak-Lojasiewicz Inequality

Lu Xia, Michiel E. Hochstenbach, Stefano Massei

When training neural networks with low-precision computation, rounding errors often cause stagnation or are detrimental to the convergence of the optimizers; in this paper we study the influence of rounding errors on the convergence of the gradient descent method for problems satisfying the Polyak-Łojasiewicz inequality. Within this context, we show that, in contrast, biased stochastic rounding errors may be beneficial since choosing a proper rounding strategy eliminates the vanishing gradient problem and forces the rounding bias in a descent direction. Furthermore, we obtain a bound on the convergence rate that is stricter than the one achieved by unbiased stochastic rounding. The theoretical analysis is validated by comparing the performances of various rounding strategies when optimizing several examples using low-precision fixed-point number formats.

LG · Feb 24, 2022

On the influence of stochastic roundoff errors and their bias on the convergence of the gradient descent method with low-precision floating-point computation

Lu Xia, Stefano Massei, Michiel E. Hochstenbach et al.

When implementing the gradient descent method in low precision, the employment of stochastic rounding schemes helps to prevent stagnation of convergence caused by the vanishing gradient effect. Unbiased stochastic rounding yields zero bias by preserving small updates with probabilities proportional to their relative magnitudes. This study provides a theoretical explanation for the stagnation of the gradient descent method in low-precision computation. Additionally, we propose two new stochastic rounding schemes that trade the zero-bias property for a larger probability to preserve small gradients. Our methods yield a constant rounding bias that, on average, lies in a descent direction. For convex problems, we prove that the proposed rounding methods typically have a beneficial effect on the convergence rate of gradient descent. We validate our theoretical analysis by comparing the performances of various rounding schemes when optimizing a multinomial logistic regression model and when training a simple neural network with an 8-bit floating-point format.
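The stagnation effect and the role of unbiased stochastic rounding described above can be sketched on a toy problem. This is a hedged illustration, not the schemes proposed in the paper: it assumes a uniform grid of spacing `delta` (a fixed-point-style simplification rather than an 8-bit floating-point format), the objective $f(x) = x^2/2$, and the hypothetical helper names `round_nearest`, `round_stochastic`, and `gradient_descent`.

```python
import math
import random


def round_nearest(x, delta):
    """Deterministic round-to-nearest on the grid {k * delta}."""
    return delta * math.floor(x / delta + 0.5)


def round_stochastic(x, delta, rng):
    """Unbiased stochastic rounding: round up with probability equal to the
    fractional position of x between its two grid neighbours, so E[result] = x."""
    lower = math.floor(x / delta)
    frac = x / delta - lower
    return delta * (lower + (1 if rng.random() < frac else 0))


def gradient_descent(x0, lr, steps, rounder):
    """Gradient descent on f(x) = x^2 / 2 (gradient x), rounding each iterate."""
    x = x0
    for _ in range(steps):
        x = rounder(x - lr * x)
    return x


delta = 2.0 ** -4          # assumed grid spacing
rng = random.Random(0)     # fixed seed for reproducibility
x_nearest = gradient_descent(1.0, 0.01, 2000, lambda v: round_nearest(v, delta))
x_stochastic = gradient_descent(1.0, 0.01, 2000, lambda v: round_stochastic(v, delta, rng))
print(x_nearest, x_stochastic)
```

With these assumed parameters the update `lr * x` is smaller than half the grid spacing, so round-to-nearest returns the iterate to its starting grid point at every step, while stochastic rounding occasionally preserves the small update and lets the iterate move toward the minimizer.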

NA · Apr 1, 2019

Solving singular generalized eigenvalue problems by a rank-completing perturbation

Michiel E. Hochstenbach, Christian Mehl, Bor Plestenjak

Generalized eigenvalue problems involving a singular pencil are very challenging to solve, both with respect to accuracy and efficiency. The existing package Guptri is very elegant but may sometimes be time-demanding, even for small and medium-sized matrices. We propose a simple method to compute the eigenvalues of singular pencils, based on one perturbation of the original problem of a certain specific rank. For many problems, the method is both fast and robust. This approach may be seen as a welcome alternative to staircase methods.

NA · Jul 27, 2015

Krylov approximation of ODEs with polynomial parameterization

Antti Koskela, Elias Jarlebring, Michiel E. Hochstenbach

We propose a new numerical method to solve linear ordinary differential equations of the type $\frac{\partial u}{\partial t}(t,\varepsilon) = A(\varepsilon) \, u(t,\varepsilon)$, where $A:\mathbb{C}\rightarrow\mathbb{C}^{n\times n}$ is a matrix polynomial with large and sparse matrix coefficients. The algorithm computes an explicit parameterization of approximations of $u(t,\varepsilon)$ such that approximations for many different values of $\varepsilon$ and $t$ can be obtained with a very small additional computational effort. The derivation of the algorithm is based on a reformulation of the parameterization as a linear parameter-free ordinary differential equation and on approximating the product of the matrix exponential and a vector with a Krylov method. The Krylov approximation is generated with Arnoldi's method and the structure of the coefficient matrix turns out to be independent of the truncation parameter so that it can also be interpreted as Arnoldi's method applied to an infinite dimensional matrix. We prove the superlinear convergence of the algorithm and provide a posteriori error estimates to be used as termination criteria. The behavior of the algorithm is illustrated with examples stemming from spatial discretizations of partial differential equations.

NA · Jun 7, 2015

Roots of bivariate polynomial systems via determinantal representations

Bor Plestenjak, Michiel E. Hochstenbach

We give two determinantal representations for a bivariate polynomial. They may be used to compute the zeros of a system of two of these polynomials via the eigenvalues of a two-parameter eigenvalue problem. The first determinantal representation is suitable for polynomials with scalar or matrix coefficients, and consists of matrices with asymptotic order $n^2/4$, where $n$ is the degree of the polynomial. The second representation is useful for scalar polynomials and has asymptotic order $n^2/6$. The resulting method to compute the roots of a system of two bivariate polynomials is competitive with some existing methods for polynomials up to degree 10, as well as for polynomials with a small number of terms.