NAApr 12, 2010
Communication-optimal Parallel and Sequential Cholesky DecompositionGrey Ballard, James Demmel, Olga Holtz et al.
Numerical algorithms have two kinds of costs: arithmetic and communication, by which we mean either moving data between levels of a memory hierarchy (in the sequential case) or over a network connecting processors (in the parallel case). Communication costs often dominate arithmetic costs, so it is of interest to design algorithms minimizing communication. In this paper we first extend known lower bounds on the communication cost (both for bandwidth and for latency) of conventional (O(n^3)) matrix multiplication to Cholesky factorization, which is used for solving dense symmetric positive definite linear systems. Second, we compare the costs of various Cholesky decomposition implementations to these lower bounds and identify the algorithms and data structures that attain them. In the sequential case, we consider both the two-level and hierarchical memory models. Combined with prior results in [13, 14, 15], this gives a set of communication-optimal algorithms for O(n^3) implementations of the three basic factorizations of dense linear algebra: LU with pivoting, QR and Cholesky. But it goes beyond this prior work on sequential LU by optimizing communication for any number of levels of memory hierarchy.
DSSep 11, 2012
Graph Expansion Analysis for Communication Costs of Fast Rectangular Matrix MultiplicationGrey Ballard, James Demmel, Olga Holtz et al.
Graph expansion analysis of computational DAGs is useful for obtaining communication cost lower bounds where previous methods, such as geometric embedding, are not applicable. This has recently been demonstrated for Strassen's and Strassen-like fast square matrix multiplication algorithms. Here we extend the expansion analysis approach to fast algorithms for rectangular matrix multiplication, obtaining a new class of communication cost lower bounds. These apply, for example to the algorithms of Bini et al. (1979) and the algorithms of Hopcroft and Kerr (1971). Some of our bounds are proved to be optimal.
NADec 24, 2007
Accurate and Efficient Expression Evaluation and Linear AlgebraJames Demmel, Ioana Dumitriu, Olga Holtz et al.
We survey and unify recent results on the existence of accurate algorithms for evaluating multivariate polynomials, and more generally for accurate numerical linear algebra with structured matrices. By "accurate" we mean that the computed answer has relative error less than 1, i.e., has some correct leading digits. We also address efficiency, by which we mean algorithms that run in polynomial time in the size of the input. Our results will depend strongly on the model of arithmetic: Most of our results will use the so-called Traditional Model (TM). We give a set of necessary and sufficient conditions to decide whether a high accuracy algorithm exists in the TM, and describe progress toward a decision procedure that will take any problem and provide either a high accuracy algorithm or a proof that none exists. When no accurate algorithm exists in the TM, it is natural to extend the set of available accurate operations by a library of additional operations, such as $x+y+z$, dot products, or indeed any enumerable set which could then be used to build further accurate algorithms. We show how our accurate algorithms and decision procedure for finding them extend to this case. Finally, we address other models of arithmetic, and the relationship between (im)possibility in the TM and (in)efficient algorithms operating on numbers represented as bit strings.
NAAug 28, 2007
Fast linear algebra is stableJames Demmel, Ioana Dumitriu, Olga Holtz
In an earlier paper, we showed that a large class of fast recursive matrix multiplication algorithms is stable in a normwise sense, and that in fact if multiplication of $n$-by-$n$ matrices can be done by any algorithm in $O(n^{ω+ η})$ operations for any $η> 0$, then it can be done stably in $O(n^{ω+ η})$ operations for any $η> 0$. Here we extend this result to show that essentially all standard linear algebra operations, including LU decomposition, QR decomposition, linear equation solving, matrix inversion, solving least squares problems, (generalized) eigenvalue problems and the singular value decomposition can also be done stably (in a normwise sense) in $O(n^{ω+ η})$ operations.
NADec 7, 2006
Fast matrix multiplication is stableJames Demmel, Ioana Dumitriu, Olga Holtz et al.
We perform forward error analysis for a large class of recursive matrix multiplication algorithms in the spirit of [D. Bini and G. Lotti, Stability of fast algorithms for matrix multiplication, Numer. Math. 36 (1980), 63--72]. As a consequence of our analysis, we show that the exponent of matrix multiplication (the optimal running time) can be achieved by numerically stable algorithms. We also show that new group-theoretic algorithms proposed in [H. Cohn, and C. Umans, A group-theoretic approach to fast matrix multiplication, FOCS 2003, 438--449] and [H. Cohn, R. Kleinberg, B. Szegedy and C. Umans, Group-theoretic algorithms for matrix multiplication, FOCS 2005, 379--388] are all included in the class of algorithms to which our analysis applies, and are therefore numerically stable. We perform detailed error analysis for three specific fast group-theoretic algorithms.
NAJan 18, 2006
Toward accurate polynomial evaluation in rounded arithmeticJames Demmel, Ioana Dumitriu, Olga Holtz
Given a multivariate real (or complex) polynomial $p$ and a domain $\cal D$, we would like to decide whether an algorithm exists to evaluate $p(x)$ accurately for all $x \in {\cal D}$ using rounded real (or complex) arithmetic. Here ``accurately'' means with relative error less than 1, i.e., with some correct leading digits. The answer depends on the model of rounded arithmetic: We assume that for any arithmetic operator $op(a,b)$, for example $a+b$ or $a \cdot b$, its computed value is $op(a,b) \cdot (1 + δ)$, where $| δ|$ is bounded by some constant $ε$ where $0 < ε\ll 1$, but $δ$ is otherwise arbitrary. This model is the traditional one used to analyze the accuracy of floating point algorithms.Our ultimate goal is to establish a decision procedure that, for any $p$ and $\cal D$, either exhibits an accurate algorithm or proves that none exists. In contrast to the case where numbers are stored and manipulated as finite bit strings (e.g., as floating point numbers or rational numbers) we show that some polynomials $p$ are impossible to evaluate accurately. The existence of an accurate algorithm will depend not just on $p$ and $\cal D$, but on which arithmetic operators and which constants are are available and whether branching is permitted. Toward this goal, we present necessary conditions on $p$ for it to be accurately evaluable on open real or complex domains ${\cal D}$. We also give sufficient conditions, and describe progress toward a complete decision procedure. We do present a complete decision procedure for homogeneous polynomials $p$ with integer coefficients, ${\cal D} = \C^n$, and using only the arithmetic operations $+$, $-$ and $\cdot$.
RADec 27, 2005
Not all GKK $τ$-matrices are stableOlga Holtz
Hermitian positive definite, totally positive, and nonsingular M-matrices enjoy many common properties, in particular: (A) positivity of all principal minors, (B) weak sign symmetry, (C) eigenvalue monotonicity, (D) positive stability. The class of GKK matrices is defined by properties (A) and (B), whereas the class of nonsingular $τ$-matrices by (A) and (C). It was conjectured that: (A), (B) implies (D) [D. Carlson, J. Res. Nat. Bur. Standards Sect. B 78 (1974) 1-2], (A), (C) implies (D) [G.M. Engel and H. Schneider, Linear and Multilinear Algebra 4 (1976) 155-176], (A), (B) implies a property stronger than (D) [R. Varga, Numerical Methods in Linear Algebra, 1978, pp. 5-15], (A), (B), (C) implies (D) [D. Hershkowitz, Linear Algebra Appl. 171 (1992) 161-186]. We describe a class of unstable GKK $τ$-matrices, thus disproving all four conjectures.
RADec 27, 2005
On convergence of infinite matrix productsOlga Holtz
A necessary and sufficient condition for the convergence of an infinite right product of matrices of the form | I B | A = | 0 C |, with (uniformly) contracting submatrices $C$, is proven.
RAMay 10, 2005
The inverse eigenvalue problem for symmetric anti-bidiagonal matricesOlga Holtz
The inverse eigenvalue problem for real symmetric matrices of the form 0 0 0 . 0 0 * 0 0 0 . 0 * * 0 0 0 . * * 0 . . . . . . . 0 0 * . 0 0 0 0 * * . 0 0 0 * * 0 . 0 0 0 is solved. The solution is shown to be unique. The problem is also shown to be equivalent to the inverse eigenvalue problem for a certain subclass of Jacobi matrices.