NAMar 2, 2013
Transformations of Matrix Structures Work AgainVictor Y. Pan
In 1989 we proposed to employ Vandermonde and Hankel multipliers to transform into each other the matrix structures of Toeplitz, Hankel, Vandermonde and Cauchy types as a means of extending any successful algorithm for the inversion of matrices having one of these structures to inverting the matrices with the structures of the three other types. Surprising power of this approach has been demonstrated in a number of works, which culminated in ingeneous numerically stable algorithms that approximated the solution of a nonsingular Toeplitz linear system in nearly linear (versus previuosly cubic) arithmetic time. We first revisit this powerful method, covering it comprehensively, and then specialize it to yield a similar acceleration of the known algorithms for computations with matrices having structures of Vandermonde or Cauchy types. In particular we arrive at numerically stable approximate multipoint polynomial evaluation and interpolation in nearly linear arithmetic time.
NAApr 17, 2017
Fast Approximate Computations with Cauchy Matrices and PolynomialsVictor Y. Pan
Multipoint polynomial evaluation and interpolation are fundamental for modern symbolic and numerical computing. The known algorithms solve both problems over any field of constants in nearly linear arithmetic time, but the cost grows to quadratic for numerical solution. We fix this discrepancy: our new numerical algorithms run in nearly linear arithmetic time. At first we restate our goals as the multiplication of an n-by-n Vandermonde matrix by a vector and the solution of a Vandermonde linear system of n equations. Then we transform the matrix into a Cauchy structured matrix with some special features. By exploiting them, we approximate the matrix by a generalized hierarchically semiseparable matrix, which is a structured matrix of a different class. Finally we accelerate our solution to the original problems by applying Fast Multipole Method to the latter matrix. Our resulting numerical algorithms run in nearly optimal arithmetic time when they perform the above fundamental computations with polynomials, Vandermonde matrices, transposed Vandermonde matrices, and a large class of Cauchy and Cauchy-like matrices. Some of our techniques may be of independent interest.
NAJun 5, 2016
Low-rank Approximation of a Matrix: Novel Insights, New Progress, and ExtensionsVictor Y. Pan, Liang Zhao
Low-rank approximation of a matrix by means of random sampling has been consistently efficient in its empirical studies by many scientists who applied it with various sparse and structured multipliers, but adequate formal support for this empirical phenomenon has been missing so far. Our novel insight into the subject leads to such an elusive formal support and promises significant acceleration of the known algorithms for some fundamental problems of matrix computations and data mining and analysis. Our formal results and our numerical tests are in good accordance with each other. We also outline extensions of low-rank approximation algorithms and of our progress to the Least Squares Regression, the Fast Multipole Method, and the Conjugate Gradient algorithms.
NAOct 28, 2012
Randomized Matrix ComputationsVictor Y. Pan, Guoliang Qian, Ai-Long Zheng
Random matrices tend to be well conditioned, and we employ this well known property to advance matrix computations. We prove that our algorithms employing Gaussian random matrices are efficient, but in our tests the algorithms have consistently remained as powerful where we used sparse and structured random matrices, defined by much fewer random parameters. We numerically stabilize Gaussian elimination with no pivoting as well as block Gaussian elimination, precondition an ill conditioned linear system of equations, compute numerical rank of a matrix without orthogonalization and pivoting, approximate the singular spaces of an ill conditioned matrix associated with its largest and smallest singular values, and approximate this matrix with low-rank matrices, with applications to its 2-by-2 block triangulation and to tensor decomposition. Some of our results and techniques can be of independent interest, e.g., our estimates for the condition numbers of random Toeplitz and circulant matrices and our variations of the Sherman--Morrison--Woodbury formula.
NADec 19, 2012
Condition Numbers of Random Toeplitz and Circulant MatricesVictor Y. Pan, Guoliang Qian
Estimating the condition numbers of random structured matrices is a well known challenge, linked to the design of efficient randomized matrix algorithms. We deduce such estimates for Gaussian random Toeplitz and circulant matrices. The former estimates can be surprising because the condition numbers grow exponentially in n as n grows to infinity for some large and important classes of n-by-n Toeplitz matrices, whereas we prove the opposit for Gaussian random Toeplitz matrices. Our formal estimates are in good accordance with our numerical tests, except that circulant matrices tend to be even better conditioned according to the tests than according to our formal study.
NAMar 13, 2018
Superfast CUR Matrix Algorithms, Their Pre-Processing and ExtensionsVictor Y. Pan, Qi Luan, John Svadlenka et al.
We study superfast algorithms that computes low rank approximation of a matrix (hereafter referred to as LRA) that use much fewer memory cells and arithmetic operations than the input matrix has entries. We first specify a family of 2mn matrices of size m*n such that for almost 50% of them any superfast LRA algorithm fails to improve the poor trivial approximation by the matrix filled with zeros, but then we prove that the class of all such hard inputs is narrow - the cross-approximation (hereafter {C-A}) superfast iterations as well as some more primitive superfast algorithms compute reasonably accurate LRAs in their transparent CUR form (i) to any matrix allowing close LRA except for small norm perturbations of matrices of an algebraic variety of a smaller dimension, (ii) to the average matrix allowing close LRA, (iii) to the average sparse matrix allowing close LRA and (iv) with a high probability to any matrix allowing close LRA if it is pre-processed fast with a random Gaussian, SRHT or SRFT multiplier. Moreover empirically the output LRAs remain accurate when we perform the computations superfast by replacing such a multiplier with one of our sparse and structured multipliers. Our techniques, auxiliary results and extensions may be of some independent interest. We analyze C-A and other superfast algorithms twice -- based on two well-known sufficient criteria for obtaining accurate LRAs. We provide a distinct proof in the case of superfast variant of randomized algorithms of [DMM08], improve a decade-old estimate for the norm of the inverse of a Gaussian matrix, prove such an estimate also in the case of a sparse Gaussian matrix, present some novel advanced pre-processing techniques for fast and superfast computation of LRA, and extend our results to dramatic acceleration of the Fast Multipole Method (FMM) and the Conjugate Gradient algorithms.
NAMar 16, 2016
New Studies of Randomized Augmentation and Additive PreprocessingVictor Y. Pan, Liang Zhao
1. A standard Gaussian random matrix has full rank with probability 1 and is well-conditioned with a probability quite close to 1 and converging to 1 fast as the matrix deviates from square shape and becomes more rectangular. 2. If we append sufficiently many standard Gaussian random rows or columns to any normalized matrix A, then the augmented matrix has full rank with probability 1 and is well-conditioned with a probability close to 1, even if the matrix A is rank deficient or ill-conditioned. 3. We specify and prove these properties of augmentation and extend them to additive preprocessing, that is, to adding a product of two rectangular Gaussian matrices. 4. By applying our randomization techniques to a matrix that has numerical rank r, we accelerate the known algorithms for the approximation of its leading and trailing singular spaces associated with its r largest and with all its remaining singular values, respectively. 5. Our algorithms use much fewer random parameters and run much faster when various random sparse and structured preprocessors replace Gaussian. Empirically the outputs of the resulting algorithms is as accurate as the outputs under Gaussian preprocessing. 6. Our novel duality techniques provides formal support, so far missing, for these empirical observations and opens door to derandomization of our preprocessing and to further acceleration and simplification of our algorithms by using more efficient sparse and structured preprocessors. 7. Our techniques and our progress can be applied to various other fundamental matrix computations such as the celebrated low-rank approximation of a matrix by means of random sampling.
NAJun 30, 2024
Polynomial Root-Finding and Algebraic Eigenvalue ProblemVictor Y. Pan
Univariate polynomial root-finding has been studied for four millennia and very intensively in the last decades. Our new near-optimal root-finders approximate all zeros of a polynomial p almost as fast as one accesses its coefficients with the precision required for the solution within a prescribed error bound. Furthermore, our root-finders can be applied to a black box polynomial, defined by an oracle (black box subroutine) for its evaluation rather than by its coefficients. Due to this feature our root-finders support approximation of the eigenvalues of a matrix in a record Las Vegas expected bit operation time and are particularly fast for a polynomial that can be evaluated fast such as the sum of a few shifted monomials or a Mandelbrot-like polynomial defined by a recurrence. Our divide and conquer algorithm of ACM STOC 1995 is the only other known near-optimal polynomial root-finder, but it extensively uses the coefficients, is quite involved, and has never been implemented, while according to extensive numerical experiments with standard test polynomials, already a slower initial implementation of our new root-finders competes with user's choice package of root-finding subroutines MPSolve and supersedes it more and more significantly as the degree of a polynomial grows large. We elaborate upon the design and analysis of our algorithms, comment on their potential heuristic acceleration, and briefly cover polynomial root-finding by means of functional iterations. Our techniques can be of independent interest.
NAFeb 3, 2018
Condition Estimation by Means of the Power MethodVictor Y. Pan
We estimate the condition number of a matrix by applying the Power Method, that is essentially a sequence of matrix-by-vector multiplications, similarly to the Lanczos method.
NAApr 13, 2017
Enhancing the Power of Cardinal's AlgorithmVictor Y. Pan
Cardinal's factorization algorithm of 1996 splits a univariate polynomial into two factors with root sets separated by the imaginary axis, which is an important goal itself and a basic step toward root-finding. The novelty of the algorithm and its potential power have been well recognized by experts immediately, but by 2016, that is, two decades later, its practical value still remains nil, particularly because of the high computational cost of performing its final stage by means of computing approximate greatest common divisor of two polynomials. We briefly recall Cardinal's algorithm and its difficulties, amend it based on some works performed since 1996, extend its power to splitting out factors of a more general class, and reduce the final stage of the algorithm to quite manageable computations with structured matrices. Some of our techniques can be of independent interest for matrix computations.
NADec 19, 2016
New Results on CGR/CUR Approximation of a MatrixVictor Y. Pan
CUR and low-rank approximations are among most fundamental subjects of numerical linear algebra, with a wide range of applications to a variety of highly important areas of modern computing, which range from the machine learning theory and neural networks to data mining and analysis. We first dramatically accelerate computation of such approximations for the average input matrix, then show some narrow classes of hard inputs for our algorithms, and finally point out a tentative direction to narrowing such classes further by means of pre-processing with quasi Gaussian structured multipliers. Our extensive numerical tests with a variety of real world inputs for regularization from Singular Matrix Database have consistently produced reasonably close CUR approximations at a low computational cost. There is a variety of efficient applications of our results and our techniques to important subjects of matrix computations. Our study provides new insights and enable design of faster algorithms for low-rank approximation by means of sampling and oversampling, for Gaussian elimination with no pivoting and block Gaussian elimination, and for the approximation of railing singular spaces associated with the $ν$ smallest singular values of a matrix having numerical nullity. We conclude the paper with novel extensions of our acceleration to the Fast Multipole Method and the Conjugate Gradient Algorithms. (This report is an outline of our paper in arXiv:1611.01391.)
11.5NAMay 5
New Combinations of Polynomial Root-Finding IterationsVictor Y. Pan
The near-optimal polynomial root-finders of 2024-25, based on subdivision iterations, approximate all complex roots of a polynomial or all roots lying in a fixed Region of Interest in the complex plane and can be applied to a black box polynomial, represented by an oracle (black box subroutine) for its evaluation rather than in monomial basis - by coefficients. Towards further empirical acceleration we combine them with two other popular root-finders, Ehrlich's (aka Aberth's) and modified Newton's. They have weaker formal support but compete empirically with user's current choice root-finder MPSolve for approximation of all complex roots under proper initialization that involve polynomial coefficients. Their combinations with subdivision iterations can be applied to a black box polynomial and to root-finding in a fixed Region and promise to support empirical acceleration versus each approach standing alone. For a by-product of our study, we naturally extend the Gauss-Lucas theorem; this can be of independent interest.
NASep 23, 2018
Superfast Low-Rank Approximation and Least Squares RegressionVictor Y. Pan, Qi Luan, John Svadlenka et al.
Low Rank Approximation is among most fundamental subjects of numerical linear algebra having important applications to various areas of modern computing and %they range from machine learning theory and %neural networks to data mining and analysis. The known algorithms compute such approximations by using more flops than the input matrix has entries, but we prove that much fewer flops than entries are sufficient in the case of the average input ("flop" stands for "floating point arithmetic operation"). We prove this twice -- for the solutions by means of two distinct algorithms, and we analyze them by applying two different approaches. Our analysis of both algorithms is quite involved, but we devise them mostly by simplifying, combining, and ameliorating the known techniques, although we propose some technical novelties for further enhancing the performance of the popular Cross-Approximation Algorithms. They are highly efficient empirically, and we prove that they are efficient for the average input. We specify some narrow classes of hard inputs for which the presented algorithms fail with high probability even when we randomize them, but we narrow such classes further by means of preprocessing with new sparse and structured multipliers. The average complexity estimates do not cover many realistic input classes, but our formal analysis is in good accordance with the results of our tests applied to benchmark inputs from discretized PDEs and Integral quations and to random inputs. Our work should already be of practical value but also leads to research challenges. At the end we list some of them, propose two novel extensions of our progress -- to the acceleration of the Fast Multipole Method and Conjugate Gradient algorithms, and explore and slightly extend the recent techniques of Osinsky, which enhance the output accuracy of CUR Approximation.
NAJul 10, 2015
How Bad Are Vandermonde Matrices?Victor Y. Pan
The work on the estimation of the condition numbers of Vandermonde matrices, motivated by applications to interpolation and quadrature, can be traced back at least to the 1970s. Empirical study has shown consistently that Vandermonde matrices tend to be badly ill-conditioned, with a narrow class of notable exceptions, such as the matrices of the discrete Fourier transform (hereafter referred to as DFT). So far formal support for this empirical observation, however, has been limited to the matrices defined by the real set of knots. We prove that, more generally, any Vandermonde matrix of a large size is badly ill-conditioned unless its knots are more or less equally spaced on or about the circle $C(0,1)=\{x:\,|x|=1\}$. The matrices of DFT are perfectly conditioned, being defined by a cyclic sequence of knots, equally spaced on that circle, but we prove that even a slight modification of the knots into the so called quasi-cyclic sequence on this circle defines badly ill-conditioned Vandermonde matrices. Likewise we prove that the half-size leading block of a large DFT matrix is badly ill-conditioned. (This result was motivated by an application to pre-conditioning of an input matrix for Gaussian elimination with no pivoting.) Our analysis involves the Ekkart--Young theorem, the Vandermonde-to-Cauchy transformation of matrix structure, our new inversion formula for a Cauchy matrix, and low-rank approximation of its large submatrices.
NAJun 15, 2015
Polynomial Root Isolation by Means of Root Radii ApproximationVictor Y. Pan, Liang Zhao
Univariate polynomial root-finding is a classical subject, still important for modern computing. Frequently one seeks just the real roots of a real coefficient polynomial. They can be approximated at a low computational cost if the polynomial has no nonreal roots, but for high degree polynomials, nonreal roots are typically much more numerous than the real ones. The challenge is known for long time, and the subject has been intensively studied. The Boolean cost bounds for the refinement of the simple and isolated real roots have been decreased to nearly optimal, but the success has been more limited at the stage of the isolation of real roots. We obtain substantial progress by applying the algorithm of of 1982 by Schoenhage for the approximation of the root radii, that is, the distances of the roots to the origin. Namely we isolate the simple and well-conditioned real roots of a polynomial at the Boolean cost dominated by the nearly optimal bounds for the refinement of such roots. We also extend our algorithm to the isolation of complex, possibly multiple, roots and root clusters staying within the same (nearly optimal) asymptotic Boolean cost bound. Our numerical tests with benchmark polynomials performed with the IEEE standard double precision show that our nearly optimal real root-finder is practically promising. Our techniques are simple, and their power and application range may increase in combination with the known efficient methods.
NADec 17, 2014
Random Multipliers Numerically Stabilize Gaussian and Block Gaussian Elimination: Proofs and an Extension to Low-rank ApproximationVictor Y. Pan, Guoliang Qian, Xiaodong Yan
We prove that standard Gaussian random multipliers are expected to numerically stabilize both Gaussian elimination with no pivoting and block Gaussian elimination. Moreover we prove that such a multiplier (even without the customary oversampling) is expected to support low-rank approximation of a matrix. Our test results are in good accordance with this analysis. Empirically random circulant or Toeplitz multipliers are as efficient as Gaussian ones, but their formal support is more problematic.
NANov 6, 2014
Better Late Than Never: Filling a Void in the History of Fast Matrix Multiplication and Tensor DecompositionsVictor Y. Pan
Multilinear and tensor decompositions are a popular tool in linear and multilinear algebra and have a wide range of important applications to modern computing. Our paper of 1972 presented the first nontrivial application of such decompositions to fundamental matrix computations and was also a landmark in the history of the acceleration of matrix multiplication. Published in 1972 in Russian, it has never been translated into English. It has been very rarely cited in the Western literature on matrix multiplication and never in the works on multilinear and tensor decompositions. This motivates us to present its translation into English, together with our brief comments on its impact on the two fields.