Per-Gunnar Martinsson

NA
19papers
15citations
Novelty39%
AI Score38

19 Papers

COMar 19, 2011
An algorithm for the principal component analysis of large data sets

Nathan Halko, Per-Gunnar Martinsson, Yoel Shkolnisky et al.

Recently popularized randomized methods for principal component analysis (PCA) efficiently and reliably produce nearly optimal accuracy --- even on parallel processors --- unlike the classical (deterministic) alternatives. We adapt one of these randomized methods for use with data sets that are too large to be stored in random-access memory (RAM). (The traditional terminology is that our procedure works efficiently "out-of-core.") We illustrate the performance of the algorithm via several numerical examples. For example, we report on the PCA of a data set stored on disk that is so large that less than a hundredth of it can fit in our computer's RAM.

NAMay 26, 2011
A direct solver with O(N) complexity for integral equations on one-dimensional domains

Adrianna Gillman, Patrick Young, Per-Gunnar Martinsson

An algorithm for the direct inversion of the linear systems arising from Nystrom discretization of integral equations on one-dimensional domains is described. The method typically has O(N) complexity when applied to boundary integral equations (BIEs) in the plane with non-oscillatory kernels such as those associated with the Laplace and Stokes' equations. The scaling coefficient suppressed by the "big-O" notation depends logarithmically on the requested accuracy. The method can also be applied to BIEs with oscillatory kernels such as those associated with the Helmholtz and Maxwell equations; it is efficient at long and intermediate wave-lengths, but will eventually become prohibitively slow as the wave-length decreases. To achieve linear complexity, rank deficiencies in the off-diagonal blocks of the coefficient matrix are exploited. The technique is conceptually related to the H and H^2 matrix arithmetic of Hackbusch and co-workers, and is closely related to previous work on Hierarchically Semi-Separable matrices.

NAFeb 7, 2019
Randomized methods for matrix computations

Per-Gunnar Martinsson

The purpose of this text is to provide an accessible introduction to a set of recently developed algorithms for factorizing matrices. These new algorithms attain high practical speed by reducing the dimensionality of intermediate computations using randomized projections. The algorithms are particularly powerful for computing low-rank approximations to very large matrices, but they can also be used to accelerate algorithms for computing full factorizations of matrices. A key competitive advantage of the algorithms described is that they require less communication than traditional deterministic methods.

NAFeb 25, 2013
An O(N) algorithm for constructing the solution operator to 2D elliptic boundary value problems in the absence of body loads

Adrianna Gillman, Per-Gunnar Martinsson

The large sparse linear systems arising from the finite element or finite difference discretization of elliptic PDEs can be solved directly via, e.g., nested dissection or multifrontal methods. Such techniques reorder the nodes in the grid to reduce the asymptotic complexity of Gaussian elimination from $O(N^{2})$ to $O(N^{1.5})$ for typical problems in two dimensions. It has recently been demonstrated that the complexity can be further reduced to O(N) by exploiting structure in the dense matrices that arise in such computations (using, e.g., $\mathcal{H}$-matrix arithmetic). This paper demonstrates that such \textit{accelerated} nested dissection techniques become particularly effective for boundary value problems without body loads when the solution is sought for several different sets of boundary data, and the solution is required only near the boundary (as happens, e.g., in the computational modeling of scattering problems, or in engineering design of linearly elastic solids.

NAMar 3, 2017
randUTV: A blocked randomized algorithm for computing a rank-revealing UTV factorization

Per-Gunnar Martinsson, Gregorio Quintana-Orti, Nathan Heavner

This manuscript describes the randomized algorithm randUTV for computing a so called UTV factorization efficiently. Given a matrix $A$, the algorithm computes a factorization $A = UTV^{*}$, where $U$ and $V$ have orthonormal columns, and $T$ is triangular (either upper or lower, whichever is preferred). The algorithm randUTV is developed primarily to be a fast and easily parallelized alternative to algorithms for computing the Singular Value Decomposition (SVD). randUTV provides accuracy very close to that of the SVD for problems such as low-rank approximation, solving ill-conditioned linear systems, determining bases for various subspaces associated with the matrix, etc. Moreover, randUTV produces highly accurate approximations to the singular values of $A$. Unlike the SVD, the randomized algorithm proposed builds a UTV factorization in an incremental, single-stage, and non-iterative way, making it possible to halt the factorization process once a specified tolerance has been met. Numerical experiments comparing the accuracy and speed of randUTV to the SVD are presented. These experiments demonstrate that in comparison to column pivoted QR, which is another factorization that is often used as a relatively economic alternative to the SVD, randUTV compares favorably in terms of speed while providing far higher accuracy.

NADec 8, 2016
An accelerated Poisson solver based on multidomain spectral discretization

Tracy Babb, Adrianna Gillman, Sijia Hao et al.

This paper presents a numerical method for variable coefficient elliptic PDEs with mostly smooth solutions on two dimensional domains. The PDE is discretized via a multi-domain spectral collocation method of high local order (order 30 and higher have been tested and work well). Local mesh refinement results in highly accurate solutions even in the presence of local irregular behavior due to corner singularities, localized loads, etc. The system of linear equations attained upon discretization is solved using a direct (as opposed to iterative) solver with $O(N^{1.5})$ complexity for the factorization stage and $O(N \log N)$ complexity for the solve. The scheme is ideally suited for executing the elliptic solve required when parabolic problems are discretized via time-implicit techniques. In situations where the geometry remains unchanged between time-steps, very fast execution speeds are obtained since the solution operator for each implicit solve can be pre-computed.

NAJun 13, 2008
Rapid factorization of structured matrices via randomized sampling

Per-Gunnar Martinsson

Randomized sampling has recently been demonstrated to be an efficient technique for computing approximate low-rank factorizations of matrices for which fast methods for computing matrix vector products are available. This paper describes an extension of such techniques to a wider class of matrices that are not themselves rank-deficient, but have off-diagonal blocks that are. Such matrices arise frequently in numerical analysis and signal processing, and there exist several methods for rapidly performing algebraic operations (matrix-vector multiplications, matrix factorizations, matrix inversion, \textit{etc}) on them once low-rank approximations to all off-diagonal blocks have been constructed. The paper demonstrates that if such a matrix can be applied to a vector in O(N) time, where the matrix is of size $N\times N$, and if individual entries of the matrix can be computed rapidly, then in many cases, the task of constructing approximate low-rank factorizations for all off-diagonal blocks can be performed in $O(N k^{2})$ time, where $k$ is an upper bound for the numerical rank of the off-diagonal blocks.

NANov 12, 2018
HPS Accelerated Spectral Solvers for Time Dependent Problems

Tracy Babb, Per-Gunnar Martinsson, Daniel Appelo

A high-order convergent numerical method for solving linear and non-linear parabolic PDEs is presented. The time-stepping is done via an explicit, singly diagonally implicit Runge-Kutta (ESDIRK) method of order 4 or 5, and for the implicit solve, we use the recently developed "Hierarchial Poincare-Steklov (HPS)" method. The HPS method combines a multidomain spectral collocation discretization technique (a "patching method") with a nested-dissection type direct solver. In the context under consideration, the elliptic solve required in each time-step involves the same coefficient matrix, which makes the use of a direct solver particularly effective. The manuscript describes the methodology and presents numerical experiments.

NAJan 18, 2011
A high-order accurate discretization scheme for variable coefficient elliptic PDEs in the plane with smooth solutions

Per-Gunnar Martinsson

A discretization scheme for variable coefficient elliptic PDEs in the plane is presented. The scheme is based on high-order Gaussian quadratures and is designed for problems with smooth solutions, such as scattering problems involving soft scatterers. The resulting system of linear equations is very well suited to efficient direct solvers such as nested dissection and the more recently proposed accelerated nested dissection schemes with O(N) complexity.

NAMar 22
Accelerating a restarted Krylov method for matrix functions with randomization

Nicolas L. Guidotti, Per-Gunnar Martinsson, Juan A. Acebrón et al.

Many scientific applications require the evaluation of the action of the matrix function over a vector and the most common methods for this task are those based on the Krylov subspace. Since the orthogonalization cost and memory requirement can quickly become overwhelming as the basis grows, the Krylov method is often restarted after a few iterations. This paper proposes a new acceleration technique for restarted Krylov methods based on randomization. The numerical experiments show that the randomized method greatly outperforms the classical approach with the same level of accuracy. In fact, randomization can actually improve the convergence rate of restarted methods in some cases. The paper also compares the performance and stability of the randomized methods proposed so far for solving very large ill-conditioned problems, complementing the numerical analyses from previous studies.

NAMar 27, 2019
Efficient nuclear norm approximation via the randomized UTV algorithm

Nathan Heavner, Per-Gunnar Martinsson

The recently introduced algorithm randUTV provides a highly efficient technique for computing accurate approximations to all the singular values of a given matrix $A$. The original version of randUTV was designed to compute a full factorization of the matrix in the form $A = UTV^*$ where $U$ and $V$ are orthogonal matrices, and $T$ is upper triangular. The estimates to the singular values of $A$ appear along the diagonal of $T$. This manuscript describes how the randUTV algorithm can be modified when the only quantity of interest being sought is the vector of approximate singular values. The resulting method is particularly effective for computing the nuclear norm of $A$, or more generally, other Schatten-$p$ norms. The report also describes how to compute an estimate of the errors incurred, at essentially negligible cost.

NAMar 19
A stable and fast method for solving multibody scattering problems via the method of fundamental solutions

Yunhui Cai, Joar Bagge, Per-Gunnar Martinsson

The paper describes a numerical method for solving acoustic multibody scattering problems in two and three dimensions. The idea is to compute a highly accurate approximation to the scattering operator for each body through a local computation, and then use these scattering matrices to form a global linear system. The resulting coefficient matrix is relatively well-conditioned, even for problems involving a very large number of scatterers. The linear system is amenable to iterative solvers, and can readily be accelerated via fast algorithms for the matrix-vector multiplication such as the fast multipole method. The key point of the work is that the local scattering matrices can be constructed using potentially ill-conditioned techniques such as the method of fundamental solutions (MFS), while still maintaining scalability and numerical stability of the global solver. The resulting algorithm is simple, as the MFS is far simpler to implement than alternative techniques based on discretizing boundary integral equations using Nyström or Galerkin.

NAMay 14, 2013
An O(N) Direct Solver for Integral Equations on the Plane

Eduardo Corona, Per-Gunnar Martinsson, Denis Zorin

An efficient direct solver for volume integral equations with O(N) complexity for a broad range of problems is presented. The solver relies on hierarchical compression of the discretized integral operator, and exploits that off-diagonal blocks of certain dense matrices have numerically low rank. Technically, the solver is inspired by previously developed direct solvers for integral equations based on "recursive skeletonization" and "Hierarchically Semi-Separable" (HSS) matrices, but it improves on the asymptotic complexity of existing solvers by incorporating an additional level of compression. The resulting solver has optimal O(N) complexity for all stages of the computation, as demonstrated by both theoretical analysis and numerical examples. The computational examples further display good practical performance in terms of both speed and memory usage. In particular, it is demonstrated that even problems involving 10^{7} unknowns can be solved to precision 10^{-10} using a simple Matlab implementation of the algorithm executed on a single core.

AIApr 19, 2021
Randomized Algorithms for Scientific Computing (RASC)

Aydin Buluc, Tamara G. Kolda, Stefan M. Wild et al.

Randomized algorithms have propelled advances in artificial intelligence and represent a foundational research area in advancing AI for Science. Future advancements in DOE Office of Science priority areas such as climate science, astrophysics, fusion, advanced materials, combustion, and quantum computing all require randomized algorithms for surmounting challenges of complexity, robustness, and scalability. This report summarizes the outcomes of that workshop, "Randomized Algorithms for Scientific Computing (RASC)," held virtually across four days in December 2020 and January 2021.

MSFeb 17, 2020
Computing rank-revealing factorizations of matrices stored out-of-core

Nathan Heavner, Per-Gunnar Martinsson, Gregorio Quintana-Ortí

This paper describes efficient algorithms for computing rank-revealing factorizations of matrices that are too large to fit in RAM, and must instead be stored on slow external memory devices such as solid-state or spinning disk hard drives (out-of-core or out-of-memory). Traditional algorithms for computing rank revealing factorizations, such as the column pivoted QR factorization, or techniques for computing a full singular value decomposition of a matrix, are very communication intensive. They are naturally expressed as a sequence of matrix-vector operations, which become prohibitively expensive when data is not available in main memory. Randomization allows these methods to be reformulated so that large contiguous blocks of the matrix can be processed in bulk. The paper describes two distinct methods. The first is a blocked version of column pivoted Householder QR, organized as a "left-looking" method to minimize the number of write operations (which are more expensive than read operations on a spinning disk drive). The second method results in a so called UTV factorization which expresses a matrix $A$ as $A = U T V^*$ where $U$ and $V$ are unitary, and $T$ is triangular. This method is organized as an algorithm-by-blocks, in which floating point operations overlap read and write operations. The second method incorporates power iterations, and is exceptionally good at revealing the numerical rank; it can often be used as a substitute for a full singular value decomposition. Numerical experiments demonstrate that the new algorithms are almost as fast when processing data stored on a hard drive as traditional algorithms are for data stored in main memory. To be precise, the computational time for fully factorizing an $n\times n$ matrix scales as $cn^{3}$, with a scaling constant $c$ that is only marginally larger when the matrix is stored out of core.

NAAug 29, 2016
RSVDPACK: An implementation of randomized algorithms for computing the singular value, interpolative, and CUR decompositions of matrices on multi-core and GPU architectures

Sergey Voronin, Per-Gunnar Martinsson

RSVDPACK is a library of functions for computing low rank approximations of matrices. The library includes functions for computing standard (partial) factorizations such as the Singular Value Decomposition (SVD), and also so called "structure preserving" factorizations such as the Interpolative Decomposition (ID) and the CUR decomposition. The ID and CUR factorizations pick subsets of the rows/columns of a matrix to use as bases for its row/column space. Such factorizations preserve properties of the matrix such as sparsity or non-negativity, are helpful in data interpretation, and require in certain contexts less memory than a partial SVD. The package implements highly efficient computational algorithms based on randomized sampling, as described and analyzed in [N. Halko, P.G. Martinsson, J. Tropp, "Finding structure with randomness: Probabilistic algorithms for constructing approximate matrix decompositions," SIAM Review, 53(2), 2011], and subsequent papers. This manuscript presents some modifications to the basic algorithms that improve performance and ease of use. The library is written in C and supports both multi-core CPU and GPU architectures.

NADec 14, 2010
Finding structure with randomness: Probabilistic algorithms for constructing approximate matrix decompositions

Nathan Halko, Per-Gunnar Martinsson, Joel A. Tropp

Low-rank matrix approximations, such as the truncated singular value decomposition and the rank-revealing QR decomposition, play a central role in data analysis and scientific computing. This work surveys and extends recent research which demonstrates that randomization offers a powerful tool for performing low-rank matrix approximation. These techniques exploit modern computational architectures more fully than classical methods and open the possibility of dealing with truly massive data sets. This paper presents a modular framework for constructing randomized algorithms that compute partial matrix decompositions. These methods use random sampling to identify a subspace that captures most of the action of a matrix. The input matrix is then compressed---either explicitly or implicitly---to this subspace, and the reduced matrix is manipulated deterministically to obtain the desired low-rank factorization. In many cases, this approach beats its classical competitors in terms of accuracy, speed, and robustness. These claims are supported by extensive numerical experiments and a detailed error analysis.

NAFeb 9, 2010
A Direct Solver for the Rapid Solution of Boundary Integral Equations on Axisymmetric Surfaces in Three Dimensions

Patrick M. Young, Per-Gunnar Martinsson

A scheme for rapidly and accurately computing solutions to boundary integral equations (BIEs) on rotationally symmetric surfaces in three dimensions is presented. The scheme uses the Fourier transform to reduce the original BIE defined on a surface to a sequence of BIEs defined on a generating curve for the surface. It can handle loads that are not necessarily rotationally symmetric. Nystrom discretization is used to discretize the BIEs on the generating curve. The quadrature used is a high-order Gaussian rule that is modified near the diagonal to retain high-order accuracy for singular kernels. The reduction in dimensionality, along with the use of high-order accurate quadratures, leads to small linear systems that can be inverted directly via, e.g., Gaussian elimination. This makes the scheme particularly fast in environments involving multiple right hand sides. It is demonstrated that for BIEs associated with Laplace's equation, the kernel in the reduced equations can be evaluated very rapidly by exploiting recursion relations for Legendre functions. Numerical examples illustrate the performance of the scheme; in particular, it is demonstrated that for a BIE associated with Laplace's equation on a surface discretized using 320 000 points, the set-up phase of the algorithm takes 2 minutes on a standard desktop, and then solves can be executed in 0.5 seconds.

NAJun 29, 2007
A fast direct solver for network matrices

Per-Gunnar Martinsson

A fast direct inversion scheme for the large sparse systems of linear equations resulting from the discretization of elliptic partial differential equations in two dimensions is given. The scheme is described for the particular case of a discretization on a uniform square grid, but can be generalized to more general geometries. For a grid containing $N$ points, the scheme requires $O(N \log^{2}N)$ arithmetic operations and $O(N \log N)$ storage to compute an approximate inverse. If only a single solve is required, then the scheme requires only $O(\sqrt{N} \log N)$ storage; the same storage is sufficient for computing the Dirichlet-to-Neumann operator as well as other boundary-to-boundary operators. The scheme is illustrated with several numerical examples. For instance, a matrix of size $10^6 \times 10^6$ is inverted to seven digits accuracy in four minutes on a 2.8GHz P4 desktop PC.