PSelInv - A Distributed Memory Parallel Algorithm for Selected Inversion: The Non-Symmetric Case
The paper addresses the need for scalable selected inversion of non-symmetric sparse matrices in high-performance computing, but it is an incremental extension of prior work.
This paper extends the PSelInv parallel selected inversion algorithm to non-symmetric sparse matrices, achieving efficient scaling to 6,400 cores for various matrices.
This paper generalizes the parallel selected inversion algorithm PSelInv to sparse non-symmetric matrices. We assume a general sparse matrix A has been decomposed as PAQ = LU on a distributed memory parallel machine, where L and U are lower and upper triangular matrices, respectively, and P and Q are permutation matrices. The PSelInv method computes selected elements of A^{-1}. The selection is confined by the sparsity pattern of the matrix A^T. Our algorithm does not assume any symmetry properties of A, and our parallel implementation is memory efficient, in the sense that the computed elements of A^{-T} overwrite the sparse matrix L+U in situ. PSelInv involves a large number of collective data communication activities within different processor groups of various sizes. In order to minimize idle time and improve load balancing, tree-based asynchronous communication is used to coordinate all such collective communication. Numerical results demonstrate that PSelInv can scale efficiently to 6,400 cores for a variety of matrices.
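To make the phrase "selected elements of A^{-1}" concrete, the following minimal Python sketch shows which entries are kept for a small non-symmetric matrix. It is not the PSelInv algorithm: it forms the full dense inverse by Gauss-Jordan elimination and only then keeps the entries at positions where A^T is nonzero, whereas the point of selected inversion is to obtain those entries without forming the full inverse. The function names `invert` and `selected_inverse` and the 3x3 example matrix are assumptions made for illustration.

```python
from fractions import Fraction


def invert(a):
    """Dense Gauss-Jordan inverse using exact rational arithmetic.

    Illustrative stand-in only; raises StopIteration if `a` is singular.
    """
    n = len(a)
    # Build the augmented matrix [A | I].
    m = [[Fraction(x) for x in row] + [Fraction(int(i == j)) for j in range(n)]
         for i, row in enumerate(a)]
    for col in range(n):
        # Swap in the first row at or below `col` with a nonzero pivot.
        piv = next(r for r in range(col, n) if m[r][col] != 0)
        m[col], m[piv] = m[piv], m[col]
        p = m[col][col]
        m[col] = [x / p for x in m[col]]
        # Eliminate the pivot column from every other row.
        for r in range(n):
            if r != col and m[r][col] != 0:
                f = m[r][col]
                m[r] = [x - f * y for x, y in zip(m[r], m[col])]
    return [row[n:] for row in m]


def selected_inverse(a):
    """Return {(i, j): (A^{-1})[i][j]} for positions where A^T is nonzero."""
    inv = invert(a)
    n = len(a)
    return {(i, j): inv[i][j]
            for i in range(n) for j in range(n)
            if a[j][i] != 0}  # (A^T)[i][j] equals A[j][i]


# Hypothetical non-symmetric example with 6 nonzeros.
A = [[2, 1, 0],
     [0, 3, 1],
     [1, 0, 4]]
sel = selected_inverse(A)
for (i, j), value in sorted(sel.items()):
    print(f"Ainv[{i}][{j}] = {value}")
```

In this example the position (0, 1) is not selected, because A[1][0] is zero and so (A^T)[0][1] is zero, even though the corresponding entry of the full inverse is nonzero.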