Fast Parallel Randomized QR with Column Pivoting Algorithms for Reliable Low-rank Matrix Approximations
This work provides efficient and reliable low-rank matrix approximation algorithms for large-scale scientific computing, addressing communication bottlenecks in parallel environments.
The paper presents randomized QR with column pivoting (RQRCP) algorithms that achieve reliability comparable to QRCP, with failure probabilities that decay exponentially in the oversampling size, and develops distributed memory implementations that significantly outperform ScaLAPACK QRCP. It also introduces spectrum-revealing QR factorizations for low-rank approximations, demonstrating superior reliability and efficiency over leading methods.
Factorizing large matrices by QR with column pivoting (QRCP) is substantially more expensive than QR without pivoting, owing to the communication costs required for pivoting decisions. In contrast, randomized QRCP (RQRCP) algorithms have proven empirically to be highly competitive in processing time with high-performance implementations of QR on uniprocessor and shared memory machines, and as reliable as QRCP in pivot quality. We show that RQRCP algorithms can be as reliable as QRCP, with failure probabilities that decay exponentially in the oversampling size. We also analyze the efficiency differences among RQRCP algorithms. More importantly, we develop distributed memory implementations of RQRCP that significantly outperform the QRCP implementations in ScaLAPACK. As a further development, we introduce the concept of spectrum-revealing QR factorizations for low-rank matrix approximations, develop algorithms for computing them, and demonstrate their effectiveness against leading low-rank approximation methods in both theoretical and numerical reliability and efficiency.
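To make the basic idea concrete, the following is a minimal single-pass sketch of randomized column pivoting, not the blocked or distributed algorithms developed in the paper. It assumes a Gaussian sketching matrix and an oversampling parameter, chooses pivots on the small compressed matrix, and then applies unpivoted QR to the selected columns of the original matrix. The names `greedy_column_pivots`, `rqrcp_lowrank`, `oversample`, and `seed` are hypothetical and introduced only for illustration.

```python
import numpy as np


def greedy_column_pivots(B, k):
    """Return a column permutation whose first k entries are greedy QRCP-style pivots of B."""
    R = np.array(B, dtype=float, copy=True)
    perm = np.arange(R.shape[1])
    for j in range(k):
        # Residual norms of the columns that have not been chosen yet.
        norms = np.linalg.norm(R[:, j:], axis=0)
        p = j + int(np.argmax(norms))
        if norms[p - j] == 0.0:
            break  # remaining columns are numerically zero
        # Swap the pivot column into position j.
        R[:, [j, p]] = R[:, [p, j]]
        perm[[j, p]] = perm[[p, j]]
        # Remove the pivot direction from the trailing columns.
        q = R[:, j] / np.linalg.norm(R[:, j])
        R[:, j + 1:] -= np.outer(q, q @ R[:, j + 1:])
    return perm


def rqrcp_lowrank(A, k, oversample=8, seed=0):
    """Rank-k approximation A[:, perm] ~= Q @ R with pivots chosen on a random sketch."""
    rng = np.random.default_rng(seed)
    m, _ = A.shape
    ell = min(k + oversample, m)          # sketch height: target rank plus oversampling
    Omega = rng.standard_normal((ell, m)) # assumed Gaussian sketching matrix
    B = Omega @ A                         # small (ell x n) compressed matrix
    perm = greedy_column_pivots(B, k)     # pivoting decisions made on the sketch only
    Q, _ = np.linalg.qr(A[:, perm[:k]])   # unpivoted QR of the selected columns
    R = Q.T @ A[:, perm]                  # k x n factor; leading k x k block is upper triangular
    return Q, R, perm
```

In this sketch the pivot search touches only the `ell x n` compressed matrix, so the expensive pivoting decisions never operate on the full matrix. A call such as `Q, R, perm = rqrcp_lowrank(A, k=5)` returns factors with `A[:, perm]` approximately equal to `Q @ R`.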