Residual-based Chebyshev filtered subspace iteration for sparse Hermitian eigenvalue problems tolerant to inexact matrix-vector products
This work targets the computational cost of large-scale eigenvalue problems in fields such as quantum chemistry; its contribution is incremental, since it builds on existing Chebyshev filtered subspace iteration methods.
The paper tackles the problem of computing eigenpairs of large matrices when matrix-vector products are inexact. It proposes R-ChFSI, a residual-based reformulation of Chebyshev filtered subspace iteration that converges robustly under such approximations and yields filtering speedups of up to 2.7× (2.1× for the full eigensolver) on GPU accelerators while meeting target tolerances of 10^-8.
Chebyshev Filtered Subspace Iteration (ChFSI) is widely used for computing a small subset of extremal eigenpairs from large matrices, particularly when the eigenpairs must be computed repeatedly as the system matrix evolves within an outer nonlinear iteration. In this work, we propose R-ChFSI, a residual-based reformulation that recasts the Chebyshev polynomial recurrence in terms of residuals rather than eigenvector estimates, which achieves robust convergence even when matrix--vector products are computed inexactly. We derive convergence guarantees under such approximations and show that R-ChFSI can naturally leverage (i)~the use of inexpensive approximate inverses for generalized eigenproblems of the form $\textbf{A} \textbf{x} = \lambda \textbf{B} \textbf{x}$, where exact factorizations of $\textbf{B}$ are prohibitively expensive, (ii)~low-precision arithmetic (FP32, TF32) for both standard and generalized eigenproblems, and (iii)~reduced-precision (BF16) inter-process communication in distributed sparse matrix--vector products. Controlled experiments on dense random matrices quantitatively verify the convergence bounds derived in this work and confirm the robustness of R-ChFSI to prescribed approximation errors for both standard and generalized eigenproblems. Large-scale experiments on finite-element discretized DFT generalized eigenproblems with up to 85 million grid points and 13,500 eigenpairs demonstrate that R-ChFSI achieves residual norms orders of magnitude below those of standard ChFSI when approximate inverses are employed, and reliably meets target tolerances of $10^{-8}$ even when employing reduced precision, yielding filtering speedups of up to $2.7{\times}$ ($2.1{\times}$ for the full eigensolver) on GPU accelerators.
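For orientation, the baseline that the abstract calls standard ChFSI can be sketched in a few lines for a dense Hermitian standard eigenproblem. This is a minimal illustration of the textbook scaled Chebyshev filter combined with Rayleigh--Ritz, not the residual-based R-ChFSI recurrence proposed in this work, whose details the abstract does not give. The function names (`chebyshev_filter`, `rayleigh_ritz`, `chfsi`), the default polynomial degree and iteration count, and the use of the matrix 1-norm as an upper spectral bound are all assumptions made for the example.

```python
import numpy as np


def chebyshev_filter(A, X, degree, a, b, a0):
    """Apply a scaled Chebyshev polynomial of the given degree (>= 1) to X.

    The polynomial damps spectral components in [a, b] (the unwanted part
    of the spectrum) and is scaled to equal 1 at a0 (an estimate of the
    lowest eigenvalue), which keeps the iterates from overflowing.
    """
    e = (b - a) / 2.0          # half-width of the damped interval
    c = (b + a) / 2.0          # centre of the damped interval
    sigma = e / (a0 - c)
    sigma1 = sigma
    Y = (A @ X - c * X) * (sigma1 / e)
    for _ in range(2, degree + 1):
        sigma_new = 1.0 / (2.0 / sigma1 - sigma)
        # Three-term Chebyshev recurrence on the shifted, scaled operator.
        Y_new = 2.0 * (sigma_new / e) * (A @ Y - c * Y) - (sigma * sigma_new) * X
        X, Y, sigma = Y, Y_new, sigma_new
    return Y


def rayleigh_ritz(A, X):
    """Project A onto span(X) (X orthonormal) and return Ritz pairs."""
    H = X.conj().T @ (A @ X)
    theta, Q = np.linalg.eigh(H)   # ascending Ritz values
    return theta, X @ Q


def chfsi(A, k, degree=10, iters=30, seed=0):
    """Approximate the k lowest eigenpairs of a dense Hermitian matrix A."""
    n = A.shape[0]
    rng = np.random.default_rng(seed)
    X, _ = np.linalg.qr(rng.standard_normal((n, k)))
    # Assumption: the 1-norm is used as a cheap upper bound on the spectrum.
    b = np.linalg.norm(A, 1)
    for _ in range(iters):
        theta, X = rayleigh_ritz(A, X)
        # Damp everything above the largest current Ritz value.
        X = chebyshev_filter(A, X, degree, a=theta[-1], b=b, a0=theta[0])
        X, _ = np.linalg.qr(X)     # re-orthonormalize the filtered block
    theta, X = rayleigh_ritz(A, X)
    residuals = np.linalg.norm(A @ X - X * theta, axis=0)
    return theta, X, residuals


if __name__ == "__main__":
    A = np.diag(np.arange(1.0, 51.0))   # toy matrix with eigenvalues 1..50
    theta, X, residuals = chfsi(A, k=6)
    print("Ritz values:", theta)
    print("residual norms:", residuals)
```

In this sketch every application of `A @ ...` inside the filter is exact. The setting described in the abstract is the one where those products are perturbed (approximate inverses of $\textbf{B}$, FP32/TF32 arithmetic, BF16 communication), which is where the residual-based formulation is claimed to retain convergence. The block size here equals the number of requested eigenpairs, so the highest few Ritz pairs typically converge more slowly than the lowest ones; practical implementations usually add buffer vectors.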