Mar 27

Scalable s-step Preconditioned Conjugate Gradient with Chebyshev Basis and Gauss-Seidel Gram Solve

arXiv:2603.09790 | 84.0 | h-index: 13
AI Analysis

This work addresses scalability challenges for linear solvers on modern accelerators, offering a stable alternative for high-performance computing applications, though it is incremental as it builds on existing s-step and preconditioning techniques.

The paper tackles the stability and scalability issues of s-step Preconditioned Conjugate Gradient methods with a variant that uses a Chebyshev-stabilized Krylov basis and Forward Gauss-Seidel iteration to solve the Gram systems. The variant achieves convergence comparable to classical CG while reducing synchronization overhead on GPU architectures.

We present a variant of the s-step Preconditioned Conjugate Gradient (PCG) method that combines a Chebyshev-stabilized Krylov basis with a Forward Gauss-Seidel (FGS) iteration for the solution of the reduced Gram systems. In s-step Conjugate Gradient, multiple search directions are generated per outer iteration, reducing global synchronization costs but requiring the solution of small dense Gram systems whose conditioning is critical for stability. We analyze the structure of the Chebyshev Gram matrix and show that its moment-based representation is associated with favorable conditioning properties for moderate step sizes. Building on inexact Krylov theory and on the classical equivalence between FGS and Modified Gram-Schmidt (MGS), we provide a structural analysis and theoretical rationale supporting the use of a small number of FGS sweeps, while preserving the convergence behavior observed in practical regimes. Large-scale experiments on modern NVIDIA GPU architectures demonstrate that the proposed Chebyshev-stabilized, Gauss-Seidel-enhanced s-step PCG achieves convergence comparable to classical CG while reducing synchronization overhead, making it a stable and scalable alternative for current and next-generation accelerator systems.
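The two ingredients named in the abstract can be illustrated with a small NumPy sketch. It is not the authors' implementation. It assumes a diagonal SPD test matrix with known extreme eigenvalues, no preconditioner, a Gram matrix of the form V^T A V, and a right-hand side V^T r. The function names (`monomial_basis`, `chebyshev_basis`, `forward_gauss_seidel`, `energy_error`, `demo`) are hypothetical and chosen for illustration only. The sketch compares the conditioning of the Gram matrix for a monomial and a Chebyshev basis, then applies a few Forward Gauss-Seidel sweeps to the Chebyshev Gram system.

```python
import numpy as np


def monomial_basis(A, v, s):
    """Columns v, A v, ..., A^(s-1) v (the unstabilized Krylov basis)."""
    V = np.empty((v.size, s))
    V[:, 0] = v
    for j in range(1, s):
        V[:, j] = A @ V[:, j - 1]
    return V


def chebyshev_basis(A, v, s, lam_min, lam_max):
    """Columns T_0(B) v, ..., T_(s-1)(B) v, where B is A shifted and
    scaled so that [lam_min, lam_max] maps onto [-1, 1]."""
    center = 0.5 * (lam_max + lam_min)
    half_width = 0.5 * (lam_max - lam_min)

    def apply_B(x):
        return (A @ x - center * x) / half_width

    V = np.empty((v.size, s))
    V[:, 0] = v
    if s > 1:
        V[:, 1] = apply_B(v)
    for j in range(2, s):
        # Three-term recurrence T_j = 2 x T_(j-1) - T_(j-2).
        V[:, j] = 2.0 * apply_B(V[:, j - 1]) - V[:, j - 2]
    return V


def forward_gauss_seidel(G, b, sweeps):
    """Run `sweeps` forward Gauss-Seidel sweeps on G y = b from y = 0."""
    y = np.zeros_like(b)
    for _ in range(sweeps):
        for i in range(b.size):
            # Entries y[:i] are already updated in this sweep.
            rest = G[i, :i] @ y[:i] + G[i, i + 1:] @ y[i + 1:]
            y[i] = (b[i] - rest) / G[i, i]
    return y


def energy_error(G, y, y_exact):
    """Error of y measured in the norm induced by the SPD matrix G."""
    e = y - y_exact
    return float(np.sqrt(e @ G @ e))


def demo(n=200, s=8):
    # Assumed test problem: diagonal SPD matrix, spectrum in [1, 100].
    lam = np.linspace(1.0, 100.0, n)
    A = np.diag(lam)
    r = np.ones(n) / np.sqrt(n)

    V_mono = monomial_basis(A, r, s)
    V_cheb = chebyshev_basis(A, r, s, lam[0], lam[-1])
    G_mono = V_mono.T @ A @ V_mono  # assumed Gram form V^T A V
    G_cheb = V_cheb.T @ A @ V_cheb

    b = V_cheb.T @ r
    y_exact = np.linalg.solve(G_cheb, b)
    errors = [
        energy_error(G_cheb, forward_gauss_seidel(G_cheb, b, k), y_exact)
        for k in (0, 1, 2, 4)
    ]
    return np.linalg.cond(G_mono), np.linalg.cond(G_cheb), errors


if __name__ == "__main__":
    cond_mono, cond_cheb, errors = demo()
    print("cond(G), monomial basis: ", cond_mono)
    print("cond(G), Chebyshev basis:", cond_cheb)
    print("energy error after 0, 1, 2, 4 FGS sweeps:", errors)
```

On an SPD Gram matrix each Gauss-Seidel sweep reduces the error in the energy norm, which is the property the sketch reports. How many sweeps are enough inside an actual s-step PCG iteration is the question the paper's analysis addresses, and this toy problem does not settle it.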
