Scaling to the stars -- a linearly scaling elliptic solver for $p$-multigrid
This work addresses the computational bottleneck of elliptic solvers in high-order methods for computational fluid dynamics, enabling efficient time-stepping for incompressible Navier-Stokes equations.
The authors derive a matrix-free explicit inverse for the static condensed operator in cuboidal subdomains, enabling a linearly scaling additive Schwarz smoother and a $p$-multigrid cycle with an operation count of $\mathcal{O}(n_{\mathrm{DOF}})$. The solver needs fewer than four iterations to reduce the residual by ten orders of magnitude, its runtime scales linearly with $n_{\mathrm{DOF}}$ for polynomial degrees at least up to $48$, and it stays below one microsecond per unknown over wide parameter ranges on a single CPU core.
High-order methods are gaining increasing attention in computational fluid dynamics. However, due to the time step restrictions arising from semi-implicit time stepping in the incompressible case, the potential advantage of these methods depends critically on efficient elliptic solvers. Because the operation counts of the operators scale with the polynomial degree $p$ times the number of degrees of freedom $n_{\mathrm{DOF}}$, the runtime of the best available multigrid solvers scales with $\mathcal{O}(p \cdot n_{\mathrm{DOF}})$. This additional factor of $p$ significantly limits the applicability of high-order methods at high polynomial degrees. While the operators for residual evaluation can be made to scale linearly when using static condensation, Schwarz-type smoothers require their inverses on fixed subdomains. No explicit inverse is known in the condensed case, and matrix-matrix multiplications scale with ${p \cdot n_{\mathrm{DOF}}}$. This paper derives a matrix-free explicit inverse for the static condensed operator in a cuboidal subdomain. It scales with $p^3$ per element, i.e. ${n_{\mathrm{DOF}}}$ globally, and allows for a linearly scaling additive Schwarz smoother, yielding a $p$-multigrid cycle with an operation count of $\mathcal{O}(n_{\mathrm{DOF}})$. The resulting solver needs fewer than four iterations for all polynomial degrees to reduce the residual by ten orders of magnitude and has a runtime scaling linearly with ${n_{\mathrm{DOF}}}$ for polynomial degrees at least up to $48$. Furthermore, the runtime is less than one microsecond per unknown over wide parameter ranges when using one core of a CPU, so that time stepping for the incompressible Navier-Stokes equations spends as much time on the explicitly treated convection terms as on the elliptic solvers.
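The scaling argument can be written out as a rough operation count. This is only a sketch under stated assumptions: a three-dimensional mesh of $n_e$ elements, each carrying about $p^3$ unknowns. The symbols $n_e$, $W_{\mathrm{std}}$ and $W_{\mathrm{cond}}$ are introduced here for illustration and do not appear in the text.

$$
\begin{aligned}
n_{\mathrm{DOF}} &\approx n_e \, p^3, \\
W_{\mathrm{std}} &\sim p \cdot n_{\mathrm{DOF}} \approx n_e \, p^4 \quad \text{(operators scaling with } p \cdot n_{\mathrm{DOF}}\text{)}, \\
W_{\mathrm{cond}} &\sim n_e \, p^3 \approx n_{\mathrm{DOF}} \quad \text{(explicit inverse at } p^3 \text{ per element)}.
\end{aligned}
$$

Under these estimates, doubling $p$ at fixed $n_{\mathrm{DOF}}$ doubles $W_{\mathrm{std}}$ but leaves $W_{\mathrm{cond}}$ unchanged, which is the sense in which the smoother and the resulting $p$-multigrid cycle scale linearly.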