COMP-PHIMLGNov 24, 2023

Differentiable and accelerated spherical harmonic and Wigner transforms

arXiv:2311.14670v314 citationsh-index: 12
Originality Highly original
AI Analysis

This work addresses the need for fast and differentiable transforms in fields like machine learning and scientific computing that handle spherical data, representing a strong specific gain rather than a foundational advancement.

The paper tackles the problem of efficiently computing spherical harmonic and Wigner transforms, which are essential for analyzing spherical data, by developing novel algorithmic structures that are both accelerated and differentiable. The result includes up to a 400-fold acceleration compared to alternative codes and near-optimal linear scaling on multiple GPUs, with computational errors at machine precision for certain samplings.

Many areas of science and engineering encounter data defined on spherical manifolds. Modelling and analysis of spherical data often necessitates spherical harmonic transforms, at high degrees, and increasingly requires efficient computation of gradients for machine learning or other differentiable programming tasks. We develop novel algorithmic structures for accelerated and differentiable computation of generalised Fourier transforms on the sphere $\mathbb{S}^2$ and rotation group $\text{SO}(3)$, i.e. spherical harmonic and Wigner transforms, respectively. We present a recursive algorithm for the calculation of Wigner $d$-functions that is both stable to high harmonic degrees and extremely parallelisable. By tightly coupling this with separable spherical transforms, we obtain algorithms that exhibit an extremely parallelisable structure that is well-suited for the high throughput computing of modern hardware accelerators (e.g. GPUs). We also develop a hybrid automatic and manual differentiation approach so that gradients can be computed efficiently. Our algorithms are implemented within the JAX differentiable programming framework in the S2FFT software code. Numerous samplings of the sphere are supported, including equiangular and HEALPix sampling. Computational errors are at the order of machine precision for spherical samplings that admit a sampling theorem. When benchmarked against alternative C codes we observe up to a 400-fold acceleration. Furthermore, when distributing over multiple GPUs we achieve very close to optimal linear scaling with increasing number of GPUs due to the highly parallelised and balanced nature of our algorithms. Provided access to sufficiently many GPUs our transforms thus exhibit an unprecedented effective linear time complexity.

Code Implementations2 repos
Foundations

The foundational work for this paper's niche, ranked by how specifically the neighbourhood builds on it — not by global fame.

Your Notes