NA · DS · MS · NA · Apr 29

Permutation-Avoiding FFT-Based Convolution

arXiv:2506.12718 · 11.5 · h-index: 27
Predicted impact: top 34% in NA · last 90 days
Originality: Incremental advance
AI Analysis

For developers of FFT libraries and users of convolution in scientific computing, this work proposes a practical optimization that improves performance by eliminating costly memory permutations.

The paper identifies that index-reversal permutations in FFT-based convolution degrade arithmetic intensity and shows that these permutations cancel out, enabling permutation-free implementations. Numerical experiments demonstrate performance improvements over state-of-the-art FFT-based convolution, suggesting that FFT libraries should adopt such kernels.

Fast Fourier Transform (FFT) libraries are widely used for evaluating discrete convolutions. Most FFT implementations follow some variant of the Cooley-Tukey framework, in which the transform is decomposed into butterfly operations and index-reversal permutations. While butterfly operations dominate the floating-point operation count, the memory access patterns induced by index-reversal permutations significantly degrade the FFT's arithmetic intensity. When performing discrete convolution, the three sets of index-reversal permutations which occur in FFT-based implementations using Cooley-Tukey frameworks cancel out, thus paving the way to implementations free of any permutation. To the best of our knowledge, such permutation-free variants of FFT-based discrete convolution are not commonly used in practice, making such kernels worth investigating. Here, we look into such permutation-avoiding convolution procedures for multi-dimensional cases within a general radix Cooley-Tukey framework. We perform numerical experiments to benchmark the algorithms presented against state-of-the-art FFT-based convolution implementations. Our results suggest that developers of FFT libraries should consider supporting permutation-avoiding convolution kernels.
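The cancellation described in the abstract can be illustrated in the simplest setting: one dimension, radix 2, power-of-two length. The sketch below is not the paper's multi-dimensional, general-radix algorithm, and the function names (`dif_forward`, `dit_inverse`, `circular_convolve`) are hypothetical. It assumes a decimation-in-frequency forward transform that leaves its output in bit-reversed order and a decimation-in-time inverse that accepts bit-reversed input, so the pointwise product can be formed without ever reordering memory.

```python
import cmath


def dif_forward(a):
    """In-place radix-2 decimation-in-frequency FFT.

    Input is in natural order; output is left in bit-reversed order
    (the final reordering pass is deliberately skipped).
    """
    n = len(a)
    half = n // 2
    while half >= 1:
        w_step = cmath.exp(-1j * cmath.pi / half)  # exp(-2*pi*i / (2*half))
        for start in range(0, n, 2 * half):
            w = 1.0 + 0.0j
            for k in range(half):
                u = a[start + k]
                v = a[start + k + half]
                a[start + k] = u + v
                a[start + k + half] = (u - v) * w
                w *= w_step
        half //= 2


def dit_inverse(a):
    """In-place radix-2 decimation-in-time inverse FFT (unnormalized).

    Input is expected in bit-reversed order; output is in natural order
    (the initial reordering pass is deliberately skipped).
    """
    n = len(a)
    half = 1
    while half < n:
        w_step = cmath.exp(1j * cmath.pi / half)  # exp(+2*pi*i / (2*half))
        for start in range(0, n, 2 * half):
            w = 1.0 + 0.0j
            for k in range(half):
                u = a[start + k]
                v = a[start + k + half] * w
                a[start + k] = u + v
                a[start + k + half] = u - v
                w *= w_step
        half *= 2


def circular_convolve(x, y):
    """Circular convolution of two equal-length sequences (length a power of 2)."""
    n = len(x)
    if n != len(y) or n == 0 or n & (n - 1):
        raise ValueError("inputs must have the same power-of-two length")
    fx = [complex(v) for v in x]
    fy = [complex(v) for v in y]
    dif_forward(fx)
    dif_forward(fy)
    # Both spectra share the same bit-reversed ordering, so the pointwise
    # product pairs up matching frequencies without any permutation.
    prod = [p * q for p, q in zip(fx, fy)]
    dit_inverse(prod)
    return [v / n for v in prod]
```

Convolving with a shifted unit impulse, for example `circular_convolve([1, 2, 3, 4], [0, 1, 0, 0])`, should return a cyclic shift of the first input up to floating-point rounding. The pure-Python loops are for readability only; any performance benefit of the kind the paper reports would come from a compiled kernel.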

Code Implementations: 1 repo
