LGDSMay 25, 2023

Sliding Window Sum Algorithms for Deep Neural Networks

arXiv:2305.16513v13 citations
Originality Incremental advance
AI Analysis

This work addresses computational bottlenecks in deep learning for researchers and practitioners, offering incremental improvements in efficiency for specific operations.

The paper tackles the problem of improving computational efficiency in deep neural networks by applying sliding window sum algorithms to convolution and pooling operations, achieving speedups of O(P/w) and O(P/log(w)) and demonstrating that these kernels outperform traditional GEMM kernels on CPUs and even some GPU counterparts.

Sliding window sums are widely used for string indexing, hashing and time series analysis. We have developed a family of the generic vectorized sliding sum algorithms that provide speedup of O(P/w) for window size $w$ and number of processors P. For a sum with a commutative operator the speedup is improved to O(P/log(w)). Even more important, our algorithms exhibit efficient memory access patterns. In this paper we study the application of the sliding sum algorithms to the training and inference of the Deep Neural Networks. We demonstrate how both pooling and convolution primitives could be expressed as sliding sums and evaluated by the compute kernels with the shared structure. We show that the sliding sum convolution kernels are more efficient than the commonly used GEMM kernels on the CPU, and could even outperform their GPU counterparts.

Foundations

The foundational work for this paper's niche, ranked by how specifically the neighbourhood builds on it — not by global fame.

Your Notes