LG | CV | ML | Jun 26, 2019

Accelerating Large-Kernel Convolution Using Summed-Area Tables

arXiv:1906.11367v1 | 2 citations
Originality: Incremental advance
AI Analysis

This work targets the computational cost of large receptive fields, which matters to computer vision researchers and practitioners, particularly in human pose estimation. It is an incremental advance because it builds on existing convolution methods.

The paper tackles the high computational cost of large-kernel convolutions in dense prediction tasks. It uses learnable box filters to keep the number of parameters per filter constant, and summed-area tables to make the cost of convolution independent of filter size, achieving competitive performance on human pose estimation benchmarks.

Expanding the receptive field to capture large-scale context is key to obtaining good performance in dense prediction tasks, such as human pose estimation. While many state-of-the-art fully-convolutional architectures enlarge the receptive field by reducing resolution using strided convolution or pooling layers, the most straightforward strategy is adopting large filters. This, however, is costly because of the quadratic increase in the number of parameters and multiply-add operations. In this work, we explore using learnable box filters to allow for convolution with arbitrarily large kernel size, while keeping the number of parameters per filter constant. In addition, we use precomputed summed-area tables to make the computational cost of convolution independent of the filter size. We adapt and incorporate the box filter as a differentiable module in a fully-convolutional neural network, and demonstrate its competitive performance on popular benchmarks for the task of human pose estimation.
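To illustrate why a summed-area table makes box filtering independent of the filter size, here is a minimal NumPy sketch. It is not the paper's differentiable module: it assumes fixed, integer box extents, a single 2D channel, and windows clipped at the image border, and the function names (`summed_area_table`, `box_sum`, `box_filter`) are hypothetical. Once the table is built, each output pixel costs four lookups regardless of how large the box is.

```python
import numpy as np


def summed_area_table(image):
    """Return a zero-padded summed-area table of shape (H + 1, W + 1).

    Entry [i, j] holds the sum of image[:i, :j].
    """
    height, width = image.shape
    sat = np.zeros((height + 1, width + 1), dtype=np.float64)
    sat[1:, 1:] = image.cumsum(axis=0).cumsum(axis=1)
    return sat


def box_sum(sat, top, left, bottom, right):
    """Sum of image[top:bottom, left:right] from four table lookups."""
    return sat[bottom, right] - sat[top, right] - sat[bottom, left] + sat[top, left]


def box_filter(image, half_height, half_width):
    """Sum each pixel's (2*half_height+1) x (2*half_width+1) neighbourhood.

    Assumption for this sketch: windows are clipped at the image border.
    """
    height, width = image.shape
    sat = summed_area_table(image)
    rows = np.arange(height)
    cols = np.arange(width)
    # Half-open window bounds per row and column, clipped to the image.
    top = np.clip(rows - half_height, 0, height)[:, None]
    bottom = np.clip(rows + half_height + 1, 0, height)[:, None]
    left = np.clip(cols - half_width, 0, width)[None, :]
    right = np.clip(cols + half_width + 1, 0, width)[None, :]
    # Four lookups per output pixel, independent of the box size.
    return sat[bottom, right] - sat[top, right] - sat[bottom, left] + sat[top, left]
```

For example, `box_filter(image, 50, 50)` performs the same number of lookups per pixel as `box_filter(image, 1, 1)`, whereas a dense 101 x 101 kernel would need 10,201 multiply-adds per pixel compared with 9 for a 3 x 3 kernel.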
