CV, Sep 16, 2021

Dense Pruning of Pointwise Convolutions in the Frequency Domain

arXiv:2109.07707v1, 3 citations
Originality: Incremental advance
AI Analysis

This work improves the efficiency of convolutional neural networks, particularly in resource-constrained settings, by unifying two methods that previously seemed incompatible. It is an incremental advance because it builds on existing work such as MobileNetV2.

The paper tackles the incompatibility between depthwise separable convolutions and frequency-domain convolutions by transforming activations instead of kernels, enabling pointwise convolutions to be computed in the frequency domain with selective pruning. This approach reduces computation time by 22% with less than 1% accuracy degradation when applied to MobileNetV2.
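The commutation property behind this summary can be checked numerically. The following is a minimal sketch, not the authors' code: it assumes an orthonormal DCT-II over the spatial axes, and the helper names `dct_matrix`, `dct2`, and `pointwise` are hypothetical. Because a 1x1 convolution only mixes channels and the DCT only mixes spatial positions, applying them in either order should give the same result.

```python
import numpy as np

def dct_matrix(n):
    """Orthonormal DCT-II matrix of size n x n (hypothetical helper)."""
    k = np.arange(n)[:, None]   # frequency index
    i = np.arange(n)[None, :]   # spatial index
    m = np.sqrt(2.0 / n) * np.cos(np.pi * (i + 0.5) * k / n)
    m[0, :] /= np.sqrt(2.0)     # rescale the DC row so rows are orthonormal
    return m

def dct2(x):
    """2D DCT over the last two (spatial) axes of a (C, H, W) array."""
    ch, cw = dct_matrix(x.shape[-2]), dct_matrix(x.shape[-1])
    return np.einsum("uh,chw,vw->cuv", ch, x, cw)

def pointwise(w, x):
    """1x1 convolution: mixes channels independently at every position."""
    return np.einsum("oc,chw->ohw", w, x)

rng = np.random.default_rng(0)
x = rng.standard_normal((8, 14, 14))   # activations, shape (C_in, H, W)
w = rng.standard_normal((16, 8))       # pointwise weights, shape (C_out, C_in)

lhs = dct2(pointwise(w, x))   # convolve in the spatial domain, then transform
rhs = pointwise(w, dct2(x))   # transform the activations, then convolve
commutes = np.allclose(lhs, rhs)
```

Here `commutes` should be true up to floating-point tolerance, which is why the pointwise weights can be applied to transformed activations without modification.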

Depthwise separable convolutions and frequency-domain convolutions are two recent ideas for building efficient convolutional neural networks. They are seemingly incompatible: the vast majority of operations in depthwise separable CNNs are in pointwise convolutional layers, but pointwise layers use 1x1 kernels, which do not benefit from frequency transformation. This paper unifies these two ideas by transforming the activations, not the kernels. Our key insights are that 1) pointwise convolutions commute with frequency transformation and thus can be computed in the frequency domain without modification, 2) each channel within a given layer has a different level of sensitivity to frequency domain pruning, and 3) each channel's sensitivity to frequency pruning is approximately monotonic with respect to frequency. We leverage this knowledge by proposing a new technique which wraps each pointwise layer in a discrete cosine transform (DCT) which is truncated to selectively prune coefficients above a given threshold as per the needs of each channel. To learn which frequencies should be pruned from which channels, we introduce a novel learned parameter which specifies each channel's pruning threshold. We add a new regularization term which incentivizes the model to decrease the number of retained frequencies while still maintaining task accuracy. Unlike weight pruning techniques which rely on sparse operators, our contiguous frequency band pruning results in fully dense computation. We apply our technique to MobileNetV2 and in the process reduce computation time by 22% and incur <1% accuracy degradation.
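To make the wrapping idea concrete, here is a simplified sketch of a pointwise layer sandwiched between a truncated DCT and its inverse. It is an illustration under stated assumptions rather than the paper's implementation: the function `pruned_pointwise` and the integer `keep` cutoffs are hypothetical, the feature map is assumed square, the cutoffs are fixed rather than learned, and the regularization term is omitted. Per-channel thresholds are emulated by zeroing coefficients inside a dense block sized by the largest cutoff, so every operation stays a dense matrix product.

```python
import numpy as np

def dct_matrix(n):
    """Orthonormal DCT-II matrix of size n x n (hypothetical helper)."""
    k = np.arange(n)[:, None]
    i = np.arange(n)[None, :]
    m = np.sqrt(2.0 / n) * np.cos(np.pi * (i + 0.5) * k / n)
    m[0, :] /= np.sqrt(2.0)
    return m

def pruned_pointwise(x, w, keep):
    """Pointwise convolution on a truncated DCT of the activations.

    x:    (C_in, N, N) activations (square maps assumed for brevity)
    w:    (C_out, C_in) pointwise weights
    keep: (C_in,) integer cutoffs; channel c retains only frequencies
          with both indices below keep[c] (a stand-in for learned thresholds)
    """
    keep = np.asarray(keep)
    n = x.shape[-1]
    k = int(keep.max())                  # dense block covers the largest cutoff
    d = dct_matrix(n)[:k]                # truncated transform, shape (k, n)

    # Forward truncated DCT: only the low-frequency k x k block is computed.
    x_hat = np.einsum("uh,chw,vw->cuv", d, x, d)

    # Zero the coefficients each channel prunes (contiguous high band).
    idx = np.arange(k)
    below = idx[None, :] < keep[:, None]             # (C_in, k)
    x_hat = x_hat * (below[:, :, None] & below[:, None, :])

    # Dense channel mixing on k*k coefficients instead of n*n positions.
    y_hat = np.einsum("oc,cuv->ouv", w, x_hat)

    # Inverse transform back to the spatial domain (zero-padded high band).
    return np.einsum("uh,ouv,vw->ohw", d, y_hat, d)

rng = np.random.default_rng(0)
x = rng.standard_normal((4, 8, 8))
w = rng.standard_normal((6, 4))
y_pruned = pruned_pointwise(x, w, keep=[8, 4, 2, 1])
```

With every cutoff equal to the map size, the sketch reproduces an ordinary pointwise convolution. Smaller cutoffs shrink the channel-mixing work roughly in proportion to the number of retained coefficients, at the cost of discarding high-frequency detail.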

