CV | Mar 15, 2021

Revisiting Dynamic Convolution via Matrix Decomposition

arXiv:2103.08756v1 | 83 citations | Has Code
Originality: Incremental advance
AI Analysis

This work addresses a specific bottleneck in efficient CNNs for computer vision, offering an incremental improvement over existing dynamic convolution methods.

The paper tackles the parameter inefficiency and optimization difficulty of dynamic convolution by proposing dynamic channel fusion, which reduces the parameter count by a factor of 2.5 while maintaining accuracy on ImageNet.

Recent research in dynamic convolution shows a substantial performance boost for efficient CNNs, due to the adaptive aggregation of K static convolution kernels. Dynamic convolution has two limitations: (a) it increases the number of convolutional weights K times, and (b) the joint optimization of dynamic attention and static convolution kernels is challenging. In this paper, we revisit it from the new perspective of matrix decomposition and reveal that the key issue is that dynamic convolution applies dynamic attention over channel groups after projecting into a higher-dimensional latent space. To address this issue, we propose dynamic channel fusion to replace dynamic attention over channel groups. Dynamic channel fusion not only enables a significant dimension reduction of the latent space, but also mitigates the joint optimization difficulty. As a result, our method is easier to train and requires significantly fewer parameters without sacrificing accuracy. Source code is at https://github.com/liyunsheng13/dcd.
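The two ideas in the abstract can be illustrated with a minimal pure-Python sketch. It is not taken from the linked repository. It assumes 1x1 kernels stored as C x C nested lists, and it assumes that dynamic channel fusion takes the simplified form W(x) = W0 + P Φ(x) Qᵀ with a small L x L input-dependent matrix Φ(x); that form is one reading of the decomposition view and leaves out any further terms the paper may use. The names `aggregate_kernels`, `fused_kernel`, `C`, `K`, and `L` are hypothetical, and the attention weights and Φ(x) are random stand-ins for the outputs of the small networks that would normally compute them from the input.

```python
import math
import random


def softmax(scores):
    """Turn raw scores into attention weights that sum to 1."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def matmul(a, b):
    """Plain nested-list matrix product."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]


def transpose(a):
    return [list(col) for col in zip(*a)]


def rand_matrix(rows, cols, rng):
    return [[rng.uniform(-1, 1) for _ in range(cols)] for _ in range(rows)]


def aggregate_kernels(kernels, attention):
    """Dynamic convolution: W(x) = sum_k pi_k(x) * W_k over K static kernels."""
    rows, cols = len(kernels[0]), len(kernels[0][0])
    return [[sum(p * w[i][j] for p, w in zip(attention, kernels))
             for j in range(cols)] for i in range(rows)]


def fused_kernel(w0, p, phi, q):
    """Assumed fusion form: W(x) = W0 + P @ Phi(x) @ Q^T, with Phi(x) of size L x L."""
    delta = matmul(matmul(p, phi), transpose(q))
    return [[w0[i][j] + delta[i][j] for j in range(len(w0[0]))]
            for i in range(len(w0))]


C, K, L = 16, 4, 4  # channels, number of static kernels, latent dimension (illustrative)
rng = random.Random(0)

# Dynamic convolution: K full kernels mixed by input-dependent attention.
kernels = [rand_matrix(C, C, rng) for _ in range(K)]
attention = softmax([rng.uniform(-1, 1) for _ in range(K)])  # stand-in for pi(x)
w_dynamic = aggregate_kernels(kernels, attention)

# Dynamic channel fusion: one static kernel plus a low-dimensional dynamic term.
w0 = rand_matrix(C, C, rng)
p = rand_matrix(C, L, rng)
q = rand_matrix(C, L, rng)
phi = rand_matrix(L, L, rng)  # stand-in for Phi(x)
w_fused = fused_kernel(w0, p, phi, q)

# Static convolution weights only; the small networks that would produce
# pi(x) and Phi(x) are not counted here.
params_dynamic = K * C * C
params_fused = C * C + 2 * C * L
print(params_dynamic, params_fused)  # prints "1024 384"
```

With these illustrative sizes the aggregated kernel needs K times the weights of a single kernel, while the fused form adds only the two C x L projection matrices on top of one static kernel. The specific ratio depends entirely on the chosen C, K, and L and should not be read as a figure from the paper.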

Code Implementations: 1 repo
