CV · Jul 21, 2022

Efficient CNN Architecture Design Guided by Visualization

arXiv:2207.10318v1 · 7 citations · h-index: 18
Originality: Incremental advance
AI Analysis

This work addresses efficiency bottlenecks in CNN design for computer vision applications, offering incremental improvements over existing methods.

The paper improves parameter efficiency and inference speed in CNNs through design guidelines derived from visualizing feature maps and convolution kernels. The resulting VGNetG achieves better accuracy and lower latency than previous networks with about 30% to 50% fewer parameters; for example, VGNetG-1.0MP reaches 67.7% top-1 accuracy on ImageNet with 0.99M parameters.

Modern efficient Convolutional Neural Networks (CNNs) typically use Depthwise Separable Convolutions (DSCs) and Neural Architecture Search (NAS) to reduce the number of parameters and the computational complexity, but some inherent characteristics of networks are overlooked. Inspired by visualizing feature maps and $N \times N$ ($N > 1$) convolution kernels, several guidelines are introduced in this paper to further improve parameter efficiency and inference speed. Based on these guidelines, our parameter-efficient CNN architecture, called VGNetG, achieves better accuracy and lower latency than previous networks with about 30% to 50% fewer parameters. Our VGNetG-1.0MP achieves 67.7% top-1 accuracy with 0.99M parameters and 69.2% top-1 accuracy with 1.14M parameters on the ImageNet classification dataset. Furthermore, we demonstrate that edge detectors can replace learnable depthwise convolution layers to mix features by replacing the $N \times N$ kernels with fixed edge detection kernels. Our VGNetF-1.5MP achieves 64.4% (-3.2%) top-1 accuracy, and 66.2% (-1.4%) top-1 accuracy with additional Gaussian kernels.
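To make the idea of fixed edge-detection kernels in the depthwise step more concrete, here is a minimal pure-Python sketch. It is not the paper's implementation. The choice of Sobel kernels, the round-robin assignment of kernels to channels, and the names `conv2d_single`, `fixed_depthwise`, and `dsc_param_count` are all assumptions made for illustration. The sketch filters each channel independently with a non-learnable kernel and counts how many learnable weights remain in a depthwise separable convolution when the depthwise kernels are fixed.

```python
# Illustrative sketch only: kernel choice and function names are assumptions,
# not taken from the paper or its code release.

# Classic 3x3 Sobel edge detectors (horizontal and vertical gradients).
SOBEL_X = [[-1, 0, 1],
           [-2, 0, 2],
           [-1, 0, 1]]
SOBEL_Y = [[-1, -2, -1],
           [0, 0, 0],
           [1, 2, 1]]


def conv2d_single(channel, kernel):
    """Cross-correlate one 2-D channel with one k x k kernel.

    Uses zero padding and stride 1, so the output has the input's shape.
    """
    k = len(kernel)
    pad = k // 2
    h, w = len(channel), len(channel[0])
    out = [[0.0] * w for _ in range(h)]
    for i in range(h):
        for j in range(w):
            acc = 0.0
            for di in range(k):
                for dj in range(k):
                    y, x = i + di - pad, j + dj - pad
                    if 0 <= y < h and 0 <= x < w:  # outside the map counts as zero
                        acc += channel[y][x] * kernel[di][dj]
            out[i][j] = acc
    return out


def fixed_depthwise(feature_map, kernels):
    """Depthwise step with fixed kernels.

    Channel c is filtered only by kernels[c % len(kernels)]; channels are
    never mixed here (that is left to a following 1x1 pointwise convolution).
    """
    return [conv2d_single(ch, kernels[c % len(kernels)])
            for c, ch in enumerate(feature_map)]


def dsc_param_count(c_in, c_out, k, fixed_depthwise_kernels=False):
    """Learnable weights (biases ignored) in a depthwise separable convolution."""
    depthwise = 0 if fixed_depthwise_kernels else c_in * k * k
    pointwise = c_in * c_out  # 1x1 convolution that mixes channels
    return depthwise + pointwise


if __name__ == "__main__":
    # Two identical channels containing a vertical edge between columns 1 and 2.
    edge = [[0, 0, 1, 1] for _ in range(4)]
    out = fixed_depthwise([edge, edge], [SOBEL_X, SOBEL_Y])
    print(out[0][1][1])  # Sobel-X response at an interior pixel next to the edge
    print(out[1][1][1])  # Sobel-Y response at the same pixel
    print(dsc_param_count(32, 64, 3), dsc_param_count(32, 64, 3, True))
```

Under these assumptions, a 32-to-64-channel layer with 3×3 kernels keeps 2048 pointwise weights and drops the 288 depthwise weights when the kernels are fixed. The saving is small per layer because the pointwise convolution dominates the parameter count.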

Code Implementations: 1 repo
Foundations

The foundational work for this paper's niche, ranked by how specifically the neighbourhood builds on it — not by global fame.
