CV · Nov 16, 2016

On the Exploration of Convolutional Fusion Networks for Visual Recognition

arXiv:1611.05503v1 · 25 citations
Originality: Incremental advance
AI Analysis

This work addresses the parameter cost and weak fusion modules of multi-scale deep representations in visual recognition. It is an incremental improvement with consistent gains across multiple tasks.

The paper tackles inefficient multi-scale deep representations in visual recognition by proposing convolutional fusion networks (CFN). CFN uses 1x1 convolution and global average pooling to generate side branches while adding few parameters, and reports improvements over plain CNNs on the CIFAR and ImageNet datasets.

Despite recent advances in multi-scale deep representations, their limitations are attributed to expensive parameters and weak fusion modules. Hence, we propose an efficient approach to fuse multi-scale deep representations, called convolutional fusion networks (CFN). By using 1$\times$1 convolution and global average pooling, CFN can efficiently generate the side branches while adding few parameters. In addition, we present a locally-connected fusion module, which can learn adaptive weights for the side branches and form a discriminatively fused feature. CFN models trained on the CIFAR and ImageNet datasets demonstrate remarkable improvements over the plain CNNs. Furthermore, we generalize CFN to three new tasks: scene recognition, fine-grained recognition and image retrieval. Our experiments show that it obtains consistent improvements on these transfer tasks.
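The side-branch and fusion steps described in the abstract can be sketched in a few lines of plain Python. This is a minimal illustration, not the authors' implementation. It assumes feature maps stored as nested lists of shape [channels][height][width], and a fusion module that holds one weight per branch per channel (no weight sharing across channels). The function names and toy sizes are hypothetical, and the nonlinearities and training are omitted.

```python
import random


def conv1x1(feature_map, weights):
    """1x1 convolution: mix channels independently at every spatial position.

    feature_map: [C_in][H][W], weights: [C_out][C_in] -> returns [C_out][H][W].
    """
    c_in = len(feature_map)
    h, w = len(feature_map[0]), len(feature_map[0][0])
    return [
        [[sum(weights[o][c] * feature_map[c][i][j] for c in range(c_in))
          for j in range(w)]
         for i in range(h)]
        for o in range(len(weights))
    ]


def global_average_pool(feature_map):
    """Collapse each channel of a [C][H][W] map to its spatial mean."""
    return [sum(sum(row) for row in ch) / (len(ch) * len(ch[0]))
            for ch in feature_map]


def side_branch(feature_map, weights):
    """Side branch: 1x1 convolution followed by global average pooling.

    The only parameters added are the C_out x C_in mixing weights.
    """
    return global_average_pool(conv1x1(feature_map, weights))


def locally_connected_fusion(branches, fusion_weights):
    """Fuse K branch vectors of length D with a separate weight per
    (branch, channel) pair. Assumed form of the fusion module."""
    k, d = len(branches), len(branches[0])
    return [sum(fusion_weights[b][i] * branches[b][i] for b in range(k))
            for i in range(d)]


# Toy usage: two intermediate layers with different channel counts and
# spatial sizes are mapped to the same dimension D, then fused.
random.seed(0)
D = 3
shallow = [[[random.random() for _ in range(8)] for _ in range(8)] for _ in range(4)]
deep = [[[random.random() for _ in range(4)] for _ in range(4)] for _ in range(6)]
w_shallow = [[random.uniform(-1, 1) for _ in range(4)] for _ in range(D)]
w_deep = [[random.uniform(-1, 1) for _ in range(6)] for _ in range(D)]

branches = [side_branch(shallow, w_shallow), side_branch(deep, w_deep)]
fusion_w = [[0.5] * D for _ in branches]  # would be learned in practice
fused = locally_connected_fusion(branches, fusion_w)
print(len(fused))  # one fused value per channel
```

Because global average pooling removes the spatial dimensions, layers of different resolution yield vectors of the same length. Under these assumptions the fusion step then needs only K x D weights for K branches.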

