CV, Apr 22, 2019

Stochastic Region Pooling: Make Attention More Expressive

Mingnan Luo, Guihua Wen, Yang Hu, Dan Dai, Yingxue Xu

arXiv:1904.09853v1, 3.48 citations

Originality: Incremental advance

AI Analysis

This work addresses performance limitations in attention mechanisms for computer vision, offering a parameter-free improvement for efficient CNNs, though it is incremental in nature.

The paper tackles the problem of channel descriptor homogeneity in attention mechanisms by proposing Stochastic Region Pooling (SRP), which enhances detail distinction and achieves state-of-the-art results on image recognition datasets like CIFAR-10/100 and ImageNet.

Global Average Pooling (GAP) is used by default in channel-wise attention mechanisms to extract channel descriptors. However, the simple global aggregation performed by GAP tends to make the channel descriptors homogeneous, which weakens the distinction of details between feature maps and thus degrades the performance of the attention mechanism. In this work, we propose a novel method for channel-wise attention networks, called Stochastic Region Pooling (SRP), which makes the channel descriptors more representative and diverse by encouraging the feature maps to have more or wider important feature responses. SRP is also a general method for attention mechanisms that requires no additional parameters or computation, and it can be widely applied to attention networks without modifying the network structure. Experimental results on image recognition datasets including CIFAR-10/100, ImageNet, and three fine-grained datasets (CUB-200-2011, Stanford Cars, and Stanford Dogs) show that SRP brings significant performance improvements over efficient CNNs and achieves state-of-the-art results.
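To make the contrast between GAP and region-based pooling concrete, here is a minimal NumPy sketch. The abstract does not say how regions are chosen, so the scheme below is an assumption for illustration only: one randomly placed rectangular region per forward pass during training, with a fall-back to plain GAP at inference. The function names and the `region_ratio` parameter are hypothetical and are not taken from the paper.

```python
import numpy as np


def global_average_pool(features):
    """Channel descriptors via GAP: (C, H, W) -> (C,)."""
    return features.mean(axis=(1, 2))


def stochastic_region_pool(features, region_ratio=0.5, rng=None, training=True):
    """Average each channel over one randomly placed region.

    Assumed scheme (not confirmed by the abstract): a single region
    covering `region_ratio` of each spatial side is sampled in training;
    at inference the full map is averaged, i.e. plain GAP.
    """
    if not training:
        return global_average_pool(features)
    rng = np.random.default_rng() if rng is None else rng
    _, h, w = features.shape
    rh = max(1, int(round(h * region_ratio)))
    rw = max(1, int(round(w * region_ratio)))
    # Sample the top-left corner so the region stays inside the map.
    top = rng.integers(0, h - rh + 1)
    left = rng.integers(0, w - rw + 1)
    region = features[:, top:top + rh, left:left + rw]
    return region.mean(axis=(1, 2))


# Toy feature maps: 4 channels of size 8x8.
features = np.random.default_rng(0).random((4, 8, 8))
gap_desc = global_average_pool(features)
srp_desc = stochastic_region_pool(features, rng=np.random.default_rng(1))
```

Both functions return one descriptor per channel with the same shape, so in this sketch the pooling step can be swapped without touching the rest of an attention module, which is consistent with the abstract's claim that no extra parameters or structural changes are needed.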
