Multi-view Feature Augmentation with Adaptive Class Activation Mapping
This work addresses robustness in image classification models, though the contribution appears incremental: it extends a standard feature extraction step, supplementing the single global view of global average pooling with sampled local views.
The paper addresses limited feature diversity in image classification by proposing a multi-view feature augmentation module that extracts and ensembles diverse local features; the authors report consistent and noticeable performance gains.
We propose an end-to-end trainable feature augmentation module for image classification that extracts and exploits multi-view local features to boost model performance. Unlike global average pooling (GAP), which extracts vectorized features from the global view only, our module samples and ensembles diverse multi-view local features to improve model robustness. To sample class-representative local features, we incorporate a simple auxiliary classifier head (comprising a single 1$\times$1 convolutional layer) that efficiently and adaptively attends to class-discriminative local regions of the feature maps via our proposed AdaCAM (Adaptive Class Activation Mapping). Extensive experiments demonstrate consistent and noticeable performance gains from our multi-view feature augmentation module.
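The idea of sampling local features guided by a 1$\times$1 convolutional head can be illustrated with a minimal NumPy sketch. This is not the AdaCAM procedure itself, whose details the abstract does not give; it is a generic class-activation-map variant under stated assumptions. The function names, the softmax sampling rule, the choice of the class predicted by the auxiliary head, and the averaging of per-view logits are all hypothetical choices made for illustration.

```python
import numpy as np


def gap(feature_map):
    """Global average pooling: (C, H, W) -> (C,)."""
    return feature_map.mean(axis=(1, 2))


def class_activation_maps(feature_map, weight, bias):
    """Apply a 1x1 convolution, i.e. a linear classifier at every location.

    feature_map: (C, H, W), weight: (K, C), bias: (K,) -> maps of shape (K, H, W).
    """
    return np.einsum("kc,chw->khw", weight, feature_map) + bias[:, None, None]


def sample_local_features(feature_map, cam, num_views, rng):
    """Sample `num_views` distinct locations with probability softmax(cam).

    Assumption: softmax sampling is an illustrative rule, not the paper's.
    Returns local feature vectors of shape (num_views, C).
    """
    c, h, w = feature_map.shape
    logits = cam.reshape(-1)
    probs = np.exp(logits - logits.max())  # subtract max for numerical stability
    probs /= probs.sum()
    idx = rng.choice(h * w, size=num_views, replace=False, p=probs)
    return feature_map.reshape(c, -1)[:, idx].T


def multi_view_logits(feature_map, aux_w, aux_b, cls_w, cls_b, num_views=4, seed=0):
    """Ensemble the global (GAP) view with sampled local views.

    Assumption: views are combined by averaging their logits.
    """
    rng = np.random.default_rng(seed)
    cams = class_activation_maps(feature_map, aux_w, aux_b)
    # Use the class the auxiliary head scores highest on average over space.
    k = int(cams.mean(axis=(1, 2)).argmax())
    local = sample_local_features(feature_map, cams[k], num_views, rng)
    views = np.vstack([gap(feature_map)[None, :], local])  # (1 + num_views, C)
    logits = views @ cls_w.T + cls_b                        # (1 + num_views, K)
    return logits.mean(axis=0)                              # (K,)


# Usage with random stand-in data: 8 channels, a 5x5 feature map, 3 classes.
demo_rng = np.random.default_rng(0)
demo_fmap = demo_rng.normal(size=(8, 5, 5))
demo_aux_w, demo_aux_b = demo_rng.normal(size=(3, 8)), demo_rng.normal(size=3)
demo_cls_w, demo_cls_b = demo_rng.normal(size=(3, 8)), demo_rng.normal(size=3)
demo_logits = multi_view_logits(demo_fmap, demo_aux_w, demo_aux_b, demo_cls_w, demo_cls_b)
```

Because a 1$\times$1 convolution is linear, spatially averaging its output maps gives the same scores as applying the head to the GAP vector. The activation maps therefore come at little extra cost over a GAP-based classifier, which is consistent with the abstract's description of the auxiliary head as simple and efficient.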