LG AI CV | Mar 22, 2021

Adversarial Feature Augmentation and Normalization for Visual Recognition

Tianlong Chen, Yu Cheng, Zhe Gan, Jianfeng Wang, Lijuan Wang, Zhangyang Wang, Jingjing Liu

arXiv:2103.12171v1 | 13.621 citations | Has Code

Originality: Incremental advance

AI Analysis

This work targets the generalization of visual recognition models, offering researchers and practitioners an efficient alternative to computationally expensive pixel-level adversarial data augmentation.

The paper proposes Adversarial Feature Augmentation and Normalization (A-FAN), which applies adversarial perturbations to intermediate feature embeddings instead of pixel-level inputs. The authors report consistent performance gains across classification, detection, and segmentation tasks on datasets such as CIFAR-10, ImageNet, and COCO2017.

Recent advances in computer vision take advantage of adversarial data augmentation to improve the generalization ability of classification models. Here, we present an effective and efficient alternative that advocates adversarial augmentation on intermediate feature embeddings, instead of relying on computationally expensive pixel-level perturbations. We propose Adversarial Feature Augmentation and Normalization (A-FAN), which (i) first augments visual recognition models with adversarial features that integrate flexible scales of perturbation strengths, and (ii) then extracts adversarial feature statistics from batch normalization and re-injects them into clean features through feature normalization. We validate the proposed approach across diverse visual recognition tasks with representative backbone networks, including ResNets and EfficientNets for classification, Faster-RCNN for detection, and Deeplab V3+ for segmentation. Extensive experiments show that A-FAN yields consistent generalization improvement over strong baselines across various datasets for classification, detection, and segmentation tasks, such as CIFAR-10, CIFAR-100, ImageNet, Pascal VOC2007, Pascal VOC2012, COCO2017, and Cityscapes. Comprehensive ablation studies and detailed analyses also demonstrate that adding perturbations to specific modules and layers of classification/detection/segmentation backbones yields optimal performance. Code and pre-trained models will be made available at: https://github.com/VITA-Group/CV_A-FAN.
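The two steps named in the abstract can be illustrated with a toy sketch. This is not the authors' implementation: it assumes a single signed-gradient step on the features of a logistic head, and it assumes that "re-injecting" adversarial statistics means normalizing clean features with their own per-channel statistics and then rescaling them with the adversarial mean and standard deviation. All function names, the weights `w`, and the step size `eps` are hypothetical.

```python
import math

def sign(x):
    # Returns -1, 0, or 1.
    return (x > 0) - (x < 0)

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def adversarial_features(feats, labels, w, eps):
    """Perturb features (not pixels) with one signed-gradient ascent step.

    Assumption: a logistic head with weights w and binary cross-entropy loss,
    so the gradient of the loss with respect to a feature vector h is (p - y) * w.
    """
    adv = []
    for h, y in zip(feats, labels):
        p = sigmoid(sum(wi * hi for wi, hi in zip(w, h)))
        grad = [(p - y) * wi for wi in w]
        adv.append([hi + eps * sign(g) for hi, g in zip(h, grad)])
    return adv

def channel_stats(feats, tiny=1e-5):
    """Per-channel mean and standard deviation over the batch."""
    n, dims = len(feats), len(feats[0])
    means = [sum(f[d] for f in feats) / n for d in range(dims)]
    stds = [
        math.sqrt(sum((f[d] - means[d]) ** 2 for f in feats) / n + tiny)
        for d in range(dims)
    ]
    return means, stds

def reinject(clean, adv):
    """Normalize clean features, then rescale with adversarial statistics.

    Assumption: this rescaling is one plausible reading of the
    'feature normalization' step, not the paper's exact formula.
    """
    mc, sc = channel_stats(clean)
    ma, sa = channel_stats(adv)
    return [
        [(f[d] - mc[d]) / sc[d] * sa[d] + ma[d] for d in range(len(f))]
        for f in clean
    ]

# Hypothetical batch of four 2-dimensional intermediate features.
feats = [[1.0, 2.0], [0.5, -1.0], [-1.5, 0.3], [2.0, 1.0]]
labels = [1, 0, 0, 1]
w = [0.8, -0.4]

adv = adversarial_features(feats, labels, w, eps=0.1)
mixed = reinject(feats, adv)
```

In a real network the gradient would come from automatic differentiation through the layers after the chosen feature map, and `eps` could be drawn from several values to reflect the "flexible scales of perturbation strengths" the abstract mentions.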
