CV · Dec 11, 2022

General Adversarial Defense Against Black-box Attacks via Pixel Level and Feature Level Distribution Alignments

Xiaogang Xu, Hengshuang Zhao, Philip Torr, Jiaya Jia

arXiv:2212.05387v1 · 5.76 citations · h-index: 117

Originality: Incremental advance

AI Analysis

This paper addresses a security problem for AI systems by providing a general defense against transferable black-box attacks, though the contribution appears incremental because it builds on prior distribution alignment methods.

The paper tackles the vulnerability of Deep Neural Networks to black-box adversarial attacks by using Deep Generative Networks to align the distribution of adversarial and clean samples, achieving enhanced robustness across tasks like image classification, semantic segmentation, and object detection.

Deep Neural Networks (DNNs) are vulnerable to the black-box adversarial attack that is highly transferable. This threat comes from the distribution gap between adversarial and clean samples in feature space of the target DNNs. In this paper, we use Deep Generative Networks (DGNs) with a novel training mechanism to eliminate the distribution gap. The trained DGNs align the distribution of adversarial samples with clean ones for the target DNNs by translating pixel values. Different from previous work, we propose a more effective pixel level training constraint to make this achievable, thus enhancing robustness on adversarial samples. Further, a class-aware feature-level constraint is formulated for integrated distribution alignment. Our approach is general and applicable to multiple tasks, including image classification, semantic segmentation, and object detection. We conduct extensive experiments on different datasets. Our strategy demonstrates its unique effectiveness and generality against black-box attacks.
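
The abstract names two training constraints, one at the pixel level and one class-aware at the feature level, but does not give their form. The sketch below is only one possible reading, not the authors' formulation: it assumes a mean squared pixel difference between the translated adversarial image and its clean counterpart, and a squared distance between per-class mean features. All function names and the `feature_weight` parameter are hypothetical, and plain Python lists are used to keep the example dependency-free.

```python
def pixel_alignment_loss(translated, clean):
    """Mean squared difference between two flat lists of pixel values.

    Assumed stand-in for a pixel-level constraint; the paper's actual
    constraint is not specified in the abstract.
    """
    if not clean or len(translated) != len(clean):
        raise ValueError("inputs must be non-empty and of equal length")
    return sum((t - c) ** 2 for t, c in zip(translated, clean)) / len(clean)


def class_means(features, labels):
    """Return {label: mean feature vector} for a batch of feature vectors."""
    sums, counts = {}, {}
    for vec, label in zip(features, labels):
        if label not in sums:
            sums[label] = [0.0] * len(vec)
            counts[label] = 0
        sums[label] = [s + v for s, v in zip(sums[label], vec)]
        counts[label] += 1
    return {k: [s / counts[k] for s in total] for k, total in sums.items()}


def class_aware_feature_loss(feat_translated, feat_clean, labels):
    """Average over classes of the squared gap between per-class mean features.

    Assumed stand-in for a class-aware feature-level constraint.
    """
    if not labels:
        raise ValueError("labels must be non-empty")
    mean_t = class_means(feat_translated, labels)
    mean_c = class_means(feat_clean, labels)
    gaps = [
        sum((a - b) ** 2 for a, b in zip(mean_t[k], mean_c[k]))
        for k in mean_c
    ]
    return sum(gaps) / len(gaps)


def alignment_objective(translated, clean, feat_translated, feat_clean,
                        labels, feature_weight=1.0):
    """Weighted sum of the two illustrative constraints (weight is hypothetical)."""
    return (pixel_alignment_loss(translated, clean)
            + feature_weight * class_aware_feature_loss(
                feat_translated, feat_clean, labels))
```

In this reading, a generative network would be trained so that its output on adversarial inputs drives both terms toward zero, with the features taken from the target network being defended.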
