CV · CR · LG · Apr 1, 2020

Towards Achieving Adversarial Robustness by Enforcing Feature Consistency Across Bit Planes

arXiv:2004.00306v1 · 40 citations
AI Analysis

This work addresses adversarial attacks, an AI security problem, by proposing a training approach that is faster than adversarial training because it never generates adversarial samples, and that the authors describe as closer to human perception. The contribution is incremental: it builds on existing robustness methods.

The paper tackles adversarial robustness in deep neural networks by training them to form coarse impressions from the higher bit planes and to use the lower bit planes only for refinement. Without training on adversarial samples, this yields significantly better robustness than a normally trained model.

As humans, we inherently perceive images based on their predominant features, and ignore noise embedded within lower bit planes. On the contrary, Deep Neural Networks are known to confidently misclassify images corrupted with meticulously crafted perturbations that are nearly imperceptible to the human eye. In this work, we attempt to address this problem by training networks to form coarse impressions based on the information in higher bit planes, and use the lower bit planes only to refine their prediction. We demonstrate that, by imposing consistency on the representations learned across differently quantized images, the adversarial robustness of networks improves significantly when compared to a normally trained model. Present state-of-the-art defenses against adversarial attacks require the networks to be explicitly trained using adversarial samples that are computationally expensive to generate. While such methods that use adversarial training continue to achieve the best results, this work paves the way towards achieving robustness without having to explicitly train on adversarial samples. The proposed approach is therefore faster, and also closer to the natural learning process in humans.
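
The two ingredients named in the abstract, quantizing an image to its higher bit planes and penalizing disagreement between the representations of the original and quantized images, can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the function names `keep_top_bit_planes` and `consistency_penalty`, the choice of 8-bit images, and the use of a squared L2 distance as the consistency measure are all assumptions made for illustration.

```python
import numpy as np


def keep_top_bit_planes(image: np.ndarray, k: int) -> np.ndarray:
    """Zero out the (8 - k) lowest bit planes of a uint8 image.

    Hypothetical helper: keeps only the k most significant bits per pixel,
    giving the "coarse" view of the image.
    """
    if image.dtype != np.uint8:
        raise TypeError("expected a uint8 image")
    if not 1 <= k <= 8:
        raise ValueError("k must be between 1 and 8")
    # For k = 4 the mask is 0b11110000: the top four bit planes survive.
    mask = np.uint8((0xFF << (8 - k)) & 0xFF)
    return image & mask


def consistency_penalty(features_full: np.ndarray,
                        features_coarse: np.ndarray) -> float:
    """Mean squared L2 distance between two batches of feature vectors.

    Assumed form of the consistency term: rows are samples, columns are
    feature dimensions. A training loop would add this, suitably weighted,
    to the usual classification loss.
    """
    diff = features_full - features_coarse
    return float(np.mean(np.sum(diff ** 2, axis=1)))


# Usage: quantize a tiny "image" to its top four bit planes.
image = np.array([[0, 15, 16, 255]], dtype=np.uint8)
coarse = keep_top_bit_planes(image, 4)
```

In a real training setup, `features_full` and `features_coarse` would be the network's outputs for the original and the quantized image; here they are left as plain arrays so the sketch stays independent of any deep learning framework.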

Code Implementations: 1 repo
