LG · AI · CV · ML · Feb 8, 2021

Efficient Certified Defenses Against Patch Attacks on Image Classifiers

arXiv:2102.04154v1 · 48 citations
Originality: Incremental advance
AI Analysis

This work is significant for autonomous systems in safety-critical domains, providing a fail-safe fallback component with certifiable robustness against patch attacks.

This paper addresses the threat that adversarial patches pose to image classifiers, a realistic concern for physical-world attacks on autonomous systems. The authors propose BagCert, a model architecture and certification procedure designed for efficient certification. On CIFAR10, BagCert achieves 86% clean accuracy and 60% certified accuracy against 5x5 patches, and certifies 10,000 examples in 43 seconds on a single GPU.

Adversarial patches pose a realistic threat model for physical-world attacks on autonomous systems via their perception component. Autonomous systems in safety-critical domains such as automated driving should thus contain a fail-safe fallback component that combines certifiable robustness against patches with efficient inference while maintaining high performance on clean inputs. We propose BagCert, a novel combination of model architecture and certification procedure that allows efficient certification. We derive a loss that enables end-to-end optimization of certified robustness against patches of different sizes and locations. On CIFAR10, BagCert certifies 10,000 examples in 43 seconds on a single GPU and obtains 86% clean and 60% certified accuracy against 5x5 patches.
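
The abstract does not spell out the certification procedure, so the following is only a minimal sketch of one generic way a patch certificate can be computed. It is not BagCert itself. It assumes a classifier that splits the image into regions with a bounded receptive field, lets each region cast one class vote, and predicts the class with the most votes. The function name `certify_patch` and all of its parameters are hypothetical and chosen for illustration. A prediction counts as certified if, for every patch location, the true class still wins even when every region overlapping the patch votes for the strongest rival.

```python
import numpy as np

def certify_patch(votes, label, num_classes, rf, stride, patch, image_size):
    """Illustrative vote-counting certificate (an assumption, not the paper's method).

    votes: (G, G) integer array, the class voted for by each region.
    Region (i, j) is assumed to see pixels [i*stride, i*stride + rf) in each axis.
    Returns True if no placement of a patch x patch square can change the
    majority vote away from `label`.
    """
    grid = votes.shape[0]
    starts = np.arange(grid) * stride  # first pixel seen by each region, per axis
    for y in range(image_size - patch + 1):
        # Regions whose receptive field overlaps the patch rows [y, y + patch)
        row_hit = (starts < y + patch) & (starts + rf > y)
        for x in range(image_size - patch + 1):
            col_hit = (starts < x + patch) & (starts + rf > x)
            hit = np.outer(row_hit, col_hit)  # regions the patch can influence
            clean = np.bincount(votes[~hit], minlength=num_classes)
            n_hit = int(hit.sum())
            rivals = np.delete(clean, label)
            # Worst case: every affected region votes for the strongest rival.
            # Ties are counted against the defender.
            if clean[label] <= rivals.max() + n_hit:
                return False
    return True

# Usage: a 32x32 image, non-overlapping 4x4 regions (8x8 grid), 5x5 patch.
unanimous = np.zeros((8, 8), dtype=int)
split = np.zeros((8, 8), dtype=int)
split[4:, :] = 1  # half the regions vote for class 1
print(certify_patch(unanimous, 0, 10, rf=4, stride=4, patch=5, image_size=32))
print(certify_patch(split, 0, 10, rf=4, stride=4, patch=5, image_size=32))
```

Because the check only counts votes and never runs an attack, its cost per image is a loop over patch locations, which hints at why certificates of this general kind can be cheap. The real BagCert architecture, aggregation rule, and loss may differ from this sketch.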

