Minority Reports Defense: Defending Against Adversarial Patches
This paper addresses adversarial patch attacks on image classifiers used in security-critical applications and offers a certified defense against them.
Deep learning image classifiers are vulnerable to adversarial patch attacks. The paper proposes a defense that partially occludes the image around each candidate patch location, and it demonstrates certified security against patch attacks of a certain size on CIFAR-10, Fashion MNIST, and MNIST.
Deep learning image classification is vulnerable to adversarial attack, even if the attacker changes just a small patch of the image. We propose a defense against patch attacks based on partially occluding the image around each candidate patch location, so that a few occlusions each completely hide the patch. We demonstrate on CIFAR-10, Fashion MNIST, and MNIST that our defense provides certified security against patch attacks of a certain size.
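The occlusion idea can be illustrated with a minimal sketch. It is not the paper's implementation: the function names, the square occlusion shape, the stride grid, the fill value, and the sizes used below (28x28 image, 5x5 patch, 7x7 occlusion, stride 2) are all assumptions chosen for illustration. The sketch only builds the occluded copies of an image and checks that every candidate patch location is completely hidden by at least one occlusion. How the classifier's predictions on the occluded images are combined into a certified decision is not covered here.

```python
import numpy as np


def occlusion_origins(image_size, occ_size, stride):
    """Top-left corners of square occlusions laid out on a stride grid."""
    last = image_size - occ_size
    coords = list(range(0, last + 1, stride))
    if coords[-1] != last:
        coords.append(last)  # add a final position flush with the far edge
    return [(r, c) for r in coords for c in coords]


def occlude(image, origin, occ_size, fill=0.0):
    """Return a copy of `image` with one square region overwritten by `fill`."""
    out = image.copy()
    r, c = origin
    out[r:r + occ_size, c:c + occ_size] = fill
    return out


def covering_occlusions(patch_origin, patch_size, origins, occ_size):
    """Occlusion origins whose square completely contains the given patch."""
    pr, pc = patch_origin
    return [
        (r, c)
        for (r, c) in origins
        if r <= pr and pr + patch_size <= r + occ_size
        and c <= pc and pc + patch_size <= c + occ_size
    ]


# Illustrative sizes only (assumed, not taken from the paper).
IMAGE_SIZE, PATCH_SIZE, OCC_SIZE, STRIDE = 28, 5, 7, 2

origins = occlusion_origins(IMAGE_SIZE, OCC_SIZE, STRIDE)
image = np.ones((IMAGE_SIZE, IMAGE_SIZE), dtype=np.float32)
occluded_batch = np.stack([occlude(image, o, OCC_SIZE) for o in origins])

# Count, for every possible patch position, how many occlusions hide it fully.
cover_counts = [
    len(covering_occlusions((pr, pc), PATCH_SIZE, origins, OCC_SIZE))
    for pr in range(IMAGE_SIZE - PATCH_SIZE + 1)
    for pc in range(IMAGE_SIZE - PATCH_SIZE + 1)
]
print("occlusions per image:", len(origins))
print("every patch location hidden at least once:", min(cover_counts) >= 1)
```

In this sketch, coverage holds whenever the stride is at most `OCC_SIZE - PATCH_SIZE + 1`. A smaller stride means more occlusions hide any given patch, at the cost of more occluded images to classify.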