LG, AI, CR, ML | Jun 23, 2019

Defending Against Adversarial Examples with K-Nearest Neighbor

arXiv:1906.09525v2 | 29 citations
Originality: Incremental advance
AI Analysis

This paper addresses the vulnerability of machine learning models to adversarial attacks and represents an incremental improvement over existing defenses.

The paper defends neural networks against adversarial examples by running k-nearest neighbor classification on intermediate activations. It reports state-of-the-art robustness to l2 perturbations on MNIST and CIFAR-10, where the mean perturbation norms required to fool the models are 3.07 and 2.30, respectively.
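As a rough illustration of the idea summarized above, the sketch below runs a kNN majority vote on the output of a hidden layer rather than on raw inputs. It is not the authors' implementation: the single affine-plus-ReLU layer, the names `hidden_features` and `knn_predict`, the Euclidean metric, and the toy two-cluster data are all assumptions made for the example.

```python
import numpy as np

def hidden_features(x, W, b):
    """Hypothetical stand-in for an intermediate layer: affine map, then ReLU."""
    return np.maximum(x @ W + b, 0.0)

def knn_predict(train_feats, train_labels, query_feats, k=5):
    """Majority vote among the k nearest training points in feature space."""
    # Pairwise squared Euclidean distances, shape (n_query, n_train).
    diffs = query_feats[:, None, :] - train_feats[None, :, :]
    d2 = (diffs ** 2).sum(axis=-1)
    # Indices of the k closest training points for each query.
    nearest = np.argsort(d2, axis=1)[:, :k]
    votes = train_labels[nearest]
    n_classes = int(train_labels.max()) + 1
    counts = np.array([np.bincount(v, minlength=n_classes) for v in votes])
    return counts.argmax(axis=1)

# Toy data (assumption): two well-separated Gaussian clusters in 4 dimensions.
rng = np.random.default_rng(0)
x_train = np.concatenate([
    rng.normal(-2.0, 0.5, size=(50, 4)),
    rng.normal(2.0, 0.5, size=(50, 4)),
])
y_train = np.array([0] * 50 + [1] * 50)

# Fixed weights (assumption): features are [relu(x), relu(-x)].
W = np.hstack([np.eye(4), -np.eye(4)])
b = np.zeros(8)

x_test = np.array([[-2.0] * 4, [2.0] * 4])
preds = knn_predict(
    hidden_features(x_train, W, b),
    y_train,
    hidden_features(x_test, W, b),
    k=5,
)
```

On this toy data the two query points sit at the cluster centres, so `preds` comes out as class 0 and class 1. In a real setting the features would come from a trained network, and the choice of layer, metric, and `k` would need tuning.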

Robustness is an increasingly important property of machine learning models as they become more prevalent. We propose a defense against adversarial examples based on k-nearest neighbor (kNN) classification on the intermediate activations of neural networks. Our scheme surpasses state-of-the-art defenses on MNIST and CIFAR-10 against l2 perturbations by a significant margin. The mean perturbation norm required to fool our models is 3.07 on MNIST and 2.30 on CIFAR-10. Additionally, we propose a simple certifiable lower bound on the l2-norm of the adversarial perturbation using a more specific version of our scheme: a 1-NN on representations learned by a Lipschitz network. Our model provides a nontrivial average lower bound on the perturbation norm, comparable to other schemes on MNIST with similar clean accuracy.
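One way to see why a 1-NN on Lipschitz representations admits a certificate is a standard triangle-inequality argument. The abstract does not state the bound, so the paper's exact version may differ. If the network is L-Lipschitz in l2, an input perturbation of norm r moves the representation by at most L*r. With d_same the feature-space distance to the nearest training point of the predicted class and d_other the distance to the nearest point of any other class, the prediction cannot change while r < (d_other - d_same) / (2L). The function name `certified_radius_1nn` and the toy numbers below are hypothetical.

```python
import numpy as np

def certified_radius_1nn(feat, train_feats, train_labels, lipschitz):
    """Return (1-NN label, l2 input radius within which that label cannot change).

    Assumes `lipschitz` upper-bounds the network's l2 Lipschitz constant and
    that the training set contains at least two classes.
    """
    dists = np.linalg.norm(train_feats - feat, axis=1)
    pred = train_labels[dists.argmin()]
    # Nearest training point of the predicted class, and of any other class.
    d_same = dists[train_labels == pred].min()
    d_other = dists[train_labels != pred].min()
    # The representation moves by at most lipschitz * r, so d_same grows by at
    # most that much and d_other shrinks by at most that much.
    return pred, (d_other - d_same) / (2.0 * lipschitz)

# Toy example (assumption): two training representations in a 2-D feature space.
train_feats = np.array([[0.0, 0.0], [4.0, 0.0]])
train_labels = np.array([0, 1])
feat = np.array([1.0, 0.0])

pred, radius = certified_radius_1nn(feat, train_feats, train_labels, lipschitz=2.0)
```

Here d_same is 1 and d_other is 3, so with L = 2 the certified radius is 0.5. The bound depends only on distances in representation space and the Lipschitz constant, which makes it cheap to compute but potentially loose.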

Code Implementations: 1 repo