CR, LG · Nov 18, 2021

A Review of Adversarial Attack and Defense for Classification Methods

arXiv:2111.09961v1 · 99 citations
Originality: Synthesis-oriented
AI Analysis

The paper addresses the safety concerns of applying machine learning in security-critical areas by summarizing existing work; as a review paper, its contribution is incremental rather than novel.

This paper reviews the vulnerability of classification methods, especially deep neural networks, to adversarial examples that can fool models while appearing normal to humans, and it surveys recent developments in attack and defense techniques to introduce the topic to the statistical community.

Despite the efficiency and scalability of machine learning systems, recent studies have demonstrated that many classification methods, especially deep neural networks (DNNs), are vulnerable to adversarial examples, i.e., examples that are carefully crafted to fool a well-trained classification model while being indistinguishable from natural data to humans. This makes it potentially unsafe to apply DNNs or related methods in security-critical areas. Since this issue was first identified by Biggio et al. (2013) and Szegedy et al. (2014), much work has been done in this field, including the development of attack methods to generate adversarial examples and the construction of defense techniques to guard against such examples. This paper aims to introduce this topic and its latest developments to the statistical community, focusing primarily on generating adversarial examples and guarding against them. The code (in Python and R) used in the numerical experiments is publicly available for readers to explore the surveyed methods. The authors hope that this paper will encourage more statisticians to work in this important and exciting field of generating and defending against adversarial examples.

Code Implementations: 1 repo
