ML CR LG Oct 27, 2019

Understanding and Quantifying Adversarial Examples Existence in Linear Classification

arXiv:1910.12163v15.74 citations

Originality: Incremental advance

AI Analysis

This work addresses the vulnerability of machine learning models to adversarial attacks, offering a more practical definition that could guide the design of robust systems, though it is incremental as it builds on existing mathematical frameworks.

The paper tackles the problem of adversarial examples in linear classifiers by proposing a new definition of strong adversarial examples that separately limits the perturbation along the signal direction. It shows that linear classifiers can be made robust to such attacks in cases where no robust linear classifier exists under the previous definition, and it confirms the quantitative formulas with numerical experiments using a linear SVM.

State-of-the-art deep neural networks (DNNs) are vulnerable to attacks by adversarial examples: a carefully designed small perturbation to the input, imperceptible to humans, can mislead a DNN. To understand the root cause of adversarial examples, we quantify the probability of adversarial example existence for linear classifiers. The previous mathematical definition of adversarial examples involves only the overall perturbation amount; we propose a more practically relevant definition of strong adversarial examples that also separately limits the perturbation along the signal direction. We show that linear classifiers can be made robust to strong adversarial example attacks in cases where no adversarially robust linear classifier exists under the previous definition. The quantitative formulas are confirmed by numerical experiments using a linear support vector machine (SVM) classifier. The results suggest that designing general strong-adversarial-robust learning systems is feasible, but only by incorporating human knowledge of the underlying classification problem.
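The distinction drawn in the abstract can be illustrated with a small numerical sketch. The formalization below is an assumption made for illustration, not the paper's own definitions or formulas: an ordinary adversarial example is taken to be a perturbation with L2 norm at most `eps` that flips the sign of a linear score `w.x + b`, and a "strong" one must additionally keep its component along a unit signal direction `s` within `eps_s`. The function names, the Gaussian toy data, and all parameter values are hypothetical.

```python
import numpy as np

def max_score_shift(w, s, eps, eps_s=None):
    """Largest change in w.x achievable by a perturbation delta with
    ||delta||_2 <= eps and, if eps_s is given, |s.delta| <= eps_s.
    (Assumed formalization; s is normalized to a unit vector.)"""
    w = np.asarray(w, dtype=float)
    s = np.asarray(s, dtype=float)
    s = s / np.linalg.norm(s)
    w_norm = np.linalg.norm(w)
    if eps_s is None or eps_s >= eps:
        return eps * w_norm                      # plain L2 budget
    w_par = abs(w @ s)                           # weight along the signal
    w_perp = np.linalg.norm(w - (w @ s) * s)     # weight off the signal
    a = min(eps * w_par / w_norm, eps_s)         # clamp the signal component
    return a * w_par + w_perp * np.sqrt(eps**2 - a**2)

def adversarial_fraction(X, w, b, s, eps, eps_s=None):
    """Fraction of rows of X whose predicted label can be flipped."""
    margins = np.abs(X @ np.asarray(w, dtype=float) + b)
    return float(np.mean(margins <= max_score_shift(w, s, eps, eps_s)))

# Toy data (hypothetical): two Gaussian classes centered at +mu and -mu.
rng = np.random.default_rng(0)
mu = np.array([2.0, 0.0])
y = rng.choice([-1.0, 1.0], size=5000)
X = y[:, None] * mu + rng.normal(size=(5000, 2))

w, b = mu / np.linalg.norm(mu), 0.0              # classifier aligned with the signal
plain = adversarial_fraction(X, w, b, mu, eps=1.0)
strong = adversarial_fraction(X, w, b, mu, eps=1.0, eps_s=0.25)
print(f"plain: {plain:.3f}  strong: {strong:.3f}")
```

Under these assumptions, a classifier whose weight vector is aligned with the signal direction can only be moved by the signal-direction budget `eps_s`, so the fraction of points admitting a strong adversarial example is never larger than the fraction admitting an ordinary one.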
