LG AI CV

Oct 17, 2024

Golyadkin's Torment: Doppelgängers and Adversarial Vulnerability

arXiv:2410.13193v12.6

Originality: Highly original

AI Analysis

This work addresses the reliability and security of machine learning systems by defining and analyzing adversarial Doppelgangers, a class of adversarial inputs to which most classifiers are vulnerable.

The paper investigates adversarial Doppelgangers, a type of adversarial input that is qualitatively different from the usual adversarial examples. It finds that most classifiers are vulnerable to them, that robustness-accuracy trade-offs may not help, and that some classification problems are inherently ambiguous and so may admit no Doppelganger-robust classifier.

Many machine learning (ML) classifiers are claimed to outperform humans, but they still make mistakes that humans do not. The most notorious examples of such mistakes are adversarial visual metamers. This paper aims to define and investigate the phenomenon of adversarial Doppelgangers (AD), which includes adversarial visual metamers, and to compare the performance and robustness of ML classifiers to human performance. We find that AD are inputs that are close to each other with respect to a perceptual metric defined in this paper. AD are qualitatively different from the usual adversarial examples. The vast majority of classifiers are vulnerable to AD, and robustness-accuracy trade-offs may not improve them. Some classification problems may not admit any AD-robust classifiers because the underlying classes are ambiguous. We provide criteria that can be used to determine whether a classification problem is well defined; describe the structure and attributes of an AD-robust classifier; and introduce and explore the notions of conceptual entropy and regions of conceptual ambiguity for classifiers that are vulnerable to AD attacks, along with methods to bound the AD fooling rate of an attack. We define the notion of classifiers that exhibit hypersensitive behavior, that is, classifiers whose only mistakes are adversarial Doppelgangers. Improving the AD robustness of hypersensitive classifiers is equivalent to improving accuracy. We identify conditions guaranteeing that all classifiers with sufficiently high accuracy are hypersensitive. Our findings are aimed at significant improvements in the reliability and security of machine learning systems.
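To make the idea of "close under a perceptual metric, yet classified differently" concrete, here is a minimal Python sketch. It is an illustration only, not the paper's construction: the function names, the fixed `threshold`, and the toy one-dimensional metric are all assumptions, and `fooled_fraction` is a simple empirical proxy rather than the AD fooling rate bounded in the paper.

```python
from itertools import combinations
from typing import Callable, List, Sequence, Tuple, TypeVar

X = TypeVar("X")


def doppelganger_pairs(
    inputs: Sequence[X],
    classify: Callable[[X], object],
    distance: Callable[[X, X], float],
    threshold: float,
) -> List[Tuple[int, int]]:
    """Return index pairs (i, j), i < j, whose inputs are closer than
    `threshold` under `distance` but receive different labels.

    `distance` stands in for a perceptual metric and `threshold` for a
    discriminability limit; both are hypothetical placeholders.
    """
    labels = [classify(x) for x in inputs]
    return [
        (i, j)
        for i, j in combinations(range(len(inputs)), 2)
        if distance(inputs[i], inputs[j]) < threshold and labels[i] != labels[j]
    ]


def fooled_fraction(
    inputs: Sequence[X],
    classify: Callable[[X], object],
    distance: Callable[[X, X], float],
    threshold: float,
) -> float:
    """Fraction of inputs that have at least one differently labelled
    near neighbour within the given sample (an empirical proxy only)."""
    if not inputs:
        return 0.0
    pairs = doppelganger_pairs(inputs, classify, distance, threshold)
    fooled = {index for pair in pairs for index in pair}
    return len(fooled) / len(inputs)


# Toy example: scalar inputs, a hard decision boundary at 0.5, and
# absolute difference as the stand-in perceptual metric.
samples = [0.10, 0.46, 0.52, 0.90]


def toy_classifier(x: float) -> int:
    return int(x >= 0.5)


def toy_distance(a: float, b: float) -> float:
    return abs(a - b)


pairs = doppelganger_pairs(samples, toy_classifier, toy_distance, 0.1)
rate = fooled_fraction(samples, toy_classifier, toy_distance, 0.1)
```

In this toy run only the inputs 0.46 and 0.52 sit closer than the threshold while straddling the decision boundary, so they form the single flagged pair. For real image classifiers, the absolute difference would have to be replaced by a metric that actually reflects human perception, which is the part the paper defines.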
