A New Type of Adversarial Examples
This work addresses security concerns for machine learning systems by revealing a broader class of vulnerability, though the contribution appears incremental because it builds on existing adversarial example concepts.
The paper introduces a new type of adversarial example: inputs that differ significantly from the original examples yet yield the same model output. It generates them with algorithms such as NI-FGSM and NMI-FGSM, and finds that these examples are widely distributed in the sample space.
Most machine learning models are vulnerable to adversarial examples, which raises security concerns about these models. Adversarial examples are crafted by applying subtle but intentionally worst-case modifications to examples from the dataset, leading the model to output a different answer than it gives for the original example. In this paper, adversarial examples are formed in exactly the opposite manner: they are significantly different from the original examples but result in the same answer. We propose a novel set of algorithms to produce such adversarial examples, including the negative iterative fast gradient sign method (NI-FGSM) and the negative iterative fast gradient method (NI-FGM), along with their momentum variants: the negative momentum iterative fast gradient sign method (NMI-FGSM) and the negative momentum iterative fast gradient method (NMI-FGM). Adversarial examples constructed by these methods could be used to attack machine learning systems on certain occasions. Moreover, our results show that the adversarial examples are not merely distributed in the neighbourhood of the examples from the dataset; instead, they are distributed extensively in the sample space.
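
To make the idea concrete, the following is a minimal sketch of one plausible reading of the negative iterative methods. It is not taken from the paper. It assumes that "negative" means stepping against the loss gradient for the original label (the usual iterative FGSM steps along it), starting from a point far from the original example. The toy linear softmax model, the function names, the random starting point, the step size, and the L1/L2 normalizations are all illustrative assumptions; the paper's exact update rule may differ.

```python
import numpy as np

def softmax(z):
    """Numerically stable softmax over a 1-D logit vector."""
    e = np.exp(z - z.max())
    return e / e.sum()

def loss_grad_wrt_input(W, b, x, y):
    """Gradient of the cross-entropy loss w.r.t. x for the toy model softmax(W @ x + b).

    Written as sum_{k != y} p_k * (W[k] - W[y]) so that p[y] - 1 is never
    computed directly (it loses precision once the model is confident).
    """
    p = softmax(W @ x + b)
    p[y] = 0.0
    return W.T @ p - p.sum() * W[y]

def negative_iterative_attack(W, b, x_start, y, alpha=0.1, steps=200,
                              mu=0.0, use_sign=True):
    """Hypothetical sketch covering four variants (names assumed, not the paper's code):
    mu == 0, use_sign=True  -> NI-FGSM-like    mu == 0, use_sign=False -> NI-FGM-like
    mu  > 0, use_sign=True  -> NMI-FGSM-like   mu  > 0, use_sign=False -> NMI-FGM-like
    """
    x = x_start.astype(float)          # astype returns a copy
    g_acc = np.zeros_like(x)
    for _ in range(steps):
        g = loss_grad_wrt_input(W, b, x, y)
        # Momentum accumulation with an L1-normalized gradient (assumed form).
        g_acc = mu * g_acc + g / (np.abs(g).sum() + 1e-12)
        if use_sign:
            step = np.sign(g_acc)
        else:
            step = g_acc / (np.linalg.norm(g_acc) + 1e-12)
        # Negative sign: descend the loss so the model keeps answering y.
        x = x - alpha * step
    return x

# Toy demonstration with a random two-class linear model (all values assumed).
rng = np.random.default_rng(0)
W = rng.normal(size=(2, 8))
b = np.zeros(2)
x_orig = rng.normal(size=8)                  # stands in for a dataset example
y = int(np.argmax(W @ x_orig + b))           # the model's answer on the original
x_start = 3.0 * rng.normal(size=8)           # unrelated starting point
x_new = negative_iterative_attack(W, b, x_start, y)
pred_new = int(np.argmax(W @ x_new + b))
distance = float(np.linalg.norm(x_new - x_orig))
print("same answer:", pred_new == y, "| L2 distance from original:", round(distance, 2))
```

In this toy setting the resulting point lies far from the original example in L2 distance, yet the model assigns it the same label, which is the property the abstract describes. A real experiment would replace the analytic gradient with the gradient of a trained network's loss and would typically clip the result to the valid input range.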