On Generation of Adversarial Examples using Convex Programming
This work addresses the vulnerability of deep learning models to adversarial attacks, offering a theoretical explanation and new algorithms. Its contribution is incremental, however, since it builds on existing adversarial methods.
The paper tackles the problem of generating adversarial examples for deep learning by proposing a convex programming framework based on perturbation analysis, which recovers existing methods and yields new algorithms with competitive fooling ratios in experiments.
It has been observed that deep learning architectures tend to make erroneous decisions with high confidence on specially designed adversarial instances. In this work, we show that the perturbation analysis of these architectures provides a framework for generating adversarial instances by convex programming which, for classification tasks, recovers variants of existing non-adaptive adversarial methods. The proposed framework can be used to design adversarial noise under various desirable constraints and for different types of networks. Moreover, the framework explains various existing adversarial methods and can be used to derive new ones; we use these results to obtain novel algorithms. The experiments show that the obtained solutions are competitive, in terms of fooling ratio, with well-known adversarial methods.
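As a minimal sketch of the idea, assume the network's score is replaced by its first-order approximation around the input. Finding the adversarial noise then amounts to minimising a linear function over a norm ball, which is a convex program with a closed-form solution. Under an l-infinity constraint the solution is a scaled sign of the gradient (the form used by the fast gradient sign method, up to the choice of objective); under an l-2 constraint it is the scaled, normalised gradient. The function name `linearized_perturbation`, the toy linear scorer, and all numeric values below are illustrative assumptions, not the paper's formulation or notation.

```python
import math


def linearized_perturbation(grad, eps, p):
    """Minimise grad . eta subject to ||eta||_p <= eps.

    Closed-form minimiser of a linear objective over a norm ball
    (hypothetical helper; only p = 2 and p = infinity are sketched).
    """
    if eps < 0:
        raise ValueError("eps must be non-negative")
    if p == math.inf:
        # Each coordinate moves by eps against the sign of the gradient.
        return [-eps * ((g > 0) - (g < 0)) for g in grad]
    if p == 2:
        norm = math.sqrt(sum(g * g for g in grad))
        if norm == 0.0:
            return [0.0 for _ in grad]
        # Move a distance eps along the negative gradient direction.
        return [-eps * g / norm for g in grad]
    raise ValueError("only p = 2 and p = math.inf are sketched here")


def score(w, b, x):
    """Toy linear scorer (assumption): positive means the correct class."""
    return sum(wi * xi for wi, xi in zip(w, x)) + b


# Illustrative values only; for a linear scorer the gradient w.r.t. x is w.
w, b = [3.0, -4.0], 0.0
x = [1.0, 0.5]

eta = linearized_perturbation(w, 0.2, math.inf)
x_adv = [xi + ei for xi, ei in zip(x, eta)]

print("clean score:", score(w, b, x))
print("perturbed score:", score(w, b, x_adv))
```

For the linear scorer used here the linearisation is exact, so the score drops by eps times the dual norm of the gradient and the sign of the decision flips. For a deep network the approximation is only local, so the perturbed input is not guaranteed to change the decision.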