LG · Aug 5, 2021

BOSS: Bidirectional One-Shot Synthesis of Adversarial Examples

Ismail R. Alkhouri, Alvaro Velasquez, George K. Atia

arXiv:2108.02756v24.42 citations

Originality: Incremental advance

AI Analysis

This work addresses the efficient generation of adversarial examples for machine learning security. Its contribution is incremental, since it builds on existing GAN-like methods.

The paper tackles one-shot synthesis of adversarial examples from scratch: the synthesized inputs induce arbitrary soft predictions from pre-trained models while remaining highly similar to specified inputs. It demonstrates that the resulting targeted and confidence reduction attacks perform on par with state-of-the-art algorithms.

The design of additive imperceptible perturbations to the inputs of deep classifiers to maximize their misclassification rates is a central focus of adversarial machine learning. An alternative approach is to synthesize adversarial examples from scratch using GAN-like structures, albeit with the use of large amounts of training data. By contrast, this paper considers one-shot synthesis of adversarial examples; the inputs are synthesized from scratch to induce arbitrary soft predictions at the output of pre-trained models, while simultaneously maintaining high similarity to specified inputs. To this end, we present a problem that encodes objectives on the distance between the desired and output distributions of the trained model and the similarity between such inputs and the synthesized examples. We prove that the formulated problem is NP-complete. Then, we advance a generative approach to the solution in which the adversarial examples are obtained as the output of a generative network whose parameters are iteratively updated by optimizing surrogate loss functions for the dual-objective. We demonstrate the generality and versatility of the framework and approach proposed through applications to the design of targeted adversarial attacks, generation of decision boundary samples, and synthesis of low confidence classification inputs. The approach is further extended to an ensemble of models with different soft output specifications. The experimental results verify that the targeted and confidence reduction attack methods developed perform on par with state-of-the-art algorithms.
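The dual objective described in the abstract can be illustrated with a toy sketch. This is not the authors' implementation: the fixed linear softmax "classifier", the linear "generator", the cross-entropy and squared-distance losses, the weight `lam`, and the name `synthesize` are all assumptions chosen to keep the example small and dependency-free. It only shows the general pattern of updating generator parameters so that the generated input drives a frozen model toward a desired soft output while staying close to a reference input.

```python
import math
import random


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(v - m) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


def matvec(M, v):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(a * b for a, b in zip(row, v)) for row in M]


def classify(A, c, x):
    """Frozen stand-in for a pre-trained model: softmax(A x + c)."""
    return softmax([l + ci for l, ci in zip(matvec(A, x), c)])


def synthesize(A, c, x_ref, q, lam=0.05, lr=0.05, steps=2000, seed=0):
    """Hypothetical one-shot synthesis loop (illustrative assumptions only).

    The generator is x = W z + b with a fixed latent z. Its parameters are
    updated by gradient descent on
        cross_entropy(q, classify(x)) + lam * ||x - x_ref||^2,
    where q is the desired soft prediction and x_ref the reference input.
    """
    rng = random.Random(seed)
    n_in, n_cls = len(x_ref), len(q)
    z = [1.0, -1.0]  # fixed latent input to the generator
    W = [[rng.gauss(0.0, 0.1) for _ in z] for _ in range(n_in)]
    b = [0.0] * n_in

    for _ in range(steps):
        x = [wz + bi for wz, bi in zip(matvec(W, z), b)]
        p = classify(A, c, x)
        # Gradient of the cross-entropy w.r.t. the logits is p - q
        # (valid because q sums to 1).
        d_logits = [pi - qi for pi, qi in zip(p, q)]
        # Chain rule back to the generated input x.
        g_x = [
            sum(A[k][i] * d_logits[k] for k in range(n_cls))
            + 2.0 * lam * (x[i] - x_ref[i])
            for i in range(n_in)
        ]
        # Chain rule back to the generator parameters, then a descent step.
        for i in range(n_in):
            for j in range(len(z)):
                W[i][j] -= lr * g_x[i] * z[j]
            b[i] -= lr * g_x[i]

    x = [wz + bi for wz, bi in zip(matvec(W, z), b)]
    return x, classify(A, c, x)


# Toy setup: 4-dimensional inputs, 3 classes, all values made up.
A = [[1.0, 0.0, 0.0, 0.0],
     [0.0, 1.0, 0.0, 0.0],
     [0.0, 0.0, 1.0, 0.0]]
c = [0.0, 0.0, 0.0]
x_ref = [2.0, 0.0, 0.0, 0.0]   # reference input, classified as class 0
q = [0.05, 0.90, 0.05]         # desired soft prediction, favouring class 1

x_adv, p_adv = synthesize(A, c, x_ref, q)
print("reference prediction:  ", classify(A, c, x_ref))
print("synthesized input:     ", x_adv)
print("synthesized prediction:", p_adv)
```

In this sketch, `lam` controls the trade-off between matching the desired output distribution and staying near the reference input: a larger value keeps the synthesized input closer to `x_ref` at the cost of a prediction further from `q`.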
