LG ML · May 3, 2018

Siamese networks for generating adversarial examples

arXiv:1805.01431v12.22 citations

Originality: Incremental advance

AI Analysis

This work addresses the vulnerability of machine learning models to adversarial attacks, particularly in black-box scenarios. It is an incremental advance because it builds on existing adversarial example generation methods.

The paper generates adversarial examples without knowledge of the target data distribution by training a Siamese network on a mismatched query dataset that the target model has labeled. It reports effectiveness on MNIST, CIFAR-10, and ImageNet targets, using TinyImageNet and Food-101 as query datasets.

Machine learning models are vulnerable to adversarial examples. An adversary modifies the input data so that humans still assign the same label, but the machine learning model misclassifies it. Previous approaches in the literature demonstrated that adversarial examples can be generated even for a remotely hosted model. In this paper, we propose a Siamese-network-based approach to generate adversarial examples for a multiclass target CNN. We assume that the adversary does not possess any knowledge of the target data distribution, and we use an unlabeled, mismatched dataset to query the target; e.g., for the ResNet-50 target, we use the Food-101 dataset as the query. Initially, the target model assigns labels to the query dataset, and a Siamese network is trained on image pairs derived from these multiclass labels. We learn the adversarial perturbations for the Siamese model and show that these perturbations are also adversarial w.r.t. the target model. In experimental results, we demonstrate the effectiveness of our approach on MNIST, CIFAR-10, and ImageNet targets with TinyImageNet/Food-101 query datasets.
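The pair-construction step described in the abstract (the target labels the query images, then pairs are derived from those labels) might look roughly like the sketch below. This is an illustration under stated assumptions, not the authors' implementation: `label_with_target`, `make_siamese_pairs`, and the `target_predict` callable are hypothetical names, and the uniform random pair sampling is an assumed choice, since the abstract does not say how pairs are selected or balanced.

```python
import random
from typing import Callable, List, Sequence, Tuple


def label_with_target(images: Sequence, target_predict: Callable) -> List[int]:
    """Query the black-box target once per image and record its predicted label.

    `target_predict` is a hypothetical stand-in for the remote model's API.
    """
    return [target_predict(x) for x in images]


def make_siamese_pairs(
    labels: Sequence[int], num_pairs: int, seed: int = 0
) -> List[Tuple[int, int, int]]:
    """Draw random index pairs (i, j, same) from target-assigned labels.

    `same` is 1 when the target gave both images the same label, else 0.
    Uniform sampling is an assumption; a real pipeline would likely
    balance similar and dissimilar pairs.
    """
    if len(labels) < 2:
        raise ValueError("need at least two labeled images to form a pair")
    rng = random.Random(seed)
    pairs = []
    for _ in range(num_pairs):
        i, j = rng.sample(range(len(labels)), 2)  # two distinct indices
        pairs.append((i, j, int(labels[i] == labels[j])))
    return pairs


if __name__ == "__main__":
    # Toy stand-ins: integers as "images", modulo-3 as the "target model".
    query_images = list(range(12))
    target_labels = label_with_target(query_images, lambda x: x % 3)
    for i, j, same in make_siamese_pairs(target_labels, num_pairs=5):
        print(i, j, same)
```

The resulting `(i, j, same)` triples are the kind of supervision a Siamese network with a contrastive-style loss would typically consume; the choice of loss and network architecture is not specified in the abstract and is left out here.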
