CV CR LG Jul 26, 2019

On the Design of Black-box Adversarial Examples by Leveraging Gradient-free Optimization and Operator Splitting Method

Pu Zhao, Sijia Liu, Pin-Yu Chen, Nghia Hoang, Kaidi Xu, Bhavya Kailkhura, Xue Lin

arXiv:1907.11684v415.761 citations Has Code

Originality: Incremental advance

AI Analysis

This work addresses the challenge of designing efficient and robust black-box adversarial attacks, a machine learning security problem, and offers incremental improvements in query efficiency and flexibility over existing methods.

To address the high query complexity and restrictive threat-model assumptions of black-box adversarial attacks, the authors introduce a general framework based on ADMM, integrated with zeroth-order and Bayesian optimization. The resulting methods, ZO-ADMM and BO-ADMM, achieve much lower query complexity than state-of-the-art attack methods while maintaining competitive attack success rates on image classification datasets.

Robust machine learning is currently one of the most prominent topics that could help shape a future of advanced AI platforms that perform well not only in average cases but also in worst cases or adverse situations. Despite this long-term vision, existing studies on black-box adversarial attacks are still restricted to very specific threat-model settings (e.g., a single distortion metric and restrictive assumptions on the target model's feedback to queries) and/or suffer from prohibitively high query complexity. To push for further advances in this field, we introduce a general framework based on an operator splitting method, the alternating direction method of multipliers (ADMM), to devise efficient, robust black-box attacks that work with various distortion metrics and feedback settings without incurring high query complexity. Due to the black-box nature of the threat model, the proposed ADMM solution framework is integrated with zeroth-order (ZO) optimization and Bayesian optimization (BO), and is thus applicable to the gradient-free regime. This results in two new black-box adversarial attack generation methods, ZO-ADMM and BO-ADMM. Our empirical evaluations on image classification datasets show that the proposed approaches have much lower function query complexity than state-of-the-art attack methods while achieving very competitive attack success rates.
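
The abstract does not spell out how gradients are replaced in the gradient-free regime. As an illustration only, the sketch below shows one common zeroth-order technique, a random-direction finite-difference gradient estimate, which uses nothing but function queries. It is not taken from the paper: the function name `zo_gradient`, its parameters (`num_queries`, `mu`, `rng`), and the toy loss in the usage note are all hypothetical, and the sketch covers only the gradient estimate, not the ADMM splitting or the Bayesian optimization variant.

```python
import math
import random


def zo_gradient(f, x, num_queries=20, mu=1e-3, rng=None):
    """Estimate the gradient of f at x from function values only.

    Illustrative sketch (an assumption, not the paper's code): averages
    forward finite differences along random unit directions.
    f is a black-box function mapping a list of floats to a float.
    """
    rng = rng or random.Random(0)
    d = len(x)
    fx = f(x)  # one query at the unperturbed point
    grad = [0.0] * d
    for _ in range(num_queries):
        # Draw a direction uniformly from the unit sphere.
        u = [rng.gauss(0.0, 1.0) for _ in range(d)]
        norm = math.sqrt(sum(v * v for v in u))
        u = [v / norm for v in u]
        # One extra query at the perturbed point x + mu * u.
        x_pert = [xi + mu * ui for xi, ui in zip(x, u)]
        slope = (f(x_pert) - fx) / mu
        # The factor d makes the average approximate the true gradient.
        for i in range(d):
            grad[i] += d * slope * u[i] / num_queries
    return grad
```

Calling `zo_gradient(lambda x: sum(v * v for v in x), [1.0, -2.0, 0.5], num_queries=5000)` should return a vector close to the true gradient `[2.0, -4.0, 1.0]`. Each call costs `num_queries + 1` function evaluations, which is why the query count is the natural cost measure for attacks of this kind.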
