CR, LG · Feb 1, 2020

Towards Sharper First-Order Adversary with Quantized Gradients

arXiv:2002.02372v1 · 4 citations
AI Analysis

This work addresses the need for more effective adversarial attacks in machine learning security, offering incremental improvements over current first-order methods.

The paper improves first-order adversarial attacks by replacing sign gradients with quantized gradients, which preserve both the sign and the relative magnitude of each gradient component. The resulting white-box attacks outperform their sign-gradient counterparts on multiple datasets; for example, the BLOB_QG attack reduces the accuracy of the secret MNIST model from the MNIST Challenge to 88.32%.

Despite the huge success of Deep Neural Networks (DNNs) in a wide spectrum of machine learning and data mining tasks, recent research shows that this powerful tool is susceptible to maliciously crafted adversarial examples. Up until now, adversarial training has been the most successful defense against adversarial attacks. To increase adversarial robustness, a DNN can be trained with a combination of benign and adversarial examples generated by first-order methods. However, in state-of-the-art first-order attacks, adversarial examples with sign gradients retain the sign information of each gradient component but discard the relative magnitude between components. In this work, we replace sign gradients with quantized gradients. Gradient quantization not only preserves the sign information, but also keeps the relative magnitude between components. Experiments show white-box first-order attacks with quantized gradients outperform their variants with sign gradients on multiple datasets. Notably, our BLOB\_QG attack achieves an accuracy of $88.32\%$ on the secret MNIST model from the MNIST Challenge and it outperforms all other methods on the leaderboard of white-box attacks.
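To make the contrast in the abstract concrete, the following is a minimal sketch of the difference between a sign-gradient direction and a quantized one. The abstract does not specify the quantization scheme, so the uniform quantizer below (the function `quantized_step` and its `levels` parameter) is a hypothetical stand-in chosen for illustration, not the authors' method.

```python
import numpy as np


def sign_step(grad):
    """Sign-gradient direction: every nonzero component becomes +1 or -1."""
    return np.sign(grad)


def quantized_step(grad, levels=4):
    """Hypothetical uniform quantizer (an assumption, not the paper's scheme).

    Magnitudes are scaled by the largest absolute component, rounded up to one
    of `levels` evenly spaced values in (0, 1], and given back their sign.
    """
    max_abs = np.max(np.abs(grad))
    if max_abs == 0:
        return np.zeros_like(grad)
    scaled = np.abs(grad) / max_abs              # each entry lies in [0, 1]
    buckets = np.ceil(scaled * levels) / levels  # 0, 1/levels, ..., 1
    return np.sign(grad) * buckets


# Toy gradient with components of very different magnitude.
grad = np.array([0.9, -0.05, 0.4, 0.0])

print(sign_step(grad))       # the two large and the small component look alike
print(quantized_step(grad))  # small components receive a smaller step
```

With the sign direction, the components 0.9 and -0.05 both receive a full-size step, while the quantized direction keeps the sign and assigns the smaller component a proportionally smaller bucket.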
