LG | AI | CL | CR | ML | May 31, 2018

Greedy Attack and Gumbel Attack: Generating Adversarial Examples for Discrete Data

arXiv:1805.12316v1 | 122 citations
Originality: Incremental advance
AI Analysis

This paper addresses adversarial robustness in text classification models. The contribution appears incremental because it builds on existing attack frameworks for discrete data.

The authors tackle the problem of generating adversarial examples for discrete data and introduce two methods, Greedy Attack and Gumbel Attack. With Greedy Attack, modifying only five characters reduces the accuracy of character-based convolutional networks to the level of random selection.

We present a probabilistic framework for studying adversarial attacks on discrete data. Based on this framework, we derive a perturbation-based method, Greedy Attack, and a scalable learning-based method, Gumbel Attack, that illustrate various tradeoffs in the design of attacks. We demonstrate the effectiveness of these methods using both quantitative metrics and human evaluation on various state-of-the-art models for text classification, including a word-based CNN, a character-based CNN and an LSTM. As an example of our results, we show that the accuracy of character-based convolutional networks drops to the level of random selection by modifying only five characters through Greedy Attack.
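To make the idea of a perturbation-based attack on discrete input concrete, here is a minimal sketch of a greedy character-substitution attack against a black-box classifier. It is not the authors' implementation: the two-stage structure (first rank positions by how much masking them lowers the true-class probability, then greedily choose a replacement at each selected position) is an assumption about how such an attack could be organized, and every name in it (`greedy_attack`, `toy_predict_proba`, `alphabet`, `pad`, `k`) is hypothetical. The toy classifier only stands in for a real text model.

```python
from typing import Callable, List


def toy_predict_proba(text: str) -> List[float]:
    """Hypothetical stand-in for a text classifier.

    Returns [P(class 0), P(class 1)], where class 1 becomes more likely
    with more 'g' characters and less likely with more 'b' characters.
    """
    g, b = text.count("g"), text.count("b")
    p1 = (1 + g) / (2 + g + b)
    return [1.0 - p1, p1]


def greedy_attack(
    text: str,
    true_label: int,
    predict_proba: Callable[[str], List[float]],
    alphabet: str,
    k: int,
    pad: str = "_",
) -> str:
    """Change at most k characters to lower the true-class probability."""
    chars = list(text)
    base = predict_proba(text)[true_label]

    # Stage 1 (assumed): score each position by the drop in true-class
    # probability when that position is replaced with a padding symbol.
    def masked_drop(i: int) -> float:
        trial = chars.copy()
        trial[i] = pad
        return base - predict_proba("".join(trial))[true_label]

    positions = sorted(range(len(chars)), key=masked_drop, reverse=True)[:k]

    # Stage 2 (assumed): at each selected position, keep the replacement
    # character that lowers the true-class probability the most.
    for i in positions:
        best_char = chars[i]
        best_p = predict_proba("".join(chars))[true_label]
        for c in alphabet:
            if c == chars[i]:
                continue
            trial = chars.copy()
            trial[i] = c
            p = predict_proba("".join(trial))[true_label]
            if p < best_p:
                best_char, best_p = c, p
        chars[i] = best_char

    return "".join(chars)


original = "ggxgg"
adversarial = greedy_attack(original, 1, toy_predict_proba, "gbx", k=2)
print(original, toy_predict_proba(original)[1])
print(adversarial, toy_predict_proba(adversarial)[1])
```

The sketch queries the model once per position in the first stage and once per candidate character in the second, so its cost grows with input length and alphabet size. The abstract describes Gumbel Attack as the scalable, learning-based alternative.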

