LGMar 3, 2018

Seq2Sick: Evaluating the Robustness of Sequence-to-Sequence Models with Adversarial Examples

arXiv:1803.01128v3268 citations
Originality Incremental advance
AI Analysis

This addresses the robustness evaluation of seq2seq models for NLP applications, though it is incremental as it adapts adversarial attack techniques from image classification to text domains.

The paper tackled the problem of crafting adversarial examples for sequence-to-sequence models, which are challenging due to discrete inputs and vast output spaces, and achieved high success rates by changing fewer than 3 words to produce desired outputs in tasks like machine translation and text summarization.

Crafting adversarial examples has become an important technique to evaluate the robustness of deep neural networks (DNNs). However, most existing works focus on attacking the image classification problem since its input space is continuous and output space is finite. In this paper, we study the much more challenging problem of crafting adversarial examples for sequence-to-sequence (seq2seq) models, whose inputs are discrete text strings and outputs have an almost infinite number of possibilities. To address the challenges caused by the discrete input space, we propose a projected gradient method combined with group lasso and gradient regularization. To handle the almost infinite output space, we design some novel loss functions to conduct non-overlapping attack and targeted keyword attack. We apply our algorithm to machine translation and text summarization tasks, and verify the effectiveness of the proposed algorithm: by changing less than 3 words, we can make seq2seq model to produce desired outputs with high success rates. On the other hand, we recognize that, compared with the well-evaluated CNN-based classifiers, seq2seq models are intrinsically more robust to adversarial attacks.

Code Implementations1 repo
Foundations

The foundational work for this paper's niche, ranked by how specifically the neighbourhood builds on it — not by global fame.

Your Notes