CL
Nov 12, 2022

Generating Textual Adversaries with Minimal Perturbation

arXiv:2211.06571v1291 citations
h-index: 19
Originality Highly original
AI Analysis

The paper addresses the challenge of creating effective yet semantically faithful adversarial examples for text classification models, offering an incremental improvement over existing attack methods.

The paper tackles the problem of preserving semantic meaning when generating word-level adversarial attacks on text. It develops a novel strategy that finds adversarial texts with minimal perturbation. Experiments show that the approach achieves higher success rates and lower perturbation rates than state-of-the-art methods across four benchmark datasets.

Many word-level adversarial attack approaches for textual data have been proposed in recent studies. However, due to the massive search space consisting of combinations of candidate words, existing approaches struggle to preserve the semantics of texts when crafting adversarial counterparts. In this paper, we develop a novel attack strategy to find adversarial texts with high similarity to the original texts while introducing minimal perturbation. The rationale is that adversarial texts with small perturbations are expected to better preserve the semantic meaning of the original texts. Experiments show that, compared with state-of-the-art attack approaches, our approach achieves higher success rates and lower perturbation rates on four benchmark datasets.
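The two evaluation metrics named in the abstract can be sketched as follows. The definitions are assumptions based on common usage in word-level attack evaluations, since the abstract does not spell them out: perturbation rate is taken as the fraction of word positions that differ between the original and adversarial text, and success rate as the fraction of attacked examples whose prediction flipped. The function names are hypothetical.

```python
from typing import Sequence


def perturbation_rate(original: Sequence[str], adversarial: Sequence[str]) -> float:
    """Fraction of word positions that were substituted (assumed definition)."""
    if len(original) != len(adversarial):
        # Word-level substitution is assumed to keep the token count unchanged.
        raise ValueError("original and adversarial texts must have equal length")
    if not original:
        return 0.0
    changed = sum(o != a for o, a in zip(original, adversarial))
    return changed / len(original)


def attack_success_rate(flipped: Sequence[bool]) -> float:
    """Fraction of attacked examples whose prediction flipped (assumed definition)."""
    if not flipped:
        return 0.0
    return sum(flipped) / len(flipped)


rate = perturbation_rate(
    "the movie was great".split(), "the film was great".split()
)
print(rate)  # 0.25
print(attack_success_rate([True, False, True, True]))  # 0.75
```

Under these definitions, a stronger attack is one that raises the success rate while lowering the perturbation rate, which is the comparison the abstract reports.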

Code Implementations: 1 repo