LGMLFeb 22, 2020

Robustness to Programmable String Transformations via Augmented Abstract Training

arXiv:2002.09579v417 citations
AI Analysis

This addresses the problem of adversarial robustness in NLP for practitioners, though it is incremental as it builds on existing adversarial training methods.

The paper tackles the vulnerability of deep neural networks in NLP to adversarial input perturbations by introducing a language for specifying string transformations and an adversarial training approach combining search and abstraction. The resulting models trained on AG and SST2 datasets show robustness to user-defined transformations mimicking spelling mistakes and meaning-preserving changes.

Deep neural networks for natural language processing tasks are vulnerable to adversarial input perturbations. In this paper, we present a versatile language for programmatically specifying string transformations -- e.g., insertions, deletions, substitutions, swaps, etc. -- that are relevant to the task at hand. We then present an approach to adversarially training models that are robust to such user-defined string transformations. Our approach combines the advantages of search-based techniques for adversarial training with abstraction-based techniques. Specifically, we show how to decompose a set of user-defined string transformations into two component specifications, one that benefits from search and another from abstraction. We use our technique to train models on the AG and SST2 datasets and show that the resulting models are robust to combinations of user-defined transformations mimicking spelling mistakes and other meaning-preserving transformations.

Code Implementations1 repo
Foundations

The foundational work for this paper's niche, ranked by how specifically the neighbourhood builds on it — not by global fame.

Your Notes