CL · CR · LG · Mar 1, 2021

Token-Modification Adversarial Attacks for Natural Language Processing: A Survey

arXiv:2103.00676v3 · 22 citations
Originality: Synthesis-oriented
AI Analysis

This survey gives researchers and practitioners a systematic way to understand and compare adversarial attacks in NLP. Its contribution is incremental: it organizes existing knowledge rather than introducing new methods.

The paper categorizes token-modification adversarial attacks in NLP with a framework that breaks each attack into four components: a goal function, allowable transformations, a search method, and constraints. It aims to provide a comprehensive guide for newcomers and to inspire targeted research on the individual components.

Many adversarial attacks target natural language processing systems, most of which succeed through modifying the individual tokens of a document. Despite the apparent uniqueness of each of these attacks, fundamentally they are simply a distinct configuration of four components: a goal function, allowable transformations, a search method, and constraints. In this survey, we systematically present the different components used throughout the literature, using an attack-independent framework which allows for easy comparison and categorisation of components. Our work aims to serve as a comprehensive guide for newcomers to the field and to spark targeted research into refining the individual attack components.
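
To make the four-component view concrete, here is a minimal sketch of a token-modification attack assembled from a goal function, allowable transformations, a constraint, and a search method. Nothing in it comes from the survey itself: the keyword-counting `toy_model`, the `SUBSTITUTES` table, the `max_changed` budget, and the choice of breadth-first search are all hypothetical stand-ins chosen for illustration.

```python
from collections import deque
from typing import Callable, Iterator, List, Optional

Tokens = List[str]

# Hypothetical victim model: a keyword counter, not a real NLP classifier.
POSITIVE = {"good", "great", "excellent"}
NEGATIVE = {"bad", "awful", "terrible"}


def toy_model(tokens: Tokens) -> int:
    """Return 1 (positive) if positive keywords outnumber negative ones, else 0."""
    score = sum((t in POSITIVE) - (t in NEGATIVE) for t in tokens)
    return 1 if score > 0 else 0


# Component 1: goal function (untargeted: any change of predicted label).
def goal_reached(model: Callable[[Tokens], int], candidate: Tokens, original_label: int) -> bool:
    return model(candidate) != original_label


# Component 2: allowable transformations (swap one token for a listed substitute).
# The substitution table is an illustrative assumption.
SUBSTITUTES = {"good": ["fine", "decent"], "great": ["notable"]}


def transformations(tokens: Tokens) -> Iterator[Tokens]:
    for i, tok in enumerate(tokens):
        for sub in SUBSTITUTES.get(tok, []):
            yield tokens[:i] + [sub] + tokens[i + 1:]


# Component 3: constraint (cap the number of modified tokens).
def satisfies_constraints(original: Tokens, candidate: Tokens, max_changed: int) -> bool:
    changed = sum(a != b for a, b in zip(original, candidate))
    return changed <= max_changed


# Component 4: search method (breadth-first over chains of single-token swaps).
def attack(model: Callable[[Tokens], int], tokens: Tokens, max_changed: int = 2) -> Optional[Tokens]:
    original_label = model(tokens)
    queue = deque([tokens])
    seen = {tuple(tokens)}
    while queue:
        current = queue.popleft()
        for candidate in transformations(current):
            key = tuple(candidate)
            if key in seen or not satisfies_constraints(tokens, candidate, max_changed):
                continue
            seen.add(key)
            if goal_reached(model, candidate, original_label):
                return candidate  # first adversarial example found
            queue.append(candidate)
    return None  # no adversarial example within the constraint budget


if __name__ == "__main__":
    print(attack(toy_model, "a good and great film".split()))
```

Because each component is a separate function, swapping one of them (for example, replacing breadth-first search with a greedy search, or the change budget with a semantic-similarity check) yields a different attack without touching the other three, which is the kind of comparison the framework is meant to support.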

