CL · CR · CV · LG · Mar 27, 2019

Text Processing Like Humans Do: Visually Attacking and Shielding NLP Systems

arXiv:1903.11508v21141 citations
Originality: Incremental advance
AI Analysis

This paper addresses a vulnerability of NLP systems that matters for applications such as social media moderation, but the advance is incremental because it builds on existing adversarial attack research.

The paper tackles visual adversarial attacks on NLP systems, such as obfuscated text, and shows that these attacks decrease model performance by up to 82%, while humans remain robust. It explores shielding methods such as visual character embeddings and adversarial training, which improve robustness but still leave the models short of their performance in non-attack scenarios.

Visual modifications to text are often used to obfuscate offensive comments in social media (e.g., "!d10t") or as a writing style ("1337" in "leet speak"), among other scenarios. We consider this as a new type of adversarial attack in NLP, a setting to which humans are very robust, as our experiments with both simple and more difficult visual input perturbations demonstrate. We then investigate the impact of visual adversarial attacks on current NLP systems on character-, word-, and sentence-level tasks, showing that both neural and non-neural models are, in contrast to humans, extremely sensitive to such attacks, suffering performance decreases of up to 82%. We then explore three shielding methods (visual character embeddings, adversarial training, and rule-based recovery), which substantially improve the robustness of the models. However, the shielding methods still fall behind performances achieved in non-attack scenarios, which demonstrates the difficulty of dealing with visual attacks.
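To make the idea of a visual perturbation and its rule-based recovery concrete, here is a minimal Python sketch. It is not the authors' implementation: the look-alike table, the function names `perturb` and `recover`, and the per-character probability `p` are all assumptions chosen for illustration, and the substitution is a simple leet-style swap rather than the attack evaluated in the paper.

```python
import random

# Hypothetical, hand-picked look-alike table (an assumption, not from the paper).
LOOKALIKES = {"a": "4", "e": "3", "i": "1", "o": "0", "s": "5", "t": "7"}
# Inverse table used by the rule-based recovery step.
RECOVERY = {sub: letter for letter, sub in LOOKALIKES.items()}


def perturb(text, p, seed=0):
    """Swap each mappable character for its look-alike with probability p."""
    rng = random.Random(seed)  # seeded so the perturbation is reproducible
    out = []
    for ch in text:
        sub = LOOKALIKES.get(ch.lower())
        if sub is not None and rng.random() < p:
            out.append(sub)
        else:
            out.append(ch)
    return "".join(out)


def recover(text):
    """Rule-based recovery: map every known look-alike back to its letter."""
    return "".join(RECOVERY.get(ch, ch) for ch in text)


attacked = perturb("idiot", p=1.0)
print(attacked)           # every mappable character replaced
print(recover(attacked))  # recovered text
```

A fixed table like this only undoes substitutions it already knows about, and it also rewrites digits that were genuinely meant as digits, which hints at why simple recovery rules do not fully restore non-attack performance.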

Code Implementations: 1 repo
