CL · LG · Sep 15, 2019

Natural Language Adversarial Defense through Synonym Encoding

arXiv:1909.06723v4 · 89 citations
Originality: Incremental advance
AI Analysis

This work addresses adversarial robustness in NLP for practitioners. It is an incremental improvement: a scalable defense that requires neither changes to the network architecture nor additional data.

The paper tackles the vulnerability of deep learning models in NLP to synonym substitution attacks. It proposes the Synonym Encoding Method (SEM), which inserts an encoder that maps each cluster of synonyms to a unique encoding. According to the paper's experiments, SEM defends effectively against these attacks and blocks the transferability of adversarial examples.

In natural language processing, deep learning models have recently been shown to be vulnerable to various types of adversarial perturbations, but relatively little work has been done on the defense side. In particular, there are few effective defense methods against the successful synonym substitution based attacks, which preserve the syntactic structure and semantic information of the original text while fooling the deep learning models. We contribute in this direction and propose a novel adversarial defense method called Synonym Encoding Method (SEM). Specifically, SEM inserts an encoder before the input layer of the target model to map each cluster of synonyms to a unique encoding, and trains the model to eliminate possible adversarial perturbations without modifying the network architecture or adding extra data. Extensive experiments demonstrate that SEM can effectively defend against current synonym substitution based attacks and block the transferability of adversarial examples. SEM is also easy and efficient to scale to large models and big datasets.
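
The core idea in the abstract, mapping every word in a synonym cluster to one shared encoding before the text reaches the model, can be illustrated with a small sketch. This is not the authors' implementation. The abstract does not say how clusters are formed, so the sketch assumes clusters are the connected components of a hand-written list of synonym pairs, and that each cluster is represented by its most frequent word. The names `build_synonym_encoder`, `encode`, `SYNONYM_PAIRS`, and `WORD_FREQ`, along with the toy data, are hypothetical.

```python
from collections import defaultdict


def build_synonym_encoder(synonym_pairs, word_freq):
    """Return a dict mapping each word to its cluster's representative.

    Assumption (not from the paper): clusters are the connected components
    of the synonym-pair graph, found here with a small union-find.
    """
    parent = {}

    def find(word):
        parent.setdefault(word, word)
        while parent[word] != word:
            parent[word] = parent[parent[word]]  # path halving
            word = parent[word]
        return word

    # Merge the clusters of every synonym pair.
    for a, b in synonym_pairs:
        root_a, root_b = find(a), find(b)
        if root_a != root_b:
            parent[root_a] = root_b

    # Group words by their cluster root.
    clusters = defaultdict(list)
    for word in list(parent):
        clusters[find(word)].append(word)

    # Assumption: the representative is the most frequent member,
    # with ties broken alphabetically so the result is deterministic.
    encoder = {}
    for members in clusters.values():
        rep = min(members, key=lambda w: (-word_freq.get(w, 0), w))
        for word in members:
            encoder[word] = rep
    return encoder


def encode(tokens, encoder):
    """Replace each token by its cluster representative (identity if unknown)."""
    return [encoder.get(tok, tok) for tok in tokens]


# Hypothetical toy data, for illustration only.
SYNONYM_PAIRS = [("good", "great"), ("great", "fine"), ("movie", "film")]
WORD_FREQ = {"good": 50, "great": 30, "fine": 10, "movie": 40, "film": 35}

encoder = build_synonym_encoder(SYNONYM_PAIRS, WORD_FREQ)
original = "a good movie".split()
attacked = "a fine film".split()  # synonym-substituted variant

print(encode(original, encoder))
print(encode(attacked, encoder))
```

In this toy setup the original and the synonym-substituted sentence are encoded to the same token sequence, so a model placed after the encoder sees identical input for both. That is the intuition behind the defense: substitutions within a cluster cannot change what the model receives.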

Code Implementations: 1 repo
