CL LG | Sep 15, 2019

Natural Language Adversarial Defense through Synonym Encoding

Xiaosen Wang, Hao Jin, Yichen Yang, Kun He

arXiv:1909.06723v47.790 citations | Has Code

Originality: Incremental advance

AI Analysis

This paper addresses adversarial robustness in NLP for practitioners. It offers an incremental improvement: a scalable defense method that requires neither modifying the network architecture nor adding data.

The paper tackles the vulnerability of NLP deep learning models to synonym substitution attacks by proposing the Synonym Encoding Method (SEM), which inserts an encoder that maps each synonym cluster to a unique encoding. According to the paper's experiments, SEM defends effectively against these attacks and blocks the transferability of adversarial examples.

In natural language processing, deep learning models have recently been shown to be vulnerable to various types of adversarial perturbations, but relatively little work has been done on the defense side. In particular, there are few effective defense methods against the successful synonym-substitution-based attacks, which preserve the syntactic structure and semantic information of the original text while fooling the deep learning models. We contribute in this direction and propose a novel adversarial defense method called the Synonym Encoding Method (SEM). Specifically, SEM inserts an encoder before the input layer of the target model to map each cluster of synonyms to a unique encoding, and trains the model to eliminate possible adversarial perturbations without modifying the network architecture or adding extra data. Extensive experiments demonstrate that SEM can effectively defend against current synonym-substitution-based attacks and block the transferability of adversarial examples. SEM is also easy and efficient to scale to large models and big datasets.
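The encoding step described in the abstract can be illustrated with a small sketch. This is not the authors' implementation: the synonym table, the word frequencies, and the function names `build_synonym_encoding` and `encode` are all hypothetical, and the rule of letting the most frequent word represent its cluster is an assumption made for this example. The abstract does not say how synonym clusters are found, so the table here is written by hand.

```python
def build_synonym_encoding(word_freq, synonyms):
    """Map every word to one representative of its synonym cluster.

    word_freq: dict word -> corpus frequency (toy data, assumed).
    synonyms:  dict word -> list of synonyms (hand-written, assumed).
    """
    encoding = {}
    # Visit frequent words first so they become cluster representatives.
    for word in sorted(word_freq, key=lambda w: (-word_freq[w], w)):
        if word not in encoding:
            # Reuse the code of an already-encoded synonym if there is one.
            encoded = [s for s in synonyms.get(word, []) if s in encoding]
            encoding[word] = encoding[encoded[0]] if encoded else word
        # Propagate this word's code to synonyms that have none yet.
        for s in synonyms.get(word, []):
            encoding.setdefault(s, encoding[word])
    return encoding


def encode(tokens, encoding):
    """Replace each token by its cluster code; unknown tokens pass through."""
    return [encoding.get(t, t) for t in tokens]


# Hypothetical toy vocabulary and synonym table.
WORD_FREQ = {"good": 50, "great": 30, "fine": 10, "movie": 40, "film": 35}
SYNONYMS = {
    "good": ["great", "fine"],
    "great": ["good"],
    "fine": ["good"],
    "movie": ["film"],
    "film": ["movie"],
}

ENCODING = build_synonym_encoding(WORD_FREQ, SYNONYMS)
original = encode("a great film".split(), ENCODING)
attacked = encode("a fine movie".split(), ENCODING)  # synonym-substituted input
print(original, attacked)
```

In this toy setup the original sentence and its synonym-substituted variant are mapped to the same token sequence before they reach the model, which is the intuition behind placing the encoder in front of the input layer. The downstream model would then be trained on the encoded inputs.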
