CL, Oct 6, 2022

Are Synonym Substitution Attacks Really Synonym Substitution Attacks?

arXiv:2210.02844v3225 citations
h-index: 10
Originality: Incremental advance
AI Analysis

This work exposes a critical flaw in adversarial attack methods for NLP: widely used SSAs often fail to generate valid adversarial samples. The finding matters to researchers and practitioners in AI security.

The paper investigates whether synonym substitution attacks (SSAs) truly replace words with synonyms, finding that four common methods produce many invalid substitutions that are ungrammatical or alter semantics, and that current constraints for detecting these issues are highly insufficient.

In this paper, we explore the following question: Are synonym substitution attacks really synonym substitution attacks (SSAs)? We approach this question by examining how SSAs replace words in the original sentence and show that there are still unresolved obstacles that make current SSAs generate invalid adversarial samples. We reveal that four widely used word substitution methods generate a large fraction of invalid substitution words that are ungrammatical or do not preserve the original sentence's semantics. Next, we show that the semantic and grammatical constraints used in SSAs for detecting invalid word replacements are highly insufficient in detecting invalid adversarial samples.
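The failure mode described above can be sketched with a toy example. This is a minimal illustration only, not the paper's method: the candidate table `CANDIDATES`, the tag table `POS`, and the helpers `same_pos` and `substitute` are all hypothetical names with hand-written data. It assumes a substitution method that proposes related words (which may include antonyms) and a grammatical constraint that only compares part-of-speech tags.

```python
# Toy illustration only: the candidate table and POS tags are hand-written
# assumptions, not data, methods, or results from the paper.

# Hypothetical substitution candidates for one word. A method that proposes
# "related" words can return an antonym ("bad") next to a near-synonym.
CANDIDATES = {
    "good": ["great", "bad", "well"],
}

# Hypothetical coarse part-of-speech tags for the toy vocabulary.
POS = {"good": "ADJ", "great": "ADJ", "bad": "ADJ", "well": "ADV"}


def same_pos(original: str, candidate: str) -> bool:
    """Stand-in grammatical constraint: accept only candidates with the same POS tag."""
    return POS.get(original) == POS.get(candidate)


def substitute(sentence: str, target: str) -> list[str]:
    """Replace `target` with each candidate that passes the POS constraint."""
    tokens = sentence.split()
    results = []
    for cand in CANDIDATES.get(target, []):
        if not same_pos(target, cand):
            continue  # rejected by the grammatical constraint
        results.append(" ".join(cand if tok == target else tok for tok in tokens))
    return results


adversarial = substitute("the movie was good", "good")
print(adversarial)  # ['the movie was great', 'the movie was bad']
```

In this sketch the constraint correctly rejects "well" but accepts "bad", which keeps the sentence grammatical while reversing its meaning. That is the kind of invalid substitution a purely grammatical check cannot catch.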

