Fraud's Bargain Attack: Generating Adversarial Text Samples via Word Manipulation Process
This work addresses the vulnerability of NLP models to adversarial attacks and offers a more effective method for security testing. The contribution is incremental, however, because it builds on existing techniques for generating adversarial examples.
The paper tackles the generation of adversarial text examples for NLP models by proposing the Fraud's Bargain Attack (FBA), which combines a randomization mechanism with a Metropolis-Hastings sampler to achieve higher attack success rates, better imperceptibility, and better sentence quality than existing methods.
Recent research has revealed that natural language processing (NLP) models are vulnerable to adversarial examples. However, current techniques for generating such examples rely on deterministic heuristic rules, which fail to produce optimal adversarial examples. In response, this study proposes a new method called the Fraud's Bargain Attack (FBA), which uses a randomization mechanism to expand the search space and produce high-quality adversarial examples with a higher probability of success. FBA uses the Metropolis-Hastings sampler, a type of Markov Chain Monte Carlo (MCMC) sampler, to improve the selection of adversarial examples from the candidates generated by a customized stochastic process called the Word Manipulation Process (WMP). The WMP modifies individual words in a contextually aware manner through insertion, removal, or substitution. Through extensive experiments, this study demonstrates that FBA outperforms other methods in terms of attack success rate, imperceptibility, and sentence quality.
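To make the sampling idea concrete, the following is a minimal toy sketch of a Metropolis-Hastings step over single-word edits (insertion, removal, substitution). It is not the authors' implementation: the vocabulary `VOCAB`, the uniform proposal in `moves`, and the scoring function `target` are all hypothetical stand-ins chosen for illustration, and the proposal here is not contextually aware as the WMP is described to be. The sketch only shows how a candidate edit is accepted with probability min(1, π(y)q(y→x) / (π(x)q(x→y))).

```python
import math
import random

VOCAB = ["good", "bad", "fine", "movie", "very"]  # toy vocabulary (assumption)


def moves(sentence):
    """Enumerate every single-word edit of a non-empty word tuple with its proposal probability."""
    n, v = len(sentence), len(VOCAB)
    out = []
    # Insertion: pick one of n + 1 gaps and one vocabulary word.
    for i in range(n + 1):
        for w in VOCAB:
            out.append((sentence[:i] + (w,) + sentence[i:], 1 / (3 * (n + 1) * v)))
    # Removal: pick one position; a one-word sentence is left unchanged.
    for i in range(n):
        y = sentence[:i] + sentence[i + 1:] if n > 1 else sentence
        out.append((y, 1 / (3 * n)))
    # Substitution: pick one position and one vocabulary word.
    for i in range(n):
        for w in VOCAB:
            out.append((sentence[:i] + (w,) + sentence[i + 1:], 1 / (3 * n * v)))
    return out


def proposal_prob(x, y):
    """q(x -> y): total probability of all single edits of x that yield y."""
    return sum(p for cand, p in moves(x) if cand == y)


def target(sentence, original):
    """Hypothetical unnormalized target: favors fewer 'positive' words and fewer edits."""
    positive = sum(w in ("good", "fine") for w in sentence)
    edits = abs(len(sentence) - len(original)) + sum(
        a != b for a, b in zip(sentence, original)
    )
    return math.exp(-2.0 * positive - 0.5 * edits)


def mh_step(x, original, rng):
    """Propose one edit and accept it with the Metropolis-Hastings probability."""
    cands = moves(x)
    y = rng.choices([c for c, _ in cands], weights=[p for _, p in cands])[0]
    ratio = (target(y, original) * proposal_prob(y, x)) / (
        target(x, original) * proposal_prob(x, y)
    )
    accept = min(1.0, ratio)
    return (y if rng.random() < accept else x), accept


def run_chain(original, steps=200, seed=0):
    """Run the chain from the original sentence and return the final state."""
    rng = random.Random(seed)
    original = tuple(original)
    x = original
    for _ in range(steps):
        x, _ = mh_step(x, original, rng)
    return x
```

Calling `run_chain(("very", "good", "movie"))` returns one sampled word tuple. Because insertion and removal are not mirror images of each other, the sketch computes both proposal probabilities explicitly rather than assuming a symmetric proposal. In a realistic setting the toy `target` would be replaced by a score derived from the victim model and sentence-quality measures.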