CLAI
May 7, 2022

A Simple Yet Efficient Method for Adversarial Word-Substitute Attack

arXiv:2206.05015v1
h-index: 6
Originality: Incremental advance
AI Analysis

This work addresses the query cost of adversarial attacks in NLP. It is an incremental improvement over existing methods.

The paper tackles the problem of reducing the number of queries needed for adversarial word-substitute attacks on text classification models, achieving a 3- to 30-fold reduction in the average number of adversarial queries while maintaining attack effectiveness.

NLP researchers have proposed various word-substitute black-box attacks that can fool text classification models. In such an attack, an adversary keeps sending crafted adversarial queries to the target model until it achieves the intended outcome. State-of-the-art attack methods usually require hundreds or thousands of queries to find one adversarial example. In this paper, we study whether a sophisticated adversary can attack the system with far fewer queries. We propose a simple yet efficient method that reduces the average number of adversarial queries by a factor of 3 to 30 while maintaining attack effectiveness. This research highlights that an adversary can fool a deep NLP model at much lower cost.
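
To make the query-cost framing concrete, here is a minimal sketch of a generic greedy word-substitute black-box attack that counts how many times it queries the target model. It is not the method proposed in the paper, whose details are not given here. The function names, the `substitutes` table, and the toy classifier are all hypothetical and exist only for illustration.

```python
from typing import Callable, Dict, List, Tuple


def greedy_word_substitute_attack(
    words: List[str],
    true_label: int,
    predict_proba: Callable[[List[str]], List[float]],  # black-box target (assumed interface)
    substitutes: Dict[str, List[str]],                  # hypothetical synonym table
    max_queries: int = 1000,
) -> Tuple[List[str], int, bool]:
    """Return (final_text, queries_used, success)."""
    queries = 0
    current = list(words)

    def query(candidate: List[str]) -> List[float]:
        # Every call to the target model counts as one adversarial query.
        nonlocal queries
        queries += 1
        return predict_proba(candidate)

    def argmax(probs: List[float]) -> int:
        return max(range(len(probs)), key=probs.__getitem__)

    probs = query(current)
    if argmax(probs) != true_label:
        return current, queries, True  # already misclassified
    best_score = probs[true_label]

    for i in range(len(current)):
        original_word = current[i]
        for sub in substitutes.get(original_word, []):
            if queries >= max_queries:
                return current, queries, False  # query budget exhausted
            candidate = current[:i] + [sub] + current[i + 1:]
            probs = query(candidate)
            if argmax(probs) != true_label:
                return candidate, queries, True  # label flipped
            if probs[true_label] < best_score:
                # Keep the substitution that lowers confidence in the true label.
                best_score = probs[true_label]
                current = candidate
    return current, queries, False


def toy_predict_proba(words: List[str]) -> List[float]:
    # Toy sentiment "model": more positive words means higher P(positive).
    positive = {"good", "great"}
    score = sum(w in positive for w in words)
    p_pos = min(0.9, 0.2 + 0.4 * score)
    return [1 - p_pos, p_pos]


adv, used, ok = greedy_word_substitute_attack(
    ["a", "good", "great", "movie"],
    true_label=1,
    predict_proba=toy_predict_proba,
    substitutes={"good": ["fine", "decent"], "great": ["solid"]},
)
print(adv, used, ok)
```

Even in this toy setting, each candidate substitution costs one query, so the total grows with text length and with the number of candidates per word. That growth is the cost the paper aims to reduce.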

