CL AI Nov 12, 2024

Chain Association-based Attacking and Shielding Natural Language Processing Systems

arXiv:2411.07843v1 h-index: 1 ACML
Originality: Incremental advance
AI Analysis

The paper addresses security risks in NLP systems that matter to both developers and users, but it is an incremental advance because it builds on existing adversarial attack paradigms.

The paper tackles the vulnerability of natural language processing systems, including large language models, to adversarial attacks based on chain associations, showing that these models are susceptible while humans remain robust, with experiments demonstrating attack success rates of up to 90% on certain tasks. It also explores shielding methods like adversarial training and associative graph-based recovery to mitigate such attacks.

Association is a gift: it spares people from having to mention something in completely straightforward words, while still allowing others to understand what they intend to refer to. In this paper, we propose a chain association-based adversarial attack against natural language processing systems that exploits the comprehension gap between humans and machines. We first generate a chain association graph for Chinese characters, based on the association paradigm, to build the search space of potential adversarial examples. Then, we introduce a discrete particle swarm optimization algorithm to search for the optimal adversarial examples. We conduct comprehensive experiments and show that advanced natural language processing models and applications, including large language models, are vulnerable to our attack, while humans appear to be good at understanding the perturbed text. We also explore two methods, adversarial training and associative graph-based recovery, to shield systems from chain association-based attacks. Because a few examples use derogatory terms, this paper contains material that may be offensive or upsetting to some people.
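
The two steps described in the abstract (building a search space from a chain association graph, then searching it with discrete particle swarm optimization) can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the toy graph `ASSOC` uses English placeholder tokens rather than Chinese characters, `victim_score` is a stand-in blocklist classifier rather than a real NLP model, and the update rule is one common simplified discrete PSO variant, not necessarily the one used in the paper.

```python
import random

# Hypothetical toy association graph: token -> directly associated tokens.
ASSOC = {
    "spam": ["junk mail"],
    "junk mail": ["mailbox clutter"],
    "scam": ["con"],
    "con": ["trick"],
}

# Hypothetical victim: flags any token found in this blocklist.
BLOCKLIST = {"spam", "scam", "con"}


def chain_candidates(token, max_hops=2):
    """Return the token plus everything reachable within max_hops association steps."""
    seen, frontier = [token], [token]
    for _ in range(max_hops):
        next_frontier = []
        for t in frontier:
            for assoc in ASSOC.get(t, []):
                if assoc not in seen:
                    seen.append(assoc)
                    next_frontier.append(assoc)
        frontier = next_frontier
    return seen  # index 0 is always the original token


def victim_score(tokens):
    """Stand-in victim model: fraction of tokens it flags (lower favors the attacker)."""
    return sum(t in BLOCKLIST for t in tokens) / len(tokens)


def decode(position, space):
    """Map a vector of candidate indices back to a token sequence."""
    return [cands[i] for i, cands in zip(position, space)]


def discrete_pso(tokens, n_particles=8, n_iters=20, seed=0):
    """Search the candidate space for a substitution that lowers the victim score."""
    rng = random.Random(seed)
    space = [chain_candidates(t) for t in tokens]

    def fitness(pos):
        # Primary goal: low victim score. Tie-break: fewer substituted tokens.
        return (victim_score(decode(pos, space)), sum(i != 0 for i in pos))

    # One particle starts at the unmodified text; the rest start at random positions.
    particles = [[0] * len(space)]
    particles += [[rng.randrange(len(c)) for c in space] for _ in range(n_particles - 1)]
    pbest = [p[:] for p in particles]
    gbest = min(pbest, key=fitness)[:]

    for _ in range(n_iters):
        for k, p in enumerate(particles):
            for d, cands in enumerate(space):
                r = rng.random()
                if r < 0.4:        # move toward the particle's own best
                    p[d] = pbest[k][d]
                elif r < 0.8:      # move toward the swarm's best
                    p[d] = gbest[d]
                elif r < 0.9:      # random exploration
                    p[d] = rng.randrange(len(cands))
                # otherwise keep the current choice
            if fitness(p) < fitness(pbest[k]):
                pbest[k] = p[:]
            if fitness(pbest[k]) < fitness(gbest):
                gbest = pbest[k][:]
    return decode(gbest, space)
```

Calling `discrete_pso(["this", "is", "a", "scam"])` returns a token list of the same length in which flagged tokens may have been replaced by their chain associates. Because one particle starts at the unmodified text, the returned sequence never scores worse against the stand-in victim than the original does.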

