CLAINov 29, 2023

TARGET: Template-Transferable Backdoor Attack Against Prompt-based NLP Models via GPT4

arXiv:2311.17429v16 citationsh-index: 3
Originality Incremental advance
AI Analysis

This addresses security risks in low-resource NLP applications, but it is an incremental improvement focusing on transferability and stealthiness in backdoor attacks.

The paper tackles the vulnerability of prompt-based NLP models to backdoor attacks by proposing TARGET, a method that uses GPT4 to generate transferable templates as triggers, achieving better attack performance and stealthiness compared to baselines on five datasets and three BERT models.

Prompt-based learning has been widely applied in many low-resource NLP tasks such as few-shot scenarios. However, this paradigm has been shown to be vulnerable to backdoor attacks. Most of the existing attack methods focus on inserting manually predefined templates as triggers in the pre-training phase to train the victim model and utilize the same triggers in the downstream task to perform inference, which tends to ignore the transferability and stealthiness of the templates. In this work, we propose a novel approach of TARGET (Template-trAnsfeRable backdoor attack aGainst prompt-basEd NLP models via GPT4), which is a data-independent attack method. Specifically, we first utilize GPT4 to reformulate manual templates to generate tone-strong and normal templates, and the former are injected into the model as a backdoor trigger in the pre-training phase. Then, we not only directly employ the above templates in the downstream task, but also use GPT4 to generate templates with similar tone to the above templates to carry out transferable attacks. Finally we have conducted extensive experiments on five NLP datasets and three BERT series models, with experimental results justifying that our TARGET method has better attack performance and stealthiness compared to the two-external baseline methods on direct attacks, and in addition achieves satisfactory attack capability in the unseen tone-similar templates.

Foundations

The foundational work for this paper's niche, ranked by how specifically the neighbourhood builds on it — not by global fame.

Your Notes