CR, CV, May 29

BadBone: Backdoor Attacks Against Backbone Models in Visual Prompt Learning

arXiv:2605.31246
AI Analysis

This work identifies a security vulnerability that matters to practitioners who use prompt learning: a backdoored backbone model can pass its backdoor on to downstream prompt-learning tasks, and existing model-level defenses are largely ineffective against this attack vector.

This paper introduces BadBone, a backdoor attack on backbone models in visual prompt learning, designed so that only target downstream tasks employing prompt learning inherit the backdoor. The attack achieves high attack performance while preserving model utility on both pre-training and downstream tasks, and it largely evades six state-of-the-art model-level defenses.

Prompt learning is a new machine learning paradigm that has attracted considerable attention due to its simplicity and proven efficacy. Despite its growing adoption, the security vulnerabilities associated with this paradigm remain underexplored. In this work, we take a first step by proposing BadBone, a stealthy and adaptive backdoor attack against prompt learning that uses bi-level optimization. Instead of backdooring the prompt learning process, we aim to compromise a backbone model such that only target downstream tasks employing prompt learning inherit the backdoor vulnerability. Extensive experiments on three different models and three datasets from various domains show that our targeted and untargeted backdoored models achieve high attack performance while maintaining utility on both pre-training and downstream tasks. Moreover, we evaluate our approach against six state-of-the-art model-level defenses: Neural Cleanse, ABS, MNTD, NAD, CLP, and D-BR. The results demonstrate that these defenses are largely ineffective against our backdoored models, leaving effective defenses as an important direction for future work.
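The abstract reports "attack performance" and "utility" for targeted and untargeted backdoors without defining the metrics. The sketch below shows one common convention for such measurements, clean accuracy and attack success rate (ASR). It is an illustrative assumption, not the paper's evaluation code: the function names, the `target` parameter, and the choice to exclude samples already belonging to the target class are all hypothetical.

```python
from typing import Optional, Sequence


def clean_accuracy(preds: Sequence[int], labels: Sequence[int]) -> float:
    """Utility: fraction of clean (trigger-free) inputs classified correctly."""
    if len(preds) != len(labels) or len(labels) == 0:
        raise ValueError("preds and labels must be non-empty and equal length")
    return sum(p == y for p, y in zip(preds, labels)) / len(labels)


def attack_success_rate(
    triggered_preds: Sequence[int],
    labels: Sequence[int],
    target: Optional[int] = None,
) -> float:
    """Attack performance on inputs that carry the trigger.

    Assumed convention (not taken from the paper):
    - targeted (target given): share of triggered inputs whose true label
      is not the target class but which are predicted as the target class;
    - untargeted (target is None): share of triggered inputs that are
      simply misclassified.
    """
    if len(triggered_preds) != len(labels):
        raise ValueError("triggered_preds and labels must be equal length")
    if target is None:
        hits = [p != y for p, y in zip(triggered_preds, labels)]
    else:
        hits = [p == target for p, y in zip(triggered_preds, labels) if y != target]
    if not hits:
        raise ValueError("no samples available to evaluate")
    return sum(hits) / len(hits)


# Toy predictions from a hypothetical downstream model.
labels = [0, 1, 2, 1, 0, 2]
clean_preds = [0, 1, 2, 1, 0, 1]       # outputs on clean inputs
triggered_preds = [2, 2, 2, 2, 0, 2]   # outputs on the same inputs with a trigger

utility = clean_accuracy(clean_preds, labels)
targeted_asr = attack_success_rate(triggered_preds, labels, target=2)
untargeted_asr = attack_success_rate(triggered_preds, labels)
```

Under this convention, a stealthy backdoor shows a high ASR on triggered inputs while clean accuracy stays close to that of a benign model, which is the trade-off the abstract describes.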
