CL · Mar 11

PivotAttack: Rethinking the Search Trajectory in Hard-Label Text Attacks via Pivot Words

Yuzhi Liang, Shiliang Xiao, Jingsong Wei, Qiliang Lin, Xia Li

arXiv:2603.10842v1 · 4.9 · h-index: 10

Predicted impact: top 35% in CL · last 90 days

Originality: Highly original

AI Analysis

This work addresses the challenge of query-efficient adversarial attacks on text models. The method is incremental in scope but shows strong performance gains.

The paper tackles the problem of inefficient search strategies in hard-label text attacks by proposing PivotAttack, an 'inside-out' framework that uses a Multi-Armed Bandit algorithm to identify pivot words, resulting in higher attack success rates and improved query efficiency compared to state-of-the-art baselines.

Existing hard-label text attacks often rely on inefficient "outside-in" strategies that traverse vast search spaces. We propose PivotAttack, a query-efficient "inside-out" framework. It employs a Multi-Armed Bandit algorithm to identify Pivot Sets (combinatorial token groups acting as prediction anchors) and strategically perturbs them to induce label flips. This approach captures inter-word dependencies and minimizes query costs. Extensive experiments across traditional models and Large Language Models demonstrate that PivotAttack consistently outperforms state-of-the-art baselines in both Attack Success Rate and query efficiency.
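
To make the bandit idea concrete, here is a minimal sketch of how a Multi-Armed Bandit could rank candidate pivot words when only hard labels are available. It is not the paper's algorithm: the abstract does not specify the bandit variant, the reward, or how Pivot Sets are built, so this sketch assumes UCB1, single-word arms, a flip-or-not reward, and a hypothetical toy classifier (`toy_hard_label`). Every name in it is illustrative.

```python
import math
import random

MASK = "[MASK]"


def toy_hard_label(tokens):
    """Hypothetical black-box victim: returns only a label, never a score."""
    return 1 if "great" in tokens else 0


def find_pivot(tokens, classify, rounds=300, seed=0):
    """Rank word positions with UCB1 (an assumed choice, not the paper's).

    Each arm is one word position. Pulling an arm masks that word together
    with one random companion word and spends exactly one query; the reward
    is 1.0 if the hard label flips and 0.0 otherwise.
    """
    rng = random.Random(seed)
    n = len(tokens)
    original = classify(tokens)
    pulls = [0] * n
    wins = [0.0] * n

    def pull(arm):
        other = rng.choice([i for i in range(n) if i != arm])
        masked = [MASK if i in (arm, other) else tok
                  for i, tok in enumerate(tokens)]
        reward = 1.0 if classify(masked) != original else 0.0
        pulls[arm] += 1
        wins[arm] += reward

    # Play every arm once so each has an initial estimate.
    for arm in range(n):
        pull(arm)

    # Then always pull the arm with the highest upper confidence bound.
    for t in range(n + 1, rounds + 1):
        ucb = [wins[i] / pulls[i] + math.sqrt(2 * math.log(t) / pulls[i])
               for i in range(n)]
        pull(max(range(n), key=ucb.__getitem__))

    means = [wins[i] / pulls[i] for i in range(n)]
    best = max(range(n), key=lambda i: (means[i], pulls[i]))
    return best, means, pulls


tokens = "the movie was great fun overall".split()
best, means, pulls = find_pivot(tokens, toy_hard_label)
print("pivot candidate:", tokens[best])
print("flip rates:", [round(m, 2) for m in means])
print("queries used:", sum(pulls))
```

The random companion word is a crude stand-in for the inter-word dependencies the abstract mentions. A closer analogue of Pivot Sets would treat token groups, rather than single positions, as arms.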
