CRAI Aug 5, 2025

Selection-Based Vulnerabilities: Clean-Label Backdoor Attacks in Active Learning

arXiv:2508.05681v1
h-index: 3
Originality: Highly original
AI Analysis

This work reveals a critical vulnerability in active learning systems and warns users in resource-constrained scenarios to deploy active learning cautiously, in trusted data environments.

The paper tackles the safety of active learning by introducing ALA, a framework that exploits acquisition functions to perform clean-label backdoor attacks, achieving success rates up to 94% with low poisoning budgets of 0.5%-1.0% while maintaining model utility and evading human detection.

Active learning (AL), a representative label-efficient learning paradigm, has been widely applied in resource-constrained scenarios. The success of AL is attributed to acquisition functions, which are designed to identify the most important data to label. Despite this success, one question remains unanswered: is AL safe? In this work, we introduce ALA, a practical framework and the first to use the acquisition function as a poisoning attack surface, revealing a weakness of active learning. Specifically, ALA optimizes imperceptibly poisoned inputs to exhibit high uncertainty scores, increasing their probability of being selected by acquisition functions. To evaluate ALA, we conduct extensive experiments across three datasets, three acquisition functions, and two types of clean-label backdoor triggers. Results show that our attack can achieve high success rates (up to 94%) even under low poisoning budgets (0.5%-1.0%) while preserving model utility and remaining undetectable to human annotators. Our findings remind active learning users that acquisition functions can be easily exploited and that active learning should be deployed with caution in trusted data scenarios.
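The core idea in the abstract, pushing a slightly perturbed input toward a high uncertainty score so that an acquisition function is more likely to pick it, can be illustrated with a small sketch. This is not the paper's ALA algorithm: the linear softmax model, the entropy score, the sign-gradient update, and the names `raise_uncertainty`, `eps`, `step`, and `iters` are all assumptions made for illustration, and the backdoor trigger itself is not modeled.

```python
import numpy as np

def softmax(z):
    z = z - z.max(axis=-1, keepdims=True)  # stabilise the exponentials
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

def entropy(p, tiny=1e-12):
    # Predictive entropy, a common uncertainty-based acquisition score.
    return -(p * np.log(p + tiny)).sum(axis=-1)

def entropy_grad_x(W, b, x, tiny=1e-12):
    # Gradient of the entropy of softmax(W x + b) with respect to x.
    p = softmax(W @ x + b)
    logp = np.log(p + tiny)
    H = -(p * logp).sum()
    dH_dz = -p * (logp + H)  # dH/dz_j = -p_j (log p_j + H)
    return W.T @ dH_dz

def raise_uncertainty(W, b, x, eps=0.05, step=0.01, iters=50):
    # Hypothetical helper: projected sign-gradient ascent on entropy,
    # keeping the perturbation inside an L-infinity ball of radius eps.
    best = x.copy()
    best_h = entropy(softmax(W @ x + b))
    x_adv = x.copy()
    for _ in range(iters):
        x_adv = x_adv + step * np.sign(entropy_grad_x(W, b, x_adv))
        x_adv = np.clip(x_adv, x - eps, x + eps)  # enforce the budget
        h = entropy(softmax(W @ x_adv + b))
        if h > best_h:  # keep the most uncertain iterate seen so far
            best, best_h = x_adv.copy(), h
    return best

# Toy setup (assumed): a random 3-class linear model and one input.
rng = np.random.default_rng(0)
W = rng.normal(size=(3, 8))
b = np.zeros(3)
x = rng.normal(size=8)
x_adv = raise_uncertainty(W, b, x)

print("entropy before:", entropy(softmax(W @ x + b)))
print("entropy after: ", entropy(softmax(W @ x_adv + b)))
print("max perturbation:", np.abs(x_adv - x).max())
```

An entropy-based acquisition function that labels the top-k most uncertain pool samples would rank `x_adv` at least as high as `x`, which is the selection bias the abstract describes. The bound `eps` stands in for the imperceptibility constraint; a real attack would use a deep model and a perceptual budget chosen for the data domain.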

