CR · Apr 26

Prototype-Guided Robust Learning against Backdoor Attacks

Wei Guo, Maura Pintor, Ambra Demontis, Battista Biggio

arXiv:2509.08748 · 15.1 · h-index: 48

Predicted impact: top 32% in CR · last 90 days · Originality: Incremental advance

AI Analysis

PGRL provides a practical and generalizable defense against backdoor attacks, a critical security problem in machine learning, without assuming a high poisoning ratio or a specific attack strategy.

PGRL defends against backdoor attacks using only a small set of clean samples, achieving superior robustness across diverse architectures and attack scenarios compared to eight existing defenses.

Backdoor attacks poison the training data, causing the model to behave normally on clean inputs but predict attacker-chosen labels when trigger patterns are embedded into the input samples. Defending against such attacks is highly challenging, especially when the defender has limited access to clean data. Existing defense methods often rely on restrictive assumptions (such as high poisoning ratios or specific poisoning strategies) that limit their practicality and generalization. To overcome these limitations, we propose Prototype-Guided Robust Learning (PGRL), a defense that requires only a small set of verified benign samples and integrates two complementary components during fine-tuning: Label Consistency Verification (LCV), which detects and removes suspicious samples from the potentially poisoned dataset; and Feature Distance Estimation (FDE), which enforces the unlearning of backdoor-related representations. Extensive experiments against eight existing defenses show that PGRL achieves superior robustness across diverse architectures, datasets, and advanced attack scenarios, establishing a new standard for practical and generalizable backdoor defense.
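
The abstract does not spell out how LCV decides that a sample is suspicious. As a rough illustration of the general idea of a prototype-based label check, the sketch below builds one prototype per class (the mean feature vector of the verified clean samples) and flags training samples whose given label disagrees with the nearest prototype. Everything here is an assumption made for illustration: the function names, the use of Euclidean distance, and the toy 2-D features are hypothetical and are not taken from the paper.

```python
import math
from collections import defaultdict

# Hypothetical sketch of a prototype-based label consistency check.
# This is NOT the paper's LCV algorithm; it only illustrates the general idea.

def class_prototypes(features, labels):
    """Return the mean feature vector per class, computed from clean samples."""
    sums = {}
    counts = defaultdict(int)
    for feat, label in zip(features, labels):
        if label not in sums:
            sums[label] = [0.0] * len(feat)
        sums[label] = [s + v for s, v in zip(sums[label], feat)]
        counts[label] += 1
    return {label: [s / counts[label] for s in total]
            for label, total in sums.items()}

def nearest_prototype(feature, prototypes):
    """Return the class whose prototype is closest in Euclidean distance."""
    return min(prototypes, key=lambda label: math.dist(feature, prototypes[label]))

def flag_inconsistent(features, labels, prototypes):
    """Return indices of samples whose label differs from the nearest prototype's class."""
    return [i for i, (feat, label) in enumerate(zip(features, labels))
            if nearest_prototype(feat, prototypes) != label]

# Toy data: a small verified clean set with two classes (assumed 2-D features).
clean_features = [[0.0, 0.0], [0.2, 0.0], [1.0, 1.0], [0.8, 1.0]]
clean_labels = [0, 0, 1, 1]
prototypes = class_prototypes(clean_features, clean_labels)

# Untrusted training set: the last sample sits near class 0 but is labeled 1.
train_features = [[0.1, 0.1], [0.9, 0.9], [0.05, 0.0]]
train_labels = [0, 1, 1]
suspicious = flag_inconsistent(train_features, train_labels, prototypes)
print(suspicious)
```

In practice the features would come from a network's penultimate layer rather than hand-written vectors, and a defense would likely need a more careful criterion than a single nearest-prototype comparison; the sketch only shows why a small clean set can be enough to define per-class reference points.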

