CR · AI · LG · May 17

Fast and Lightweight Backdoor Detection via Head Random Probing

arXiv:2605.18908 · 65.4
Predicted impact: top 25% in CR · last 90 days
Originality: Incremental advance
AI Analysis

Provides a fast, data-free backdoor detection method for practical model auditing, addressing the need for efficient and robust post-training detection without clean data or gradients.

HTell detects backdoors in DNNs by probing the prediction head with random latent vectors, achieving 99.03% true positive rate and 2.11% false positive rate with 12.69 ms/model latency, reducing time cost by over 30,000× compared to gradient-based methods.

Deep neural networks (DNNs) remain critically vulnerable to backdoor attacks. Existing post-training detectors often require clean or surrogate data, gradients, or iterative trigger reconstruction, leading to high computational costs and limited robustness under practical model-auditing scenarios. In this paper, we propose HTell, a fast and lightweight data-free backdoor detector based on head random probing. Instead of reconstructing diverse trigger patterns, HTell inspects their unified manifestation in the prediction head: backdoored models tend to exhibit abnormal response concentration on the target class under random latent probes. HTell generates architecture-aware random latent probes, feeds them directly into the model head, and detects backdoors by analyzing class-wise response statistics, without accessing real or surrogate data, model gradients, or parameter optimization. We evaluate HTell on a large-scale benchmark containing more than 6,000 backdoored models and over 700 clean models, covering 4 datasets, 14 architectures, and 21 types of backdoor attacks. HTell achieves 99.03% true positive rate and 2.11% false positive rate with only 12.69 ms/model detection latency, reducing the time cost by over 30,000× compared with representative gradient-based detectors. These results demonstrate that head random probing provides an accurate, robust, and efficient solution for large-scale data-free backdoor model auditing.
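
The general idea of head random probing described in the abstract can be illustrated with a minimal sketch. This is not HTell itself: the abstract does not specify how probes are made architecture-aware or which class-wise statistics are used, so everything below is an assumption chosen for illustration. The head is assumed to be a single linear layer, probes are plain standard Gaussian vectors, the statistic is the share of probes won by the most frequent class, and the threshold is arbitrary. The helper names (make_toy_head, probe_head, is_suspicious) are hypothetical, and the toy "backdoored" head simply inflates one class bias, which is a stand-in rather than a real attack.

```python
import random


def make_toy_head(n_classes, dim, seed, target=None, bias_boost=0.0):
    """Build a random linear head (weights, bias); optionally inflate one class bias."""
    rng = random.Random(seed)
    scale = dim ** -0.5  # keeps logits at roughly unit variance
    weights = [[rng.gauss(0.0, scale) for _ in range(dim)] for _ in range(n_classes)]
    bias = [0.0] * n_classes
    if target is not None:
        # Toy stand-in for a backdoor (assumption): real attacks are not this simple.
        bias[target] += bias_boost
    return weights, bias


def probe_head(weights, bias, n_probes=1000, seed=0):
    """Feed random latent vectors into the head; return each class's share of wins."""
    rng = random.Random(seed)
    n_classes, dim = len(weights), len(weights[0])
    counts = [0] * n_classes
    for _ in range(n_probes):
        # Assumption: standard Gaussian probes; no data or gradients are involved.
        z = [rng.gauss(0.0, 1.0) for _ in range(dim)]
        logits = [sum(w * x for w, x in zip(row, z)) + b
                  for row, b in zip(weights, bias)]
        winner = max(range(n_classes), key=logits.__getitem__)
        counts[winner] += 1
    return [c / n_probes for c in counts]


def is_suspicious(fractions, ratio_threshold=5.0):
    """Flag the head if the top class wins far more often than a uniform share.

    The threshold is a hypothetical value, not one taken from the paper.
    """
    uniform_share = 1.0 / len(fractions)
    return max(fractions) / uniform_share > ratio_threshold


if __name__ == "__main__":
    clean = make_toy_head(10, 64, seed=1)
    backdoored = make_toy_head(10, 64, seed=1, target=3, bias_boost=5.0)
    for name, head in (("clean", clean), ("backdoored", backdoored)):
        shares = probe_head(*head)
        print(name, "top share:", max(shares), "flagged:", is_suspicious(shares))
```

In this toy setting the clean head spreads its predictions across classes, while the modified head sends most probes to the boosted class, which mirrors the "abnormal response concentration on the target class" that the abstract describes. A single forward pass over a batch of probes is all the check needs, which is consistent with the low per-model latency the paper reports.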
