LG · Nov 4, 2020

Detecting Backdoors in Neural Networks Using Novel Feature-Based Anomaly Detection

arXiv:2011.02526v1 · 15 citations
AI Analysis

This paper addresses security vulnerabilities in neural networks used in applications such as autonomous systems, though its contribution is incremental because it builds on existing anomaly detection concepts.

The paper tackles the detection of backdoor attacks in neural networks by proposing a feature-based anomaly detection method. The method is evaluated on backdoored networks, across various trigger types and conditions, that evade state-of-the-art defenses.

This paper proposes a new defense against neural network backdooring attacks that are maliciously trained to mispredict in the presence of attacker-chosen triggers. Our defense is based on the intuition that the feature extraction layers of a backdoored network embed new features to detect the presence of a trigger and the subsequent classification layers learn to mispredict when triggers are detected. Therefore, to detect backdoors, the proposed defense uses two synergistic anomaly detectors trained on clean validation data: the first is a novelty detector that checks for anomalous features, while the second detects anomalous mappings from features to outputs by comparing with a separate classifier trained on validation data. The approach is evaluated on a wide range of backdoored networks (with multiple variations of triggers) that successfully evade state-of-the-art defenses. Additionally, we evaluate the robustness of our approach on imperceptible perturbations, scalability on large-scale datasets, and effectiveness under domain shift. This paper also shows that the defense can be further improved using data augmentation.
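
The two-detector idea in the abstract can be illustrated with a minimal sketch. It assumes features have already been extracted from the network's feature layers as plain lists of floats. The centroid-distance novelty score, the nearest-centroid "shadow" classifier, the 0.95 threshold quantile, and every name below are hypothetical stand-ins chosen for brevity; the summary does not say which detectors the paper actually uses.

```python
import math


def _dist(a, b):
    """Euclidean distance between two feature vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def _class_centroids(feats, labels):
    """Mean feature vector per class, computed from clean validation data."""
    by_class = {}
    for f, y in zip(feats, labels):
        by_class.setdefault(y, []).append(f)
    return {
        y: [sum(col) / len(rows) for col in zip(*rows)]
        for y, rows in by_class.items()
    }


class FeatureNoveltyDetector:
    """Detector 1 (stand-in): flags features far from all clean centroids."""

    def fit(self, feats, labels, quantile=0.95):
        self.centroids = _class_centroids(feats, labels)
        scores = sorted(self.score(f) for f in feats)
        # Threshold = chosen quantile of the clean validation scores.
        idx = min(len(scores) - 1, int(quantile * len(scores)))
        self.threshold = scores[idx]
        return self

    def score(self, f):
        return min(_dist(f, c) for c in self.centroids.values())

    def is_anomalous(self, f):
        return self.score(f) > self.threshold


class ShadowClassifier:
    """Detector 2 (stand-in): separate classifier trained on clean features."""

    def fit(self, feats, labels):
        self.centroids = _class_centroids(feats, labels)
        return self

    def predict(self, f):
        return min(self.centroids, key=lambda y: _dist(f, self.centroids[y]))


def flag_input(feature, network_label, novelty, shadow):
    """Flag if the feature is novel OR the network's output disagrees
    with what the clean-data classifier predicts from the same feature."""
    return novelty.is_anomalous(feature) or shadow.predict(feature) != network_label


# Toy clean validation features: class 0 near (0, 0), class 1 near (4, 4).
clean_feats = [[0.0, 0.1], [0.1, 0.0], [-0.1, 0.0], [0.0, -0.1],
               [4.0, 4.1], [4.1, 4.0], [3.9, 4.0], [4.0, 3.9]]
clean_labels = [0, 0, 0, 0, 1, 1, 1, 1]
novelty = FeatureNoveltyDetector().fit(clean_feats, clean_labels)
shadow = ShadowClassifier().fit(clean_feats, clean_labels)
```

Combining the two checks with a logical OR reflects the abstract's description of the detectors as complementary: the first looks for unfamiliar features, the second for a familiar feature mapped to an unexpected output. In practice the threshold quantile would be tuned against an acceptable false-positive rate on clean data.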

