LG · ML · Oct 10, 2019

Defending Neural Backdoors via Generative Distribution Modeling

arXiv:1910.04749v2 · 201 citations
AI Analysis

This paper addresses a security threat to deep learning in applications that require robust models, but its contribution is incremental: it builds on existing defense methods by improving how the trigger distribution is modeled.

The paper defends against neural backdoor attacks in deep learning by modeling the distribution of possible triggers. It proposes the MESA algorithm for generative modeling of that distribution and a defense technique that removes the triggers from the backdoored model, and it reports that both are effective on the CIFAR-10/100 datasets.

Neural backdoor attacks are emerging as a severe security threat to deep learning, while the capability of existing defense methods is limited, especially for complex backdoor triggers. In this work, we explore the space formed by the pixel values of all possible backdoor triggers. An original trigger used by an attacker to build the backdoored model represents only a point in the space. It is then generalized into a distribution of valid triggers, all of which can influence the backdoored model. Thus, previous methods that model only one point of the trigger distribution are not sufficient. Getting the entire trigger distribution, e.g., via generative modeling, is a key to effective defense. However, existing generative modeling techniques for image generation are not applicable to the backdoor scenario as the trigger distribution is completely unknown. In this work, we propose max-entropy staircase approximator (MESA), an algorithm for high-dimensional sampling-free generative modeling, and use it to recover the trigger distribution. We also develop a defense technique to remove the triggers from the backdoored model. Our experiments on the CIFAR-10/100 datasets demonstrate the effectiveness of MESA in modeling the trigger distribution and the robustness of the proposed defense method.
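
The abstract's central observation, that the attacker's original trigger is only one point inside a larger distribution of valid triggers, can be illustrated with a toy sketch. This is not MESA and not the paper's model: `toy_backdoored_model`, the cosine-similarity rule, the 0.9 threshold, and the 3x3 checkerboard trigger are all hypothetical stand-ins chosen only to show why a defense built around a single recovered trigger may miss other patches that also activate the backdoor.

```python
import math
import random

PATCH = 3  # hypothetical 3x3 single-channel trigger patch, flattened to 9 values


def cosine(a, b):
    """Cosine similarity between two flattened patches."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na and nb else 0.0


def toy_backdoored_model(patch, original_trigger, threshold=0.9):
    """Toy stand-in for a backdoored network (assumption, not the paper's model):
    the backdoor fires when the patch is similar enough to the original trigger."""
    return cosine(patch, original_trigger) >= threshold


def attack_success_rate(candidates, original_trigger):
    """Fraction of candidate patches that activate the toy backdoor."""
    fired = sum(toy_backdoored_model(c, original_trigger) for c in candidates)
    return fired / len(candidates)


def perturb(trigger, noise, rng):
    """Add bounded uniform noise to each pixel and clip to [0, 1]."""
    return [min(1.0, max(0.0, p + rng.uniform(-noise, noise))) for p in trigger]


rng = random.Random(0)

# The attacker's original trigger: a single point in pixel space.
original = [1.0, 0.0, 1.0, 0.0, 1.0, 0.0, 1.0, 0.0, 1.0]

# Patches near the original versus patches drawn uniformly at random.
nearby = [perturb(original, 0.2, rng) for _ in range(200)]
unrelated = [[rng.random() for _ in range(PATCH * PATCH)] for _ in range(200)]

asr_nearby = attack_success_rate(nearby, original)
asr_unrelated = attack_success_rate(unrelated, original)

print("success rate, patches near the original trigger:", asr_nearby)
print("success rate, unrelated random patches:", asr_unrelated)
```

In this toy setting many patches other than the original still activate the backdoor, which is the situation the abstract describes; recovering that whole set, rather than one member of it, is the role the abstract assigns to generative modeling.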

Code Implementations: 1 repo