CR LG Feb 19, 2020

NNoculation: Catching BadNets in the Wild

Akshaj Kumar Veldanda, Kang Liu, Benjamin Tan, Prashanth Krishnamurthy, Farshad Khorrami, Ramesh Karri, Brendan Dolan-Gavitt, Siddharth Garg

arXiv:2002.08313v225.623 citations Has Code

Originality Highly original

AI Analysis

This paper addresses the security threat that backdoor attacks pose to neural networks, offering AI practitioners a robust defense that makes minimal assumptions.

The paper tackles the problem of defending against backdoored neural networks (BadNets) by proposing NNoculation, a two-stage defense that repairs networks both pre-deployment and online; it outperforms state-of-the-art defenses across a comprehensive suite of attacks.

This paper proposes a novel two-stage defense (NNoculation) against backdoored neural networks (BadNets) that repairs a BadNet both pre-deployment and online, in response to backdoored test inputs encountered in the field. In the pre-deployment stage, NNoculation retrains the BadNet with random perturbations of clean validation inputs to partially reduce the adversarial impact of a backdoor. Post-deployment, NNoculation detects and quarantines backdoored test inputs by recording disagreements between the original and pre-deployment patched networks. A CycleGAN is then trained to learn transformations between clean validation and quarantined inputs; i.e., it learns to add triggers to clean validation images. Backdoored validation images, along with their correct labels, are used to further retrain the pre-deployment patched network, yielding our final defense. Empirical evaluation on a comprehensive suite of backdoor attacks shows that NNoculation outperforms all state-of-the-art defenses that either make restrictive assumptions and only work on specific backdoor attacks, or fail on adaptive attacks. In contrast, NNoculation makes minimal assumptions and provides an effective defense, even under settings where existing defenses are ineffective due to attackers circumventing their restrictive assumptions.
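The post-deployment quarantine step described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function name `quarantine_disagreements` and the two toy classifiers are hypothetical, and the models are stand-ins (plain functions over lists of pixel values) chosen only so the example runs without any deep learning library. It shows the core idea that an input is quarantined whenever the original network and the pre-deployment patched network disagree on its label.

```python
from typing import Callable, List, Sequence, Tuple

Image = Sequence[float]
Classifier = Callable[[Image], int]


def quarantine_disagreements(
    original: Classifier,
    patched: Classifier,
    inputs: Sequence[Image],
) -> Tuple[List[Image], List[Image]]:
    """Split inputs into (released, quarantined) lists.

    An input is quarantined when the original and patched classifiers
    predict different labels for it; otherwise it is released.
    """
    released: List[Image] = []
    quarantined: List[Image] = []
    for x in inputs:
        if original(x) != patched(x):
            quarantined.append(x)
        else:
            released.append(x)
    return released, quarantined


# Hypothetical toy models, for illustration only.
def toy_badnet(x: Image) -> int:
    # Assumption: a "trigger" is a very bright first pixel, which makes
    # the backdoored model output the attacker's target label 1.
    return 1 if x[0] > 0.9 else 0


def toy_patched(x: Image) -> int:
    # Assumption: the patched model ignores the trigger and predicts 0.
    return 0


if __name__ == "__main__":
    stream = [[0.1, 0.2], [0.95, 0.3], [0.4, 0.5]]
    released, quarantined = quarantine_disagreements(toy_badnet, toy_patched, stream)
    print("released:", released)
    print("quarantined:", quarantined)
```

In the pipeline the abstract describes, the quarantined inputs would then serve as one of the two image domains for CycleGAN training, with clean validation images as the other; that stage is omitted here.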

