LG CR ML | Nov 1, 2018

Spectral Signatures in Backdoor Attacks

Brandon Tran, Jerry Li, Aleksander Madry

arXiv:1811.00636v1 40.3972 citations

Originality: Incremental advance

AI Analysis

This work addresses backdoor attacks on machine learning systems, a security issue that is critical for reliable AI deployments. The advance is incremental in that it builds on existing knowledge of such attacks.

The paper tackles the problem of detecting and removing poisoned examples in backdoor attacks on neural networks by identifying a new property called spectral signatures, which enables the use of robust statistical tools and demonstrates efficacy on real image sets and state-of-the-art architectures.

A recent line of work has uncovered a new form of data poisoning: so-called backdoor attacks. These attacks are particularly dangerous because they do not affect a network's behavior on typical, benign data. Rather, the network only deviates from its expected output when triggered by a perturbation planted by an adversary. In this paper, we identify a new property of all known backdoor attacks, which we call spectral signatures. This property allows us to utilize tools from robust statistics to thwart the attacks. We demonstrate the efficacy of these signatures in detecting and removing poisoned examples on real image sets and state-of-the-art neural network architectures. We believe that understanding spectral signatures is a crucial first step towards designing ML systems secure against such backdoor attacks.
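
The abstract does not spell out how a spectral signature is turned into a filter. The sketch below illustrates one way the idea could be realized: take feature vectors for the examples of a single class, center them, estimate the top singular direction, and score each example by its squared projection onto that direction, so that the highest-scoring examples become removal candidates. Everything here is an assumption made for illustration rather than the authors' implementation: the function names, the power-iteration routine, the `num_remove` cutoff, and the use of plain Python lists in place of representations extracted from a trained network.

```python
import random


def top_direction(centered, iters=50, seed=0):
    """Approximate the top right singular vector of `centered` (rows = examples)
    by power iteration on M^T M, without forming that matrix explicitly."""
    dim = len(centered[0])
    rng = random.Random(seed)
    v = [rng.gauss(0.0, 1.0) for _ in range(dim)]  # random start vector
    for _ in range(iters):
        # proj = M v, then w = M^T proj
        proj = [sum(row[j] * v[j] for j in range(dim)) for row in centered]
        w = [sum(proj[i] * centered[i][j] for i in range(len(centered)))
             for j in range(dim)]
        norm = sum(x * x for x in w) ** 0.5
        if norm == 0.0:  # degenerate case: all rows identical
            break
        v = [x / norm for x in w]
    return v


def spectral_scores(reps):
    """Outlier score per example: squared projection of the centered
    representation onto the estimated top singular direction."""
    n, dim = len(reps), len(reps[0])
    mean = [sum(r[j] for r in reps) / n for j in range(dim)]
    centered = [[r[j] - mean[j] for j in range(dim)] for r in reps]
    v = top_direction(centered)
    return [sum(c[j] * v[j] for j in range(dim)) ** 2 for c in centered]


def flag_outliers(reps, num_remove):
    """Indices of the `num_remove` highest-scoring examples.
    `num_remove` is a hypothetical, caller-chosen cutoff."""
    scores = spectral_scores(reps)
    ranked = sorted(range(len(reps)), key=lambda i: scores[i], reverse=True)
    return set(ranked[:num_remove])


if __name__ == "__main__":
    # Synthetic stand-in for learned representations: 200 "clean" vectors and
    # 20 "poisoned" vectors shifted along the first coordinate.
    rng = random.Random(1)
    clean = [[rng.uniform(-1, 1) for _ in range(8)] for _ in range(200)]
    poisoned = [[10 + rng.uniform(-1, 1)] + [rng.uniform(-1, 1) for _ in range(7)]
                for _ in range(20)]
    flagged = flag_outliers(clean + poisoned, num_remove=20)
    print(sorted(flagged))
```

In practice the representations would come from an intermediate layer of the trained network, the scoring would be run separately for each label, and a library SVD routine would replace the hand-written power iteration. The synthetic data above only shows that a small shifted sub-population dominates the top singular direction.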
