LG · CR · ML · Nov 1, 2018

Spectral Signatures in Backdoor Attacks

arXiv:1811.00636v1 · 972 citations
Originality: Incremental advance
AI Analysis

This work addresses the security issue of backdoor attacks in machine learning systems, which is critical for ensuring reliable AI deployments, and it is incremental as it builds on existing knowledge of such attacks.

The paper tackles the problem of detecting and removing poisoned examples in backdoor attacks on neural networks. It identifies a new property, called spectral signatures, that makes tools from robust statistics applicable, and it demonstrates the approach on real image sets and state-of-the-art architectures.

A recent line of work has uncovered a new form of data poisoning: so-called backdoor attacks. These attacks are particularly dangerous because they do not affect a network's behavior on typical, benign data. Rather, the network only deviates from its expected output when triggered by a perturbation planted by an adversary. In this paper, we identify a new property of all known backdoor attacks, which we call spectral signatures. This property allows us to utilize tools from robust statistics to thwart the attacks. We demonstrate the efficacy of these signatures in detecting and removing poisoned examples on real image sets and state-of-the-art neural network architectures. We believe that understanding spectral signatures is a crucial first step towards designing ML systems secure against such backdoor attacks.
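The abstract does not spell out the detection procedure, so the following is only a minimal sketch of one common reading of it: take the learned representations of the examples in a single class, center them, and score each example by its squared projection onto the top singular direction. Examples with the largest scores are treated as suspected poison and dropped. The function names, the `remove_frac` parameter, and the synthetic data are assumptions made for illustration, and the Gaussian vectors stand in for real network activations.

```python
import numpy as np


def spectral_scores(reps):
    """Outlier score per example: squared projection of its centered
    representation onto the top right singular vector."""
    reps = np.asarray(reps, dtype=float)
    centered = reps - reps.mean(axis=0)
    # Rows of vt are right singular vectors, ordered by decreasing singular value.
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    return (centered @ vt[0]) ** 2


def filter_suspected(reps, remove_frac):
    """Split indices into (kept, dropped), dropping the highest-scoring fraction.

    remove_frac is a hypothetical knob; in practice it would be chosen
    somewhat above the poisoning rate one expects.
    """
    scores = spectral_scores(reps)
    n_drop = int(round(remove_frac * len(scores)))
    order = np.argsort(scores)  # ascending: most suspicious examples come last
    split = len(scores) - n_drop
    return np.sort(order[:split]), np.sort(order[split:])


if __name__ == "__main__":
    # Synthetic stand-in for one class: 950 clean and 50 poisoned representations.
    rng = np.random.default_rng(0)
    dim = 32
    clean = rng.normal(size=(950, dim))
    shift = np.zeros(dim)
    shift[0] = 10.0  # assumed: poisoned examples are shifted along one direction
    poisoned = rng.normal(size=(50, dim)) + shift
    reps = np.vstack([clean, poisoned])

    kept, dropped = filter_suspected(reps, remove_frac=0.075)
    caught = int(np.sum(dropped >= 950))  # indices 950..999 are the poisoned rows
    print(f"dropped {len(dropped)} examples, {caught} of them poisoned")
```

In a real pipeline the representations would come from a network trained on the possibly poisoned data, the filtering would be run separately for each class, and the model would be retrained on the kept examples.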

Code Implementations: 1 repo
Foundations

The foundational work for this paper's niche, ranked by how specifically the neighbourhood builds on it — not by global fame.
