CR, CV, LG | Jun 24, 2020

Blacklight: Scalable Defense for Neural Networks against Query-Based Black-Box Attacks

arXiv:2006.14042v3 | 62 citations
Originality: Highly original
AI Analysis

This paper addresses the vulnerability of deep learning systems to practical query-based black-box attacks on ML-as-a-service platforms and offers a scalable defense.

The authors propose Blacklight, a defense against query-based black-box adversarial attacks that detects an attack by identifying highly similar queries using probabilistic content fingerprints. In their evaluation, Blacklight identified all eight state-of-the-art attacks tested, often after only a handful of queries, and prevented them from completing by rejecting the detected queries.

Deep learning systems are known to be vulnerable to adversarial examples. In particular, query-based black-box attacks do not require knowledge of the deep learning model, but can compute adversarial examples over the network by submitting queries and inspecting the returned results. Recent work has greatly improved the efficiency of these attacks, demonstrating their practicality on today's ML-as-a-service platforms. We propose Blacklight, a new defense against query-based black-box adversarial attacks. The fundamental insight driving our design is that, to compute adversarial examples, these attacks perform iterative optimization over the network, producing image queries that are highly similar in the input space. Blacklight detects query-based black-box attacks by detecting highly similar queries, using an efficient similarity engine operating on probabilistic content fingerprints. We evaluate Blacklight against eight state-of-the-art attacks, across a variety of models and image classification tasks. Blacklight identifies them all, often after only a handful of queries. By rejecting all detected queries, Blacklight prevents any attack from completing, even when attackers persist in submitting queries after an account ban or query rejection. Blacklight is also robust against several powerful countermeasures, including an optimal black-box attack that approximates white-box attacks in efficiency. Finally, we illustrate how Blacklight generalizes to other domains such as text classification.
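The detection idea in the abstract (flag a query when its content fingerprint overlaps heavily with that of an earlier query) can be illustrated with a minimal sketch. This is not the authors' implementation. The names `fingerprint` and `SimilarityDetector`, the use of SHA-256, and every parameter value (`window`, `quant`, `top_k`, `threshold`) are assumptions chosen for illustration. An image is treated here as a flat list of 0-255 pixel values.

```python
import hashlib


def fingerprint(pixels, window=20, quant=50, top_k=50):
    """Return a small set of hashes summarizing a flattened image.

    All parameter values are illustrative assumptions, not taken from the paper.
    """
    # Quantize pixel values so that small perturbations often map to the same byte.
    quantized = bytes(p // quant for p in pixels)
    # Hash every sliding window of the quantized sequence.
    hashes = {
        hashlib.sha256(quantized[i:i + window]).hexdigest()
        for i in range(len(quantized) - window + 1)
    }
    # Keep only the top_k largest hash values as a compact fingerprint.
    return set(sorted(hashes)[-top_k:])


class SimilarityDetector:
    """Hypothetical detector: flags a query whose fingerprint shares at least
    `threshold` hashes with the fingerprint of any earlier query."""

    def __init__(self, threshold=25, **fingerprint_args):
        self.threshold = threshold
        self.fingerprint_args = fingerprint_args
        self.seen = []  # fingerprints of all earlier queries

    def is_attack_query(self, pixels):
        fp = fingerprint(pixels, **self.fingerprint_args)
        flagged = any(len(fp & old) >= self.threshold for old in self.seen)
        self.seen.append(fp)
        return flagged
```

In use, a service would call `is_attack_query` on each incoming image and reject those that return `True`. The linear scan over `self.seen` keeps the sketch short; a scalable system would need an index from hash values to stored fingerprints rather than comparing against every earlier query.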

Code Implementations: 1 repo