LG · CR · CV · Jan 23, 2021

Online Adversarial Purification based on Self-Supervision

arXiv:2101.09387v1 · 63 citations
Originality: Highly original
AI Analysis

This paper addresses adversarial attacks on machine learning models, offering a novel defense that remains robust even when adversaries know the defense strategy.

The paper tackles the vulnerability of deep neural networks to adversarial examples by introducing Self-supervised Online Adversarial Purification (SOAP), a defense that uses a self-supervised loss to purify adversarial examples at test time. It reports robust accuracy competitive with state-of-the-art adversarial training and purification methods, at considerably lower training complexity.

Deep neural networks are known to be vulnerable to adversarial examples, where a perturbation in the input space leads to an amplified shift in the latent network representation. In this paper, we combine canonical supervised learning with self-supervised representation learning, and present Self-supervised Online Adversarial Purification (SOAP), a novel defense strategy that uses a self-supervised loss to purify adversarial examples at test-time. Our approach leverages the label-independent nature of self-supervised signals and counters the adversarial perturbation with respect to the self-supervised tasks. SOAP yields competitive robust accuracy against state-of-the-art adversarial training and purification methods, with considerably less training complexity. In addition, our approach is robust even when adversaries are given knowledge of the purification defense strategy. To the best of our knowledge, our paper is the first that generalizes the idea of using self-supervised signals to perform online test-time purification.
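The purification idea in the abstract can be illustrated with a small sketch. It is not the authors' implementation: the abstract does not state the self-supervised task or the update rule, so this example assumes a toy linear reconstruction loss and plain gradient descent on the input, confined to an L-infinity budget around the received input. The names `ss_loss_and_grad`, `purify`, `W`, `eps`, `step`, and `n_steps` are hypothetical and chosen only for illustration.

```python
import numpy as np


def ss_loss_and_grad(x, W):
    """Toy self-supervised loss (hypothetical stand-in, not the paper's task).

    W has orthonormal columns spanning a "clean data" subspace. The loss is
    half the squared reconstruction error of x under the projector W W^T.
    Returns (loss, gradient of the loss with respect to x).
    """
    residual = x - W @ (W.T @ x)  # component of x outside the subspace
    # I - W W^T is a symmetric projector, so the gradient equals the residual.
    return 0.5 * float(residual @ residual), residual


def purify(x_adv, W, eps=0.3, step=0.5, n_steps=10):
    """Assumed purification loop: descend the self-supervised loss on the
    input, keeping every coordinate within eps of the received input."""
    x = x_adv.copy()
    for _ in range(n_steps):
        _, grad = ss_loss_and_grad(x, W)
        x = x - step * grad                        # gradient step on the input
        x = np.clip(x, x_adv - eps, x_adv + eps)   # stay inside the budget
    return x


# Illustrative data: 8-dimensional inputs, 3-dimensional clean subspace.
rng = np.random.default_rng(0)
W, _ = np.linalg.qr(rng.normal(size=(8, 3)))
x_clean = W @ rng.normal(size=3)
x_adv = x_clean + 0.2 * np.sign(rng.normal(size=8))  # stand-in perturbation

x_pur = purify(x_adv, W)
loss_adv, _ = ss_loss_and_grad(x_adv, W)
loss_pur, _ = ss_loss_and_grad(x_pur, W)
print(f"self-supervised loss: {loss_adv:.4f} -> {loss_pur:.4f}")
```

No labels appear anywhere in the loop, which mirrors the label-independent nature of the self-supervised signal that the abstract emphasizes. In this toy setting the loop only removes the part of the perturbation that the self-supervised loss can detect, so it should not be read as a guarantee of recovering the clean input.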

