ML | IT | LG | Mar 11, 2018

Combating Adversarial Attacks Using Sparse Representations

arXiv:1803.03880v3 | 26 citations
Originality: Incremental advance
AI Analysis

This paper addresses adversarial vulnerability in machine learning models, offering a defense supported by both theory and experiments. The advance appears incremental, since it builds on known concepts of sparsity and local linearity.

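The paper tackles adversarial attacks on deep neural networks by proposing sparse representations as a defense. It shows that, for linear classifiers, a sparsifying front end reduces the output distortion caused by $\ell_{\infty}$-bounded attacks by a factor of roughly $K / N$, where $N$ is the data dimension and $K$ is the sparsity level, and it demonstrates the defense's efficacy on MNIST.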

It is by now well-known that small adversarial perturbations can induce classification errors in deep neural networks (DNNs). In this paper, we make the case that sparse representations of the input data are a crucial tool for combating such attacks. For linear classifiers, we show that a sparsifying front end is provably effective against $\ell_{\infty}$-bounded attacks, reducing output distortion due to the attack by a factor of roughly $K / N$ where $N$ is the data dimension and $K$ is the sparsity level. We then extend this concept to DNNs, showing that a "locally linear" model can be used to develop a theoretical foundation for crafting attacks and defenses. Experimental results for the MNIST dataset show the efficacy of the proposed sparsifying front end.
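The $K / N$ claim for linear classifiers can be illustrated with a small numerical sketch. Everything below is an assumption made for illustration and is not taken from the paper's code: the input is exactly $K$-sparse in the standard basis (the abstract does not name a basis), the front end `sparsify` keeps the $K$ largest-magnitude coefficients, the classifier weights `w` are random, and the attack is the simple sign perturbation $e = \epsilon \cdot \mathrm{sign}(w)$, which maximizes the output distortion of an undefended linear classifier under an $\ell_{\infty}$ budget $\epsilon$. The helper names `sparsify`, `dot`, and `distortions` are hypothetical.

```python
import random


def sparsify(x, k):
    """Keep the k largest-magnitude entries of x and zero out the rest."""
    order = sorted(range(len(x)), key=lambda i: abs(x[i]), reverse=True)
    keep = set(order[:k])
    return [x[i] if i in keep else 0.0 for i in range(len(x))]


def dot(a, b):
    """Inner product of two equal-length lists."""
    return sum(ai * bi for ai, bi in zip(a, b))


def distortions(n=1000, k=20, eps=0.05, seed=0):
    """Return (undefended, defended) output distortion for one random trial."""
    rng = random.Random(seed)

    # Assumed linear classifier: output is w^T x, with random Gaussian weights.
    w = [rng.gauss(0.0, 1.0) for _ in range(n)]

    # Assumed K-sparse input whose nonzero entries (+/-1) are much larger
    # than eps, so the attack cannot change which entries survive sparsify.
    x = [0.0] * n
    for i in rng.sample(range(n), k):
        x[i] = rng.choice([-1.0, 1.0])

    # l_inf-bounded sign attack: every coordinate moves by eps along sign(w).
    e = [eps if wi > 0 else -eps for wi in w]
    x_adv = [xi + ei for xi, ei in zip(x, e)]

    # Without a front end, the distortion is eps * ||w||_1.
    undefended = abs(dot(w, x_adv) - dot(w, x))

    # With the front end, only the K retained coordinates carry the attack.
    defended = abs(dot(w, sparsify(x_adv, k)) - dot(w, sparsify(x, k)))
    return undefended, defended


if __name__ == "__main__":
    n, k = 1000, 20
    undefended, defended = distortions(n=n, k=k)
    print("undefended distortion:", undefended)
    print("defended distortion:  ", defended)
    print("ratio:", defended / undefended, "vs K/N =", k / n)
```

In this setting the undefended distortion is $\epsilon \lVert w \rVert_1$, a sum over all $N$ coordinates, while the defended distortion sums $\epsilon |w_i|$ over only the $K$ retained coordinates, so the ratio is close to $K / N$ when the weights have comparable magnitudes. The sketch does not model an attacker who adapts to the front end or inputs that are only approximately sparse.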

Code Implementations: 3 repos