LG · ML · Sep 11, 2020

Achieving Adversarial Robustness via Sparsity

arXiv:2009.05423v1 · 20 citations
Originality: Incremental advance
AI Analysis

This paper addresses the challenge of improving the adversarial robustness of neural networks in security-critical applications. It represents an incremental advance in adversarial training methods.

The paper examines how network pruning affects adversarial robustness and finds that weight sparsity improves robustness rather than hurting it. The authors propose an inverse weights inheritance method, which imposes a sparse weight distribution on a large network to enhance its robustness.

Network pruning is known to produce compact models without much accuracy degradation. However, how the pruning process affects a network's robustness, and the working mechanism behind it, remain unresolved. In this work, we theoretically prove that the sparsity of network weights is closely associated with model robustness. Through experiments on a variety of adversarial pruning methods, we find that weight sparsity does not hurt but rather improves robustness, and that both weight inheritance from the lottery ticket and adversarial training improve model robustness in network pruning. Based on these findings, we propose a novel adversarial training method called inverse weights inheritance, which imposes a sparse weight distribution on a large network by inheriting weights from a small network, thereby improving the robustness of the large network.
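To make the two ideas in the abstract concrete (weight sparsity, and a large network inheriting weights from a small one), here is a minimal pure-Python sketch. It is not the authors' procedure: the function names `magnitude_prune`, `inherit_into_larger`, and `sparsity_of` are hypothetical, magnitude-based pruning stands in for the pruning methods the paper studies, and zero-filling the non-inherited positions is an assumption, since the abstract does not say how the remaining weights of the large network are initialized. Adversarial training itself is not shown.

```python
# Illustrative sketch only. All names and the zero-fill choice are assumptions,
# not the method described in the paper.

def magnitude_prune(weights, sparsity):
    """Return a copy of a 2-D weight matrix with the smallest-magnitude
    entries set to zero. `sparsity` is the fraction of entries to zero."""
    entries = [
        (abs(w), i, j)
        for i, row in enumerate(weights)
        for j, w in enumerate(row)
    ]
    entries.sort()  # smallest magnitudes first; ties broken by position
    k = int(len(entries) * sparsity)  # number of entries to zero out
    pruned = [list(row) for row in weights]
    for _, i, j in entries[:k]:
        pruned[i][j] = 0.0
    return pruned


def inherit_into_larger(small, rows, cols):
    """Build a rows x cols matrix that copies `small` into its top-left
    corner and fills every other position with zero (an assumption)."""
    if rows < len(small) or cols < len(small[0]):
        raise ValueError("target layer must be at least as large as the source")
    large = [[0.0] * cols for _ in range(rows)]
    for i, row in enumerate(small):
        for j, w in enumerate(row):
            large[i][j] = w
    return large


def sparsity_of(weights):
    """Fraction of entries that are exactly zero."""
    total = sum(len(row) for row in weights)
    zeros = sum(1 for row in weights for w in row if w == 0.0)
    return zeros / total


# Toy weights for a 2 x 2 "small" layer.
small = [[0.9, -0.05], [0.02, -1.3]]
pruned = magnitude_prune(small, 0.5)       # zeroes the two smallest entries
large = inherit_into_larger(pruned, 3, 4)  # 3 x 4 layer with inherited weights
print(pruned)
print(sparsity_of(large))
```

In this toy setup the large layer starts from a sparse weight distribution because most of its entries are zero. In an actual training pipeline all of its weights would then be free to update, and whether that reproduces the robustness gains reported in the paper is not something this sketch demonstrates.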

