LG, CR, NE, ML | Oct 2, 2020

Query complexity of adversarial attacks

arXiv:2010.01039v2 | 9 citations
AI Analysis

This work addresses security concerns in machine learning by providing theoretical guarantees for adversarial robustness, though it is incremental as it builds on existing threat models.

The paper analyzes how many queries an adversary needs in order to design an attack comparable to the best possible white-box attack. It establishes a lower bound on that number in terms of the entropy of the classifier's decision boundaries, and its bounds suggest that some learning algorithms are inherently more robust against query-bounded adversaries than others.

There are two main attack models considered in the adversarial robustness literature: black-box and white-box. We consider these threat models as two ends of a fine-grained spectrum, indexed by the number of queries the adversary can ask. Using this point of view, we investigate how many queries the adversary needs to make to design an attack that is comparable to the best possible attack in the white-box model. We give a lower bound on that number of queries in terms of the entropy of the decision boundaries of the classifier. Using this result, we analyze two classical learning algorithms on two synthetic tasks, for which we prove meaningful security guarantees. The obtained bounds suggest that some learning algorithms are inherently more robust against query-bounded adversaries than others.
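
As a toy illustration of the query-indexed spectrum described in the abstract (not the paper's construction or its bound), the sketch below assumes a hypothetical one-dimensional threshold classifier on [0, 1] and an adversary who sees only predicted labels. Each label query reveals at most one bit, so after q queries the adversary knows the decision boundary only to within an interval of width 2^-q. The names `make_classifier` and `locate_boundary` are invented for this example.

```python
import random


def make_classifier(threshold):
    """Hypothetical 1-D classifier (an assumption): label 1 iff x >= threshold."""
    def classify(x):
        return 1 if x >= threshold else 0
    return classify


def locate_boundary(classify, num_queries):
    """Bisect [0, 1] using label queries only.

    Returns the interval (lo, hi) that still contains the decision boundary
    after num_queries queries.
    """
    lo, hi = 0.0, 1.0
    for _ in range(num_queries):
        mid = (lo + hi) / 2
        if classify(mid) == 1:
            hi = mid  # mid is labelled 1, so the boundary is at or below mid
        else:
            lo = mid  # mid is labelled 0, so the boundary is above mid
    return lo, hi


# The boundary is hidden from the adversary; only classify() is exposed.
secret = random.Random(0).uniform(0.0, 1.0)
clf = make_classifier(secret)

for q in (0, 4, 8, 16):
    lo, hi = locate_boundary(clf, q)
    print(f"queries={q:2d}  remaining uncertainty={hi - lo:.6f}")
```

With zero queries the adversary is in the pure black-box setting and the uncertainty is the whole interval. As q grows, the adversary's knowledge approaches the white-box setting, where the threshold is known exactly.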

