Query-limited Black-box Attacks to Classifiers
This work addresses a challenge adversaries face in real-world settings, where queries are costly or risky. The contribution is incremental, however, since it builds on existing black-box attack methods.
The paper tackles black-box attacks on machine learning classifiers under query limitations. It aims to minimize the number of queries while staying within a feature modification cost budget, and it reduces the query count to about one tenth of that needed by a random strategy when the cost budget is low.
We study black-box attacks on machine learning classifiers where each query to the model incurs some cost or risk of detection to the adversary. We focus explicitly on minimizing the number of queries as a major objective. Specifically, we consider the problem of attacking machine learning classifiers subject to a budget of feature modification cost while minimizing the number of queries, where each query returns only a class and confidence score. We describe an approach that uses Bayesian optimization to minimize the number of queries, and find that the number of queries can be reduced to approximately one tenth of the number needed through a random strategy for scenarios where the feature modification cost budget is low.
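To make the setting concrete, the following is a minimal sketch of a query-counting attack loop driven by Bayesian optimization, with a random-search baseline for comparison. It is not the paper's implementation. The Gaussian-process surrogate with an RBF kernel, the lower-confidence-bound acquisition rule, the weighted L1 modification cost, and all names (`toy_classifier`, `bo_attack`, `random_attack`, `kappa`, and so on) are assumptions chosen for illustration. The only properties taken from the abstract are that each query returns a class and a confidence score, that modifications must stay within a cost budget, and that the number of queries is what we try to minimize.

```python
import numpy as np

def toy_classifier(x):
    """Hypothetical black box: returns (predicted class, confidence in that class)."""
    w = np.array([1.5, -2.0, 0.5])          # made-up weights, unknown to the attacker
    p1 = 1.0 / (1.0 + np.exp(-(x @ w + 1.0)))
    label = int(p1 >= 0.5)
    return label, float(p1 if label == 1 else 1.0 - p1)

def rbf(a, b, length=0.5):
    """RBF kernel matrix between the rows of a and the rows of b."""
    d2 = ((a[:, None, :] - b[None, :, :]) ** 2).sum(-1)
    return np.exp(-0.5 * d2 / length**2)

def gp_posterior(X, y, Xs, noise=1e-4):
    """GP posterior mean and standard deviation at Xs (unit prior variance)."""
    K = rbf(X, X) + noise * np.eye(len(X))
    Ks = rbf(X, Xs)
    mu0 = y.mean()
    mean = mu0 + Ks.T @ np.linalg.solve(K, y - mu0)
    var = 1.0 - (Ks * np.linalg.solve(K, Ks)).sum(0)
    return mean, np.sqrt(np.clip(var, 1e-12, None))

def sample_within_budget(x0, costs, budget, n, rng):
    """Draw n candidates whose weighted L1 modification cost is at most budget."""
    d = rng.uniform(-1.0, 1.0, size=(n, len(x0)))
    cost = np.maximum((np.abs(d) * costs).sum(1), 1e-12)
    scale = budget * rng.uniform(0.0, 1.0, size=n) / cost
    return x0 + d * scale[:, None]

def bo_attack(query, x0, costs, budget, orig_label,
              max_queries=50, n_init=3, n_cand=500, kappa=2.0, seed=0):
    """Return (adversarial example or None, number of queries used)."""
    rng = np.random.default_rng(seed)
    X = sample_within_budget(x0, costs, budget, n_init, rng)
    y = []
    for i, x in enumerate(X):                # a few random queries to seed the GP
        label, conf = query(x)
        if label != orig_label:
            return x, i + 1
        y.append(conf)
    y = np.array(y)
    for q in range(n_init, max_queries):
        cand = sample_within_budget(x0, costs, budget, n_cand, rng)
        mean, std = gp_posterior(X, y, cand)
        # Lower confidence bound: prefer points predicted to have low
        # confidence in the original class, or high uncertainty.
        x = cand[np.argmin(mean - kappa * std)]
        label, conf = query(x)
        if label != orig_label:
            return x, q + 1
        X, y = np.vstack([X, x]), np.append(y, conf)
    return None, max_queries

def random_attack(query, x0, costs, budget, orig_label, max_queries=50, seed=0):
    """Baseline: query uniformly sampled in-budget points until the class flips."""
    rng = np.random.default_rng(seed)
    for q, x in enumerate(sample_within_budget(x0, costs, budget, max_queries, rng)):
        if query(x)[0] != orig_label:
            return x, q + 1
    return None, max_queries

if __name__ == "__main__":
    x0, costs = np.array([0.5, 0.2, 0.0]), np.ones(3)
    for attack in (bo_attack, random_attack):
        x_adv, n = attack(toy_classifier, x0, costs, budget=1.0, orig_label=1)
        print(attack.__name__, "succeeded" if x_adv is not None else "failed", "after", n, "queries")
```

Because the loop stops as soon as the predicted class changes, every confidence score it records refers to the original class, so the surrogate can model that score directly. Query counts from this toy setup say nothing about the paper's reported results. They depend on the made-up classifier, the budget, and the random seed.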