Robustification of deep net classifiers by key based diversified aggregation with pre-filtering
This work addresses adversarial attacks on machine learning classifiers and is aimed at machine learning practitioners. It offers a defense strategy designed to remain robust even when the attacker has extensive knowledge of the system, though it appears incremental because it builds on existing randomization and aggregation techniques.
The paper tackles the vulnerability of deep neural network classifiers to adversarial attacks by proposing a Key based Diversified Aggregation (KDA) defense mechanism. KDA applies secret-key-based randomization across multiple channels so that an attacker can neither back-propagate gradients through the system nor build a "bypass" system. The authors report high robustness against gradient-based attacks such as Carlini-Wagner and non-gradient-based attacks such as OnePixel.
In this paper, we address the vulnerability of machine learning systems to adversarial attacks. We propose and investigate a Key based Diversified Aggregation (KDA) mechanism as a defense strategy. The KDA assumes that the attacker (i) knows the architecture of the classifier and the defense strategy used, (ii) has access to the training data set, but (iii) does not know the secret key. The robustness of the system is achieved by a specially designed key-based randomization. The proposed randomization prevents the back-propagation of gradients and the creation of a "bypass" system. The randomization is performed simultaneously in several channels, and a multi-channel aggregation stabilizes the results of the randomization by aggregating the soft outputs of the classifiers in the multi-channel system. The experimental evaluation demonstrates the high robustness and universality of the KDA against the most efficient gradient-based attacks, such as those proposed by N. Carlini and D. Wagner, and against non-gradient-based sparse adversarial perturbations, such as OnePixel attacks.
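The multi-channel structure described in the abstract can be illustrated with a minimal sketch. The abstract does not specify the randomization, the classifiers, or the aggregation rule, so everything concrete here is an assumption made for illustration: the randomization is a key-seeded permutation of the input, each channel's classifier is a toy linear layer with softmax standing in for a deep net, and aggregation is the average of the per-channel soft outputs. The names `channel_rng`, `key_permutation`, `train_stub`, and `kda_predict` are hypothetical.

```python
import hashlib
import math
import random


def channel_rng(secret_key: bytes, channel: int) -> random.Random:
    """Derive a per-channel RNG from the secret key (hypothetical scheme)."""
    digest = hashlib.sha256(secret_key + channel.to_bytes(4, "big")).digest()
    return random.Random(int.from_bytes(digest, "big"))


def key_permutation(secret_key: bytes, channel: int, n: int) -> list:
    """Key-dependent permutation of n input positions for one channel."""
    perm = list(range(n))
    channel_rng(secret_key, channel).shuffle(perm)
    return perm


def softmax(logits):
    """Numerically stable softmax over a list of logits."""
    m = max(logits)
    exps = [math.exp(v - m) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


def toy_classifier(weights, x):
    """Stand-in for a trained deep net: one linear layer plus softmax."""
    return softmax([sum(w * v for w, v in zip(row, x)) for row in weights])


def train_stub(base_weights, secret_key, channel):
    """Pretend training: permute the base weights to match the channel's key.

    A real system would instead train each channel's network on
    key-randomized data; this shortcut is an assumption of the sketch.
    """
    perm = key_permutation(secret_key, channel, len(base_weights[0]))
    return [[row[i] for i in perm] for row in base_weights]


def kda_predict(x, secret_key, channel_weights):
    """Randomize x per channel, classify, and average the soft outputs."""
    totals = [0.0] * len(channel_weights[0])
    for channel, weights in enumerate(channel_weights):
        perm = key_permutation(secret_key, channel, len(x))
        randomized = [x[i] for i in perm]
        probs = toy_classifier(weights, randomized)
        totals = [t + p for t, p in zip(totals, probs)]
    avg = [t / len(channel_weights) for t in totals]
    return avg.index(max(avg)), avg


if __name__ == "__main__":
    key = b"secret-key"  # hypothetical key, known only to the defender
    base = [[1.0, 0.0, 0.0, 0.0], [0.0, 0.0, 0.0, 1.0]]
    weights = [train_stub(base, key, c) for c in range(3)]
    label, probs = kda_predict([1.0, 0.0, 0.0, 2.0], key, weights)
    print(label, probs)
```

In this toy setting, calling `kda_predict` with a key other than the one used in `train_stub` mismatches the permutations and the channel weights, so the soft outputs generally change. This loosely mirrors the assumption that the attacker knows the architecture and the data but not the key. It is not a reproduction of the paper's actual randomization.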