Adversarial-Aware Deep Learning System based on a Secondary Classical Machine Learning Verification Approach
This work addresses adversarial attacks on image classifiers, an AI security problem, but its contribution is incremental: it builds on existing defense ideas by adding a hybrid verification step.
The paper tackles the vulnerability of deep learning models to adversarial attacks by proposing a system in which a classical machine learning model acts as a secondary verifier to detect attacks. The authors report that, in experiments on CIFAR-100, it outperforms current state-of-the-art defense systems.
Deep learning models underpin many effective image classification applications. However, they are vulnerable to adversarial attacks that aim to mislead a model into predicting an incorrect class. Our study of major adversarial attack models shows that they all specifically target and exploit neural network structures in their designs. This observation leads us to hypothesize that most classical machine learning models, such as Random Forest (RF), are immune to these attacks because they do not rely on a neural network design at all. Our experimental study of classical machine learning models against popular adversarial attacks supports this hypothesis. Building on it, we propose a new adversarial-aware deep learning system that uses a classical machine learning model as a secondary verification system to complement the primary deep learning model in image classification. Although the secondary classical model is less accurate, it is used only for verification, so it does not affect the output accuracy of the primary deep learning model while still detecting an adversarial attack when a clear mismatch occurs between the two models' outputs. Our experiments on the CIFAR-100 dataset show that the proposed approach outperforms current state-of-the-art adversarial defense systems.
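The verification idea described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation. The abstract does not define what counts as a "clear mismatch", so the sketch assumes one possible rule: flag the input when the primary model's label is missing from the secondary model's top-k classes. The names `verify_prediction`, `primary_predict`, `secondary_scores`, and `top_k`, as well as the stub models, are hypothetical and introduced only for illustration.

```python
from typing import Callable, Sequence, Tuple

Image = Sequence[float]  # flattened pixel values; a placeholder type for this sketch


def verify_prediction(
    image: Image,
    primary_predict: Callable[[Image], int],
    secondary_scores: Callable[[Image], Sequence[float]],
    top_k: int = 5,
) -> Tuple[int, bool]:
    """Return (primary_label, flagged).

    ASSUMPTION: a "clear mismatch" means the primary label is absent from
    the secondary model's top_k classes. The source does not define the rule.
    """
    label = primary_predict(image)
    scores = secondary_scores(image)
    # Rank class indices from highest to lowest secondary score.
    ranked = sorted(range(len(scores)), key=lambda c: scores[c], reverse=True)
    flagged = label not in ranked[:top_k]
    # The primary label is returned unchanged; the secondary model only raises a flag.
    return label, flagged


# Hypothetical stand-ins for trained models (not from the source).
def primary_clean(image: Image) -> int:
    return 2  # pretend the network classifies the clean image as class 2


def primary_fooled(image: Image) -> int:
    return 7  # pretend a perturbed input pushed the network to class 7


def secondary(image: Image) -> Sequence[float]:
    # Pretend per-class scores from a classical model such as a Random Forest.
    return [0.10, 0.20, 0.60, 0.05, 0.05, 0.0, 0.0, 0.0]


image = [0.0] * 16  # dummy input
print(verify_prediction(image, primary_clean, secondary, top_k=2))
print(verify_prediction(image, primary_fooled, secondary, top_k=2))
```

With these stubs, the first call is not flagged because class 2 is the secondary model's top choice, while the second call is flagged because class 7 falls outside its top two. Using a top-k check rather than exact agreement is one way to tolerate the secondary model's lower accuracy. A suitable k would have to be chosen empirically.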