A Kernelized Manifold Mapping to Diminish the Effect of Adversarial Perturbations
This paper addresses the adversarial robustness of deep learning models for image classification and segmentation, but it appears incremental because it builds on existing defense strategies.
The paper tackles the vulnerability of deep convolutional models to adversarial perturbations by proposing a non-linear radial basis convolutional feature mapping. The mapping projects features onto a linearly well-separated manifold, which increases resilience to attacks without reducing accuracy on clean data.
The linear and inflexible nature of deep convolutional models makes them vulnerable to carefully crafted adversarial perturbations. To tackle this problem, we propose a non-linear radial basis convolutional feature mapping by learning a Mahalanobis-like distance function. Our method then maps the convolutional features onto a linearly well-separated manifold, which prevents small adversarial perturbations from forcing a sample to cross the decision boundary. We test the proposed method on three publicly available image classification and segmentation datasets, namely MNIST, ISBI ISIC 2017 skin lesion segmentation, and NIH Chest X-Ray-14. We evaluate the robustness of our method to different gradient-based (targeted and untargeted) and non-gradient-based attacks and compare it to several non-gradient masking defense strategies. Our results demonstrate that the proposed method can increase the resilience of deep convolutional neural networks to adversarial perturbations without a drop in accuracy on clean data.
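To make the idea of a radial basis mapping with a Mahalanobis-like distance concrete, the following is a minimal NumPy sketch, not the authors' implementation. It assumes one common parameterization: a matrix P defines the squared distance ||P(x - c)||^2 between a feature vector x and a center c, and the activation is exp(-gamma * distance). The names `rbf_feature_map`, `centers`, `P`, and `gamma` are hypothetical, and in a trained model `centers` and `P` would be learned rather than fixed.

```python
import numpy as np

def rbf_feature_map(features, centers, P, gamma=1.0):
    """Illustrative RBF mapping with a Mahalanobis-like distance.

    features: (n, d) array of feature vectors (e.g. flattened conv features)
    centers:  (k, d) array of RBF centers (assumed learnable)
    P:        (m, d) transformation matrix (assumed learnable); P.T @ P acts
              as the positive semi-definite matrix of the distance
    gamma:    width parameter of the radial basis function (assumption)

    Returns an (n, k) array of activations in (0, 1].
    """
    # Pairwise differences between every feature vector and every center.
    diff = features[:, None, :] - centers[None, :, :]   # (n, k, d)
    # Apply the transformation: ||P (x - c)||^2 is the squared distance.
    proj = diff @ P.T                                    # (n, k, m)
    sq_dist = np.sum(proj ** 2, axis=-1)                 # (n, k)
    # Gaussian radial basis activation: 1 at the center, decaying with distance.
    return np.exp(-gamma * sq_dist)

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    feats = rng.normal(size=(5, 8))      # 5 samples, 8-dimensional features
    centers = rng.normal(size=(3, 8))    # 3 hypothetical RBF centers
    P = np.eye(8)                        # identity P reduces to Euclidean distance
    print(rbf_feature_map(feats, centers, P).shape)
```

With `P` set to the identity, the mapping reduces to a standard Gaussian RBF on Euclidean distance; a non-identity `P` rescales or discards feature directions, which is what makes the distance "Mahalanobis-like" in this sketch.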