Deep-RBF Networks Revisited: Robust Classification with Rejection
This work addresses the problem of adversarial robustness in deep learning for security-critical applications, offering an incremental improvement by adapting existing deep-RBF networks.
The paper tackles the vulnerability of deep neural networks to adversarial attacks by revisiting deep-RBF networks. It proposes a learning algorithm that avoids vanishing gradients and adds a reject option for robustness, reporting high classification accuracy and strong resistance to attacks on MNIST.
One of the main drawbacks of deep neural networks, like many other classifiers, is their vulnerability to adversarial attacks. An important reason for this vulnerability is that they assign high confidence to regions with few or even no feature points. By feature points, we mean the images of the input data under a nonlinear transformation of the input space that extracts a meaningful representation. Deep-RBF networks, in contrast, assign high confidence only to regions containing enough feature points, but they have been discounted due to the widely held belief that they suffer from the vanishing gradient problem. In this paper, we revisit deep-RBF networks by first giving a general formulation for them and then proposing a family of cost functions for them inspired by metric learning. In the proposed deep-RBF learning algorithm, the vanishing gradient problem does not occur. We make these networks robust to adversarial attacks by adding a reject option to their output layer. Through several experiments on the MNIST dataset, we demonstrate that our proposed method not only achieves high classification accuracy but is also very resistant to various adversarial attacks.
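The idea of a distance-based output layer with a reject option can be illustrated with a minimal NumPy sketch. This is not the paper's formulation, which the abstract does not spell out: it assumes one center per class in feature space, squared Euclidean distance, and a single hand-picked rejection threshold. The names `rbf_distances`, `predict_with_reject`, `REJECT`, and `threshold` are hypothetical and chosen only for illustration.

```python
import numpy as np

REJECT = -1  # hypothetical label meaning "no class is close enough"


def rbf_distances(features, centers):
    """Squared Euclidean distance from each feature vector to each class center.

    features: array of shape (n_samples, n_dims)
    centers:  array of shape (n_classes, n_dims), one assumed center per class
    returns:  array of shape (n_samples, n_classes)
    """
    diff = features[:, None, :] - centers[None, :, :]
    return np.sum(diff ** 2, axis=-1)


def predict_with_reject(features, centers, threshold):
    """Pick the nearest class center, or REJECT if even that one is too far."""
    d2 = rbf_distances(features, centers)
    nearest = np.argmin(d2, axis=1)
    min_d2 = d2[np.arange(len(features)), nearest]
    # A point far from every center lies in a region with no feature points,
    # so it is rejected instead of being assigned a confident label.
    return np.where(min_d2 <= threshold, nearest, REJECT)


# Toy data: two class centers and three feature vectors (assumed values).
centers = np.array([[0.0, 0.0], [4.0, 4.0]])
features = np.array([[0.1, -0.2], [3.9, 4.2], [20.0, -15.0]])
labels = predict_with_reject(features, centers, threshold=1.0)
print(labels)  # [ 0  1 -1]
```

In this toy example the first two points fall close to a center and are classified, while the third is far from both centers and is rejected. In a real deep-RBF network the features would come from the learned nonlinear transformation and the centers would be trained, neither of which is modeled here.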