The Artificial Mind's Eye: Resisting Adversarials for Convolutional Neural Networks using Internal Projection
This work addresses adversarial attacks, a critical issue for AI security, though it appears incremental because it builds on existing robustness methods.
The paper tackles the problem of adversarial robustness in convolutional neural networks by introducing a novel architecture that forces the network to redraw and compare images as a proof of object presence, resulting in improved resistance to adversarial inputs.
We introduce a novel artificial neural network architecture that integrates robustness to adversarial input into the network structure. The main idea of our approach is to force the network to predict what the given instance of the class under consideration would look like and then to test those predictions. By forcing the network to redraw the relevant parts of the image and then comparing this new image to the original, we have the network give a "proof" of the presence of the object.
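The redraw-and-compare idea can be illustrated with a toy sketch. This is not the architecture from the paper: the fixed per-class templates below are a hypothetical stand-in for a learned redrawing module, and the names `TEMPLATES`, `redraw`, `reconstruction_error`, `verify_prediction`, and the `threshold` value are all assumptions made for illustration. The sketch only shows the control flow, in which a predicted label is accepted when the redrawn image is close enough to the original.

```python
from typing import Dict, List, Tuple

Image = List[List[float]]

# Hypothetical stand-in for a learned "redraw" module:
# one fixed 3x3 prototype per class (an assumption, not the paper's method).
TEMPLATES: Dict[str, Image] = {
    "vertical_bar": [[0.0, 1.0, 0.0], [0.0, 1.0, 0.0], [0.0, 1.0, 0.0]],
    "horizontal_bar": [[0.0, 0.0, 0.0], [1.0, 1.0, 1.0], [0.0, 0.0, 0.0]],
}


def redraw(label: str) -> Image:
    """Return what the network 'expects' an instance of this class to look like."""
    return TEMPLATES[label]


def reconstruction_error(original: Image, redrawn: Image) -> float:
    """Mean squared pixel difference between the original and the redrawn image."""
    diffs = [
        (o - r) ** 2
        for row_o, row_r in zip(original, redrawn)
        for o, r in zip(row_o, row_r)
    ]
    return sum(diffs) / len(diffs)


def verify_prediction(
    image: Image, predicted_label: str, threshold: float = 0.1
) -> Tuple[bool, float]:
    """Accept the label only if the redrawn image matches the input closely."""
    err = reconstruction_error(image, redraw(predicted_label))
    return err <= threshold, err


# A slightly noisy vertical bar.
noisy_bar: Image = [[0.05, 0.9, 0.0], [0.0, 1.0, 0.1], [0.0, 0.95, 0.0]]

accepted, err_ok = verify_prediction(noisy_bar, "vertical_bar")
# Simulates a classifier that was fooled into predicting the wrong class.
rejected, err_bad = verify_prediction(noisy_bar, "horizontal_bar")
```

In this toy setting, a label that an attack has pushed the classifier toward fails the check because its redrawn image does not match the input. A real system would learn the redrawing step and would compare only the relevant parts of the image.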