Spartan Networks: Self-Feature-Squeezing Neural Networks for increased robustness in adversarial settings
This work addresses adversarial robustness for deep learning systems, but it is incremental: it builds on existing adversarial defense methods and contributes one specific technique.
The paper tackles the vulnerability of deep learning models to adversarial examples by introducing Spartan Networks, which use a new activation function to discard irrelevant input information. On the MNIST dataset, this yields slightly lower precision but higher robustness under attack.
Deep learning models are vulnerable to adversarial examples, which are input samples modified to maximize the system's error. We introduce Spartan Networks, resistant deep neural networks that require neither input preprocessing nor adversarial training. These networks have an adversarial layer designed to discard some of the information in the network, thus forcing the system to focus on relevant input. This is done using a new activation function that discards data. The added layer trains the neural network to filter out the usually irrelevant parts of its input. Our performance evaluation shows that Spartan Networks have a slightly lower precision but a higher robustness under attack than unprotected models. The results of this study of adversarial AI as a new attack vector are based on tests conducted on the MNIST dataset.
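The abstract does not specify the activation function, so the following is only a minimal sketch of the general idea of an information-discarding activation, not the authors' method. It assumes inputs scaled to [0, 1] (as MNIST pixels commonly are) and quantizes each value to a small number of levels. The name `squeeze_activation` and the `levels` parameter are hypothetical and chosen for illustration.

```python
def squeeze_activation(values, levels=2):
    """Quantize each value to one of `levels` evenly spaced steps in [0, 1].

    Hypothetical illustration of an information-discarding activation;
    the actual Spartan Networks activation is not given in the abstract.
    """
    if levels < 2:
        raise ValueError("levels must be at least 2")
    steps = levels - 1
    squeezed = []
    for v in values:
        clipped = min(1.0, max(0.0, v))       # keep the value inside [0, 1]
        index = int(clipped * steps + 0.5)    # nearest step, rounding half up
        squeezed.append(index / steps)
    return squeezed


# Small perturbations that stay on the same side of a threshold are erased.
clean = [0.02, 0.48, 0.97]
perturbed = [0.07, 0.44, 0.91]
print(squeeze_activation(clean))
print(squeeze_activation(perturbed))
```

With `levels=2`, both inputs map to the same output, which illustrates how discarding low-order information can remove small adversarial perturbations. A step function like this has zero gradient almost everywhere, so training a network through it would presumably need a surrogate gradient. That detail is outside what the abstract describes.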