Boosting Verified Training for Robust Image Classifications via Abstraction
This work addresses the need for more robust and verifiable image classifiers, particularly in security-critical applications, and reports significant performance gains over existing methods.
The paper tackles the problem of improving robustness in image classifiers by proposing an abstraction-based certified training method, which reduces verified errors by up to 95.64% and achieves up to 602.50x speedup while scaling to models with up to 138 million parameters.
This paper proposes a novel, abstraction-based, certified training method for robust image classifiers. Via abstraction, all perturbed images are mapped into intervals before being fed into neural networks for training. Because training is performed on intervals, all perturbed images that are mapped to the same interval are assigned the same label, which keeps the variance of the training set small and the loss landscape of the model smooth. Consequently, our approach significantly improves the robustness of trained models. Owing to the abstraction, our training method also enables a sound and complete black-box verification approach, which is independent of network size and architecture and therefore scales to arbitrary types of neural networks. We evaluate our method on a wide range of benchmarks at different scales. The experimental results show that our method outperforms the state of the art by (i) reducing the verified errors of trained models by up to 95.64%; (ii) achieving up to 602.50x speedup in total; and (iii) scaling to larger models with up to 138 million trainable parameters. The demo is available at https://github.com/zhangzhaodi233/ABSCERT.git.
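The interval abstraction described above can be illustrated with a minimal sketch. It is not taken from the ABSCERT repository: the function names `abstract_pixel` and `abstract_image`, the uniform grid of width `d`, and the assumption that pixel values are normalized to [0, 1] are all hypothetical choices made for illustration. The point is only to show how an image and a slightly perturbed copy can collapse to the same interval representation, which is what a network trained on intervals would then receive as input.

```python
import math

def abstract_pixel(x, d):
    """Map a pixel value x in [0, 1] to the grid interval of width d containing it.

    Assumption: a uniform grid with cells [k*d, (k+1)*d]; the paper's exact
    abstraction may differ.
    """
    n_cells = round(1.0 / d)
    # Clamp the index so that x == 1.0 falls into the last cell.
    k = min(math.floor(x / d), n_cells - 1)
    return (k * d, (k + 1) * d)

def abstract_image(pixels, d):
    """Abstract a flat list of pixel values into a list of (lower, upper) intervals."""
    return [abstract_pixel(x, d) for x in pixels]

# Toy example: three pixels and a slightly perturbed copy.
image = [0.30, 0.60, 0.90]
perturbed = [0.33, 0.58, 0.95]
d = 0.25  # hypothetical abstraction granularity

# Both images fall into the same grid cells, so a model trained on intervals
# would see identical inputs for them.
print(abstract_image(image, d) == abstract_image(perturbed, d))  # True
```

In this toy setting the perturbation stays inside each pixel's grid cell. A perturbation that crosses a cell boundary would map to a different interval, so the choice of granularity relative to the perturbation budget matters.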