Adaptive Certified Training: Towards Better Accuracy-Robustness Tradeoffs
This work addresses the accuracy-robustness tradeoff that arises when deploying robust deep learning models in real-world systems, and it represents an incremental improvement over existing certified training methods.
The paper tackles the low standard accuracy of certified training methods for deep learning models by proposing an adaptive certified training approach that improves both accuracy and robustness. On CIFAR-10 and TinyImageNet, it achieves up to two times higher robustness, measured as the average certified radius, than baseline approaches at the same levels of standard accuracy.
As deep learning models continue to advance and are increasingly used in real-world systems, robustness remains a major challenge. Existing certified training methods produce models that achieve high provable robustness guarantees at certain perturbation levels. However, the main problem with such models is their dramatically low standard accuracy, i.e., accuracy on clean, unperturbed data, which makes them impractical. In this work, we take a more realistic perspective: maximizing the robustness of a model at certain levels of (high) standard accuracy. To this end, we propose a novel certified training method based on the key insight that training with adaptive certified radii helps to improve both the accuracy and robustness of the model, advancing state-of-the-art accuracy-robustness tradeoffs. We demonstrate the effectiveness of the proposed method on the MNIST, CIFAR-10, and TinyImageNet datasets. In particular, on CIFAR-10 and TinyImageNet, our method yields models with up to two times higher robustness, measured as the average certified radius over the test set, at the same levels of standard accuracy compared to baseline approaches.
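To make the two reported quantities concrete, the following is a minimal sketch of how standard accuracy and the average certified radius over a test set might be computed. It is an illustration under stated assumptions, not the authors' evaluation code: the function names are hypothetical, the per-example certified radii are assumed to come from some external verifier, and misclassified examples are assumed to contribute a radius of zero, which is a common convention that the abstract itself does not specify.

```python
from typing import Sequence


def standard_accuracy(preds: Sequence[int], labels: Sequence[int]) -> float:
    """Fraction of clean (unperturbed) test inputs that are classified correctly."""
    if len(preds) != len(labels):
        raise ValueError("preds and labels must have the same length")
    if not labels:
        raise ValueError("test set must be non-empty")
    correct = sum(1 for p, y in zip(preds, labels) if p == y)
    return correct / len(labels)


def average_certified_radius(
    radii: Sequence[float], preds: Sequence[int], labels: Sequence[int]
) -> float:
    """Mean certified radius over the test set.

    Assumption (not stated in the abstract): a misclassified input
    contributes a radius of 0, since no robustness can be certified for it.
    `radii` holds the radius a verifier certified for each input.
    """
    if not (len(radii) == len(preds) == len(labels)):
        raise ValueError("radii, preds, and labels must have the same length")
    if not labels:
        raise ValueError("test set must be non-empty")
    total = sum(r if p == y else 0.0 for r, p, y in zip(radii, preds, labels))
    return total / len(labels)


# Toy example with three test inputs; the second one is misclassified.
toy_radii = [0.5, 0.2, 0.3]
toy_preds = [1, 0, 2]
toy_labels = [1, 1, 2]
toy_acc = standard_accuracy(toy_preds, toy_labels)
toy_acr = average_certified_radius(toy_radii, toy_preds, toy_labels)
```

Comparing methods "at the same levels of standard accuracy" then amounts to comparing the second quantity between models whose first quantity matches.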