Voting based ensemble improves robustness of defensive models
This work provides an incremental improvement in adversarial robustness for deep learning models, which is important for applications requiring high security and reliability.
This paper explores whether ensembling pre-trained robust models can further improve robustness against adversarial perturbations. It demonstrates that a hard-label voting ensemble, given sufficiently diverse robust training losses, can improve robust accuracy over each individual model, achieving 59.8% robust accuracy on CIFAR-10 without additional data.
Developing models that are robust to adversarial perturbations has been an active area of research, and many algorithms have been proposed to train individual robust models. Taking these pre-trained robust models, we study whether an ensemble of them can further improve robustness. Several previous attempts tackled this problem by ensembling soft-label predictions and have been shown to be vulnerable to the latest attack methods. In this paper, we show that if the robust training losses are diverse enough, a simple hard-label voting ensemble can improve robust accuracy over each individual model. Furthermore, given a pool of robust models, we develop a principled way to select which models to ensemble. Finally, to verify the improved robustness, we conduct extensive experiments on how to attack a voting-based ensemble and develop several new white-box attacks. On the CIFAR-10 dataset, by ensembling several state-of-the-art pre-trained defense models, our method achieves 59.8% robust accuracy, outperforming all existing defensive models that do not use additional data.
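As a rough illustration of the hard-label voting described above, the following minimal sketch takes each model's predicted class (its argmax) and returns the class with the most votes. It is not the authors' implementation: the function name `hard_label_vote`, the array layout, and the tie-breaking rule (lowest class index wins) are all assumptions made for the example, and the paper's model-selection procedure and attacks are not shown.

```python
import numpy as np


def hard_label_vote(logits_per_model, num_classes):
    """Majority vote over per-model argmax labels (hypothetical helper).

    logits_per_model: array-like of shape (n_models, n_samples, num_classes).
    Returns an integer array of shape (n_samples,).
    Assumption: ties are broken in favor of the lowest class index.
    """
    logits = np.asarray(logits_per_model)
    # Hard labels: each model contributes only its top class, not its scores.
    labels = logits.argmax(axis=2)  # shape (n_models, n_samples)
    n_samples = labels.shape[1]
    votes = np.zeros((n_samples, num_classes), dtype=int)
    for model_labels in labels:
        # One vote per model per sample.
        votes[np.arange(n_samples), model_labels] += 1
    return votes.argmax(axis=1)


# Toy example: 3 models, 2 samples, 3 classes (made-up scores).
toy_logits = np.array([
    [[2.0, 1.0, 0.0], [0.0, 1.0, 2.0]],  # model A predicts classes 0 and 2
    [[1.0, 3.0, 0.0], [0.0, 0.0, 5.0]],  # model B predicts classes 1 and 2
    [[4.0, 0.0, 1.0], [3.0, 1.0, 0.0]],  # model C predicts classes 0 and 0
])
predictions = hard_label_vote(toy_logits, num_classes=3)
```

Because only the argmax of each model enters the vote, the magnitude of any single model's scores has no effect on the outcome, which is the key difference from averaging soft-label predictions.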