MixACM: Mixup-Based Robustness Transfer via Distillation of Activated Channel Maps
This work addresses two drawbacks of adversarial training that matter to practitioners in security-critical domains: its computational cost and its degradation of standard accuracy. The contribution is incremental, since it builds on existing knowledge transfer and mixup techniques.
The paper tackles adversarial robustness in deep neural networks by proposing a knowledge transfer method that uses mixup augmentation to transfer robustness from an adversarially trained teacher model to a student model. The student gains robustness and improved generalization on natural images without itself undergoing expensive adversarial training.
Deep neural networks are susceptible to small, imperceptible, adversarially crafted changes to natural inputs. The most effective defense against these adversarial examples is adversarial training, which constructs adversarial examples during training by iteratively maximizing the loss. The model is then trained to minimize the loss on these constructed examples. This min-max optimization requires more data, larger-capacity models, and additional computing resources. It also degrades the standard generalization performance of a model. Can we achieve robustness more efficiently? In this work, we explore this question from the perspective of knowledge transfer. First, we theoretically show that robustness can be transferred from an adversarially trained teacher model to a student model with the help of mixup augmentation. Second, we propose a novel robustness transfer method called Mixup-Based Activated Channel Maps (MixACM) Transfer. MixACM transfers robustness from a robust teacher to a student by matching activated channel maps generated without expensive adversarial perturbations. Finally, extensive experiments on multiple datasets and different learning scenarios show that our method can transfer robustness while also improving generalization on natural images.
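The two ingredients named in the abstract, mixup and the matching of channel maps between teacher and student, can be illustrated with a minimal Python sketch under stated assumptions. The abstract does not define activated channel maps, so `channel_map` below (the per-channel mean of positive activations, normalized to sum to one) is a hypothetical stand-in rather than the paper's definition. The function names, the squared-error distance, and the toy feature values are likewise illustrative only.

```python
import random


def mixup(x_a, x_b, lam):
    """Standard mixup: convex combination of two inputs (flat lists of floats)."""
    return [lam * a + (1.0 - lam) * b for a, b in zip(x_a, x_b)]


def channel_map(features):
    """Hypothetical channel map (an assumption, not the paper's definition).

    `features` is a list of channels, each a list of spatial activations.
    Each channel is scored by the mean of its positive activations, and the
    scores are normalized so that they sum to one.
    """
    scores = [sum(max(v, 0.0) for v in ch) / len(ch) for ch in features]
    total = sum(scores)
    if total == 0.0:
        return [0.0] * len(scores)
    return [s / total for s in scores]


def channel_map_loss(teacher_feats, student_feats):
    """Mean squared difference between teacher and student channel maps."""
    t = channel_map(teacher_feats)
    s = channel_map(student_feats)
    return sum((a - b) ** 2 for a, b in zip(t, s)) / len(t)


# Mixup on clean inputs only: no adversarial perturbation is computed.
x_mixed = mixup([0.0, 2.0], [2.0, 0.0], lam=0.5)

# In practice the mixing weight is usually sampled from a Beta distribution.
lam_sampled = random.betavariate(1.0, 1.0)

# Toy intermediate features (2 channels, 2 spatial positions each) that a
# teacher and a student might produce for the same mixed input.
teacher_feats = [[1.0, -1.0], [3.0, 1.0]]
student_feats = [[2.0, 0.0], [1.0, 1.0]]
loss = channel_map_loss(teacher_feats, student_feats)
```

In a real training loop, a loss of this kind would be computed from the intermediate features of a frozen robust teacher and a trainable student on the same mixed inputs, and minimized with respect to the student's parameters. Whether it is combined with a classification loss, and at which layers it is applied, is not stated in the abstract.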