CV · Dec 6, 2023

Indirect Gradient Matching for Adversarial Robust Distillation

arXiv:2312.03286v2 · 6 citations · h-index: 5 · ICLR
AI Analysis

This work addresses the adversarial robustness gap between large and small models in security-critical applications. It represents an incremental improvement over existing adversarial distillation methods.

The paper aims to improve the adversarial robustness of smaller models through adversarial distillation. It proposes an Indirect Gradient Distillation Module (IGDM) that transfers input-gradient knowledge from the teacher model to the student model. When IGDM is integrated into the state-of-the-art method without additional data augmentation, AutoAttack accuracy on CIFAR-100 improves from 28.06% to 30.32% for ResNet-18 and from 26.18% to 29.32% for MobileNetV2.

Adversarial training significantly improves adversarial robustness, but superior performance is primarily attained with large models. This substantial performance gap for smaller models has spurred active research into adversarial distillation (AD) to mitigate the difference. Existing AD methods leverage the teacher's logits as a guide. In contrast to these approaches, we aim to transfer another piece of knowledge from the teacher, the input gradient. In this paper, we propose a distillation module termed Indirect Gradient Distillation Module (IGDM) that indirectly matches the student's input gradient with that of the teacher. Experimental results show that IGDM seamlessly integrates with existing AD methods, significantly enhancing their performance. Particularly, utilizing IGDM on the CIFAR-100 dataset improves the AutoAttack accuracy from 28.06% to 30.32% with the ResNet-18 architecture and from 26.18% to 29.32% with the MobileNetV2 architecture when integrated into the SOTA method without additional data augmentation.
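
The abstract does not spell out how the "indirect" matching is done. As a minimal sketch, assume it relies on a first-order Taylor argument: for a small perturbation delta, f(x + delta) - f(x - delta) is approximately 2 * J_f(x) * delta, so aligning the student's and teacher's output differences aligns their input gradients along delta without computing either gradient explicitly. The toy models, the weights, and the names `linear_tanh`, `output_difference`, and `indirect_gradient_loss` are all hypothetical, and this is not the authors' implementation.

```python
import math

def linear_tanh(weights, x):
    """Toy model (assumption): output_k = tanh(sum_i weights[k][i] * x[i])."""
    return [math.tanh(sum(w * xi for w, xi in zip(row, x))) for row in weights]

def output_difference(model, x, delta):
    """Central difference f(x + delta) - f(x - delta), roughly 2 * J_f(x) @ delta."""
    plus = model([xi + di for xi, di in zip(x, delta)])
    minus = model([xi - di for xi, di in zip(x, delta)])
    return [p - m for p, m in zip(plus, minus)]

def indirect_gradient_loss(student, teacher, x, delta):
    """Squared L2 distance between student and teacher output differences.

    The distance measure is an assumption chosen for simplicity.
    """
    diff_student = output_difference(student, x, delta)
    diff_teacher = output_difference(teacher, x, delta)
    return sum((s - t) ** 2 for s, t in zip(diff_student, diff_teacher))

# Hypothetical weights for a 2-input, 2-output teacher and student.
teacher_w = [[0.5, -0.2], [0.1, 0.3]]
student_w = [[0.4, -0.1], [0.0, 0.2]]

def teacher(x):
    return linear_tanh(teacher_w, x)

def student(x):
    return linear_tanh(student_w, x)

x = [0.2, -0.4]          # a single input point
delta = [0.01, -0.01]    # small perturbation (e.g. an adversarial direction)

loss_mismatch = indirect_gradient_loss(student, teacher, x, delta)
loss_match = indirect_gradient_loss(teacher, teacher, x, delta)
```

In this sketch the loss is zero when the student is identical to the teacher and positive otherwise. In an adversarial distillation setting, such a term would presumably be added to an existing logit-based distillation loss, which is consistent with the abstract's description of IGDM as a module that integrates with existing AD methods.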

