LG · Feb 5, 2021

Show, Attend and Distill: Knowledge Distillation via Attention-based Feature Matching

arXiv:2102.02973v1 · 195 citations · Has Code
Originality: Incremental advance
AI Analysis

This work addresses ineffective manual link selection in knowledge distillation: hand-picked pairings of teacher and student features often limit the gains from distillation, which is a problem for researchers and practitioners working on model compression and transfer learning.

This paper introduces a knowledge distillation method that automatically identifies effective feature links between teacher and student networks using an attention-based meta-network. This approach allows for more efficient determination of competent links and achieves better performance in model compression and transfer learning tasks compared to previous methods.

Knowledge distillation extracts general knowledge from a pre-trained teacher network and provides guidance to a target student network. Most studies manually tie intermediate features of the teacher and student, and transfer knowledge through pre-defined links. However, manual selection often constructs ineffective links that limit the improvement from the distillation. There has been an attempt to address the problem, but it is still challenging to identify effective links under practical scenarios. In this paper, we introduce an effective and efficient feature distillation method utilizing all the feature levels of the teacher without manually selecting the links. Specifically, our method utilizes an attention-based meta-network that learns relative similarities between features, and applies identified similarities to control distillation intensities of all possible pairs. As a result, our method determines competent links more efficiently than the previous approach and provides better performance on model compression and transfer learning tasks. Further qualitative analyses and ablative studies describe how our method contributes to better distillation. The implementation code is available at github.com/clovaai/attention-feature-distillation.
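The core idea in the abstract, weighting every teacher-student feature pair by a learned similarity instead of hand-picking links, can be illustrated with a minimal NumPy sketch. This is not the authors' implementation (see their repository for that). It assumes features from every level have already been pooled and projected to a common dimension `d`, uses fixed random matrices `w_query` and `w_key` as stand-ins for the learned meta-network, and assumes that teacher levels supply the queries and that weights are normalized over student levels. All function and variable names are hypothetical.

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax along the given axis.
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def l2_normalize(x, axis=-1, eps=1e-8):
    # Scale each feature vector to (approximately) unit length.
    return x / (np.linalg.norm(x, axis=axis, keepdims=True) + eps)

def attention_feature_loss(teacher_feats, student_feats, w_query, w_key):
    """Attention-weighted feature-matching loss over all (teacher, student) pairs.

    teacher_feats: (T, d) one pooled feature vector per teacher level (assumed).
    student_feats: (S, d) one pooled feature vector per student level (assumed).
    w_query, w_key: (d, k) projections standing in for the learned meta-network.
    Returns (loss, alpha), where alpha has shape (T, S) and each row sums to 1.
    """
    q = teacher_feats @ w_query                   # (T, k) queries
    k = student_feats @ w_key                     # (S, k) keys
    scores = q @ k.T / np.sqrt(q.shape[1])        # (T, S) scaled similarities
    alpha = softmax(scores, axis=1)               # distillation intensity per pair

    t = l2_normalize(teacher_feats)[:, None, :]   # (T, 1, d)
    s = l2_normalize(student_feats)[None, :, :]   # (1, S, d)
    dist = ((t - s) ** 2).mean(axis=-1)           # (T, S) pairwise feature distance

    loss = float((alpha * dist).sum())            # every pair contributes, weighted
    return loss, alpha

# Toy example: 4 teacher levels, 3 student levels, 16-dim pooled features.
rng = np.random.default_rng(0)
T, S, d, k_dim = 4, 3, 16, 8
teacher = rng.normal(size=(T, d))
student = rng.normal(size=(S, d))
w_q = rng.normal(size=(d, k_dim)) * 0.1
w_k = rng.normal(size=(d, k_dim)) * 0.1
loss, alpha = attention_feature_loss(teacher, student, w_q, w_k)
```

In a real training setup the projections would be trained jointly with the student, so that pairs with higher learned similarity receive larger weights in the loss; here they are fixed only to keep the sketch self-contained.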

Code Implementations: 1 repo
