CV
Dec 11, 2022

Teaching What You Should Teach: A Data-Based Distillation Method

arXiv:2212.05422v6
6 citations
h-index: 23
Originality: Incremental advance
AI Analysis

This work aims to make knowledge distillation more efficient. It appears incremental, since it builds on existing distillation frameworks.

The paper tackles inefficient knowledge distillation by proposing a data-based method that generates augmented samples matching the teacher's strengths and the student's weaknesses. It reports state-of-the-art performance on almost all teacher-student pairs across object recognition, detection, and segmentation tasks on CIFAR-10, ImageNet-1k, MS-COCO, and Cityscapes.

In real teaching scenarios, an excellent teacher teaches what he or she is good at but the student is not. This gives the student the best help in making up for his or her weaknesses and becoming well-rounded overall. Inspired by this, we introduce the "Teaching what you Should Teach" strategy into a knowledge distillation framework and propose a data-based distillation method named "TST" that searches for desirable augmented samples so that distillation proceeds more efficiently and rationally. Specifically, we design a neural network-based data augmentation module with a priori bias, which finds samples that match the teacher's strengths but the student's weaknesses by learning the magnitudes and probabilities used to generate suitable data samples. By training the data augmentation module and the generalized distillation paradigm in turn, we obtain a student model with excellent generalization ability. To verify the effectiveness of our method, we conduct extensive comparative experiments on object recognition, detection, and segmentation tasks. The results on the CIFAR-10, ImageNet-1k, MS-COCO, and Cityscapes datasets demonstrate that our method achieves state-of-the-art performance on almost all teacher-student pairs. Furthermore, we conduct visualization studies to explore what magnitudes and probabilities are needed for the distillation process.
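The idea of choosing samples that play to the teacher's strengths and the student's weaknesses can be illustrated with a small sketch. This is not the paper's implementation: the abstract describes a learned neural augmentation module, whereas the code below substitutes a plain search over a few pre-computed candidates. The function names (`teaching_score`, `pick_sample`), the scoring rule (student cross-entropy minus teacher cross-entropy), the temperature value, and all logits are assumptions made for illustration. The distillation loss shown is the commonly used temperature-softened KL divergence, which may differ from the objective the authors use.

```python
import math


def softmax(logits, temperature=1.0):
    """Numerically stable softmax over a list of logits."""
    scaled = [z / temperature for z in logits]
    peak = max(scaled)
    exps = [math.exp(z - peak) for z in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def cross_entropy(logits, label):
    """Cross-entropy of the true label under softmax(logits)."""
    return -math.log(softmax(logits)[label])


def kd_loss(student_logits, teacher_logits, temperature=4.0):
    """KL(teacher || student) on softened outputs, scaled by T^2.

    Assumption: a standard soft-label distillation loss, used here only
    as a stand-in for whatever objective the paper actually optimizes.
    """
    p = softmax(teacher_logits, temperature)
    q = softmax(student_logits, temperature)
    kl = sum(pi * math.log(pi / qi) for pi, qi in zip(p, q))
    return temperature ** 2 * kl


def teaching_score(student_logits, teacher_logits, label):
    """Hypothetical score: large when the teacher is right and the student is not."""
    return cross_entropy(student_logits, label) - cross_entropy(teacher_logits, label)


def pick_sample(candidates, label):
    """Return the candidate augmentation with the highest teaching score."""
    return max(
        candidates,
        key=lambda c: teaching_score(c["student"], c["teacher"], label),
    )


# Hypothetical logits for three augmented views of one image whose true class is 0.
candidates = [
    # Both models are already confident and correct: little left to teach.
    {"name": "mild", "teacher": [4.0, 0.0, 0.0], "student": [3.5, 0.0, 0.0]},
    # The teacher is correct while the student is wrong: a useful sample.
    {"name": "strong", "teacher": [3.0, 0.0, 0.0], "student": [0.0, 1.0, 0.0]},
    # Both models fail in the same way: the teacher has nothing to offer.
    {"name": "too_hard", "teacher": [0.0, 2.0, 0.0], "student": [0.0, 2.0, 0.0]},
]

best = pick_sample(candidates, label=0)
loss_on_best = kd_loss(best["student"], best["teacher"])
print(best["name"], round(loss_on_best, 4))
```

In this toy setting the "strong" view is selected, because it is the only one on which the teacher is confident and correct while the student is not. In an alternating scheme like the one the abstract describes, a selection step of this kind would be followed by a student update on the chosen samples, and the two steps would then repeat.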

