CV | Sep 12, 2022

Switchable Online Knowledge Distillation

arXiv:2209.04996v1 | 44 citations | h-index: 77 | Has Code
Originality: Incremental advance
AI Analysis

This work addresses a specific bottleneck in knowledge distillation for machine learning practitioners, offering an incremental improvement over existing methods.

The paper tackles performance degradation in online knowledge distillation caused by the gap between teacher and student models. It proposes Switchable Online Knowledge Distillation (SwitOKD), which adaptively calibrates the distillation gap during training using a switching strategy and an adaptive threshold. The authors report improved student performance, validated through extensive experiments.

Online Knowledge Distillation (OKD) improves the involved models by reciprocally exploiting the difference between teacher and student. Several crucial bottlenecks over the gap between them -- e.g., Why and when does a large gap harm the performance, especially for the student? How to quantify the gap between teacher and student? -- have received limited formal study. In this paper, we propose Switchable Online Knowledge Distillation (SwitOKD) to answer these questions. Instead of focusing on the accuracy gap at the test phase, as existing methods do, the core idea of SwitOKD is to adaptively calibrate the gap at the training phase, namely the distillation gap, via a switching strategy between two modes -- expert mode (pause the teacher while keeping the student learning) and learning mode (restart the teacher). To maintain an appropriate distillation gap, we further devise an adaptive switching threshold, which provides a formal criterion as to when to switch to learning mode or expert mode, and thus improves the student's performance. Meanwhile, the teacher benefits from our adaptive switching threshold and remains roughly on a par with other online methods. We further extend SwitOKD to multiple networks with two basis topologies. Finally, extensive experiments and analysis validate the merits of SwitOKD for classification over state-of-the-art methods. Our code is available at https://github.com/hfutqian/SwitOKD.
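
The switching strategy described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation. The abstract does not state how the distillation gap is measured or how the adaptive threshold is computed, so the L1 distance between softened outputs used here is an assumption, and `threshold` is a plain placeholder parameter. The switch direction (pause the teacher when the gap is too large) is inferred from the stated purpose of expert mode. All function names are hypothetical.

```python
import math


def softmax(logits, temperature=1.0):
    """Numerically stable softmax over a list of logits."""
    scaled = [z / temperature for z in logits]
    peak = max(scaled)
    exps = [math.exp(z - peak) for z in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def distillation_gap(teacher_logits, student_logits, temperature=1.0):
    """Assumed gap measure: L1 distance between softened predictions.

    The abstract only says the gap is quantified at training time;
    this particular formula is an illustrative choice.
    """
    p_teacher = softmax(teacher_logits, temperature)
    p_student = softmax(student_logits, temperature)
    return sum(abs(t - s) for t, s in zip(p_teacher, p_student))


def choose_mode(gap, threshold):
    """Pick a training mode for the current step.

    Assumption: a gap above the threshold triggers expert mode
    (teacher paused); otherwise learning mode (teacher restarted).
    The real threshold is adaptive; here it is a fixed placeholder.
    """
    return "expert" if gap > threshold else "learning"


def networks_to_update(mode):
    """The student learns in both modes; the teacher only in learning mode."""
    if mode == "learning":
        return ("student", "teacher")
    return ("student",)


if __name__ == "__main__":
    gap = distillation_gap([10.0, 0.0, 0.0], [0.0, 0.0, 0.0])
    mode = choose_mode(gap, threshold=0.5)
    print(mode, networks_to_update(mode))
```

In a real training loop, the result of `networks_to_update` would decide which optimizer steps run for the current batch. The repository linked in the abstract is the reference for the actual gap definition and threshold.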

Code Implementations: 1 repo
