LG Dec 4, 2021

Extracting knowledge from features with multilevel abstraction

arXiv:2112.13642v11.6

Originality: Incremental advance

AI Analysis

This work addresses the need for efficient model deployment on low-resource devices through an incremental improvement in self-knowledge distillation techniques.

The paper tackles the problem of self-knowledge distillation by proposing a novel method that distills knowledge from multilevel abstraction features, showing great effectiveness and generalization across various tasks and model structures.

Knowledge distillation transfers knowledge from a large teacher model to a small student model, greatly improving the student's performance. The student network can then replace the teacher network for deployment on low-resource devices, thanks to its improved performance, smaller parameter count, and shorter inference time. Self-knowledge distillation (SKD), in which the student model serves as its own teacher, has recently attracted considerable attention. To the best of our knowledge, self-knowledge distillation methods fall into two main streams: data augmentation and refined knowledge auxiliary. In this paper, we propose a novel SKD method that differs from these mainstream approaches. Our method distills knowledge from multilevel abstraction features. Experiments and ablation studies show its effectiveness and generalization across a variety of tasks and model structures. Our code has been released on GitHub.
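
For readers unfamiliar with the general idea, the following is a minimal sketch of the classic soft-target distillation loss, in which a student's temperature-softened predictions are pulled toward a teacher's. It is not the multilevel-abstraction method proposed in the paper, whose details the abstract does not give. The function names, the example logits, and the temperature value are illustrative assumptions.

```python
import math


def log_softmax(logits, temperature=1.0):
    """Numerically stable log-softmax of temperature-scaled logits."""
    scaled = [z / temperature for z in logits]
    m = max(scaled)
    log_sum_exp = m + math.log(sum(math.exp(z - m) for z in scaled))
    return [z - log_sum_exp for z in scaled]


def distillation_loss(student_logits, teacher_logits, temperature=4.0):
    """KL(teacher || student) on softened distributions, scaled by T^2.

    Generic soft-target loss used as an illustration only; the temperature
    default is an assumed value, not one taken from the paper.
    """
    log_p_teacher = log_softmax(teacher_logits, temperature)
    log_p_student = log_softmax(student_logits, temperature)
    kl = sum(
        math.exp(lt) * (lt - ls)
        for lt, ls in zip(log_p_teacher, log_p_student)
    )
    # The T^2 factor keeps gradient magnitudes comparable across temperatures.
    return kl * temperature ** 2


if __name__ == "__main__":
    teacher = [2.0, 1.0, 0.1]   # hypothetical teacher logits
    student = [1.5, 1.2, 0.3]   # hypothetical student logits
    print(distillation_loss(student, teacher))
```

In a self-distillation setting the "teacher" logits would come from the same network (for example, from another view of the input or a deeper layer) rather than from a separate, larger model; how the paper constructs them is not stated in the abstract.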
