LG · Dec 4, 2021

Extracting knowledge from features with multilevel abstraction

arXiv:2112.13642v1
Originality: Incremental advance
AI Analysis

This work addresses the need for efficient model deployment on low-resource devices through an incremental improvement in self-knowledge distillation techniques.

The paper tackles the problem of self-knowledge distillation by proposing a novel method that distills knowledge from multilevel abstraction features, showing great effectiveness and generalization across various tasks and model structures.

Knowledge distillation aims to transfer knowledge from a large teacher model to a small student model, substantially improving the student's performance. The student network can then replace the teacher network for deployment on low-resource devices, thanks to its improved performance, lower parameter count, and shorter inference time. Self-knowledge distillation (SKD), in which the student model serves as its own teacher, has recently attracted considerable attention. To the best of our knowledge, self-knowledge distillation can be divided into two main streams: data augmentation and refined knowledge auxiliary. In this paper, we propose a novel SKD method that departs from these mainstream approaches. Our method distills knowledge from multilevel abstraction features. Experiments and ablation studies show its effectiveness and generalization across a variety of tasks and model structures. Our code has been released on GitHub.
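For readers unfamiliar with the teacher-to-student transfer the abstract refers to, the following is a minimal sketch of the classic soft-target distillation loss (temperature-softened KL divergence). It is not the paper's multilevel-feature method, whose details are not given here. The function names `log_softmax` and `distillation_loss` and the default temperature of 4.0 are illustrative assumptions.

```python
import math


def log_softmax(logits, temperature=1.0):
    """Numerically stable log-softmax of temperature-scaled logits."""
    scaled = [z / temperature for z in logits]
    m = max(scaled)
    log_sum_exp = m + math.log(sum(math.exp(z - m) for z in scaled))
    return [z - log_sum_exp for z in scaled]


def distillation_loss(student_logits, teacher_logits, temperature=4.0):
    """Generic soft-target loss: T^2 * KL(teacher || student).

    Assumption: this is the standard formulation of knowledge
    distillation, shown for illustration only. It is not the
    method proposed in the paper summarized above.
    """
    log_p = log_softmax(teacher_logits, temperature)  # teacher (target)
    log_q = log_softmax(student_logits, temperature)  # student
    kl = sum(math.exp(lp) * (lp - lq) for lp, lq in zip(log_p, log_q))
    # The T^2 factor keeps gradient magnitudes comparable across temperatures.
    return temperature ** 2 * kl


if __name__ == "__main__":
    teacher = [2.0, 1.0, 0.1]
    student = [1.5, 1.2, 0.3]
    print(distillation_loss(student, teacher))
```

In a self-distillation setting the teacher logits would come from the same network rather than a separate, larger one; how that signal is constructed is what distinguishes the individual SKD methods.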

