AI · CV · Dec 21, 2021

Multi-Modality Distillation via Learning the teacher's modality-level Gram Matrix

arXiv:2112.11447v1
Originality: Incremental advance
AI Analysis

This paper addresses limited knowledge transfer in multi-modality distillation, a problem of interest to machine learning researchers, though it appears incremental because it builds on existing distillation methods.

The paper tackles the problem of deep differences between teacher and student networks in multi-modality knowledge distillation by forcing the student to learn the teacher's modality relationship information, resulting in a novel distillation paradigm based on learning the teacher's modality-level Gram Matrix.

In multi-modality knowledge distillation research, existing methods have mainly focused on learning only the teacher's final output. As a result, deep differences remain between the teacher network and the student network. It is therefore necessary to force the student network to learn the modality relationship information of the teacher network. To transfer knowledge from teachers to students effectively, the paper adopts a novel modality relation distillation paradigm that models the relationship information among different modalities, that is, learning the teacher's modality-level Gram Matrix.
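The idea of matching a modality-level Gram Matrix can be illustrated with a minimal sketch. The abstract does not specify how the matrix or the loss is computed, so everything below is an assumption made for illustration: each modality is represented by one feature vector, the Gram entry for a pair of modalities is the inner product of their L2-normalized features, and the loss is the mean squared difference between the teacher and student matrices. The function names `modality_gram` and `gram_distillation_loss` are hypothetical and are not taken from the paper.

```python
import math


def l2_normalize(vec):
    """Scale a vector to unit length (assumed preprocessing, not from the paper)."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec] if norm > 0 else list(vec)


def modality_gram(features):
    """Build an M x M Gram matrix from M per-modality feature vectors.

    Entry (i, j) is the inner product of the normalized features of
    modality i and modality j, i.e. their cosine similarity.
    """
    normed = [l2_normalize(f) for f in features]
    return [
        [sum(a * b for a, b in zip(fi, fj)) for fj in normed]
        for fi in normed
    ]


def gram_distillation_loss(teacher_feats, student_feats):
    """Mean squared difference between teacher and student Gram matrices.

    This is an assumed loss form; the paper may use a different distance.
    """
    g_t = modality_gram(teacher_feats)
    g_s = modality_gram(student_feats)
    m = len(g_t)
    if len(g_s) != m:
        raise ValueError("teacher and student need the same number of modalities")
    return sum(
        (g_t[i][j] - g_s[i][j]) ** 2 for i in range(m) for j in range(m)
    ) / (m * m)


# Two modalities; teacher features are 2-D, student features are 3-D.
teacher = [[1.0, 0.0], [0.0, 1.0]]
student = [[2.0, 0.0, 0.0], [0.0, 3.0, 0.0]]
loss = gram_distillation_loss(teacher, student)
```

One property of this formulation is that the Gram matrix has one row and column per modality, so the teacher and student can use different feature dimensions as long as they share the same set of modalities. Within a single network, all modality features must still have the same length for the inner products to be defined.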
