CV · Apr 23, 2019

Student Becoming the Master: Knowledge Amalgamation for Joint Scene Parsing, Depth Estimation, and More

arXiv:1904.10167v1 · 86 citations
Originality: Highly original
AI Analysis

The paper addresses the need for efficient, versatile computer-vision models by transferring knowledge across heterogeneous tasks without costly annotations, although it builds incrementally on existing teacher-student frameworks.

The paper tackles the problem of training a lightweight student model to perform multiple tasks (scene parsing and depth estimation) without human-labeled annotations by amalgamating knowledge from two pretrained teacher models, achieving results superior to the teachers and on par with state-of-the-art supervised models.

In this paper, we investigate a novel deep-model reusing task. Our goal is to train a lightweight and versatile student model, without human-labelled annotations, that amalgamates the knowledge and masters the expertise of two pretrained teacher models working on heterogeneous problems, one on scene parsing and the other on depth estimation. To this end, we propose an innovative training strategy that learns the parameters of the student intertwined with the teachers, achieved by 'projecting' its amalgamated features onto each teacher's domain and computing the loss. We also introduce two options to generalize the proposed training strategy to handle three or more tasks simultaneously. The proposed scheme yields very encouraging results. As demonstrated on several benchmarks, the trained student model achieves results even superior to those of the teachers in their own expertise domains and on par with the state-of-the-art fully supervised models relying on human-labelled annotations.
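The "projecting" step described in the abstract can be illustrated with a small sketch. It is only an illustration under stated assumptions, not the authors' implementation: the abstract does not specify the projection or the loss, so this sketch assumes a linear projection per teacher and a mean-squared-error loss against that teacher's features. All names (`project`, `mse`, `amalgamation_loss`, `proj_seg`, `proj_depth`) are hypothetical.

```python
from typing import List

Vector = List[float]
Matrix = List[List[float]]


def project(features: Vector, weights: Matrix) -> Vector:
    """Linearly map student features into one teacher's feature space.

    Assumption: a plain matrix-vector product stands in for whatever
    projection the paper actually uses.
    """
    return [sum(w * f for w, f in zip(row, features)) for row in weights]


def mse(a: Vector, b: Vector) -> float:
    """Mean squared error between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) / len(a)


def amalgamation_loss(
    student_feat: Vector,
    proj_seg: Matrix,
    proj_depth: Matrix,
    seg_teacher_feat: Vector,
    depth_teacher_feat: Vector,
) -> float:
    """Sum of per-teacher losses on the projected student features.

    No human labels appear here: the only targets are the features
    produced by the two pretrained teachers.
    """
    seg_loss = mse(project(student_feat, proj_seg), seg_teacher_feat)
    depth_loss = mse(project(student_feat, proj_depth), depth_teacher_feat)
    return seg_loss + depth_loss


# Toy example: 2-D student features, a 2-D parsing teacher and a 1-D depth teacher.
student = [1.0, 2.0]
to_seg = [[1.0, 0.0], [0.0, 1.0]]   # identity projection
to_depth = [[0.5, 0.5]]             # averages the two student channels
loss = amalgamation_loss(student, to_seg, to_depth, [1.0, 2.0], [2.5])
```

In a real training loop the student features and both projections would be learned jointly by gradient descent on this combined loss. Handling a third task would presumably mean adding one more projection and one more loss term, though the abstract only says that two generalization options exist and does not describe them.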

Code Implementations: 1 repo