LG · Jan 17, 2022

Distillation from heterogeneous unlabeled collections

arXiv:2201.06507v1
Originality: Incremental advance
AI Analysis

This addresses the need to compress models for constrained settings when the original training data is no longer available, offering a practical approach for image classification tasks.

The paper tackles the problem of compressing deep networks when the original training data is unavailable, by distilling knowledge from a large teacher into a smaller student using unlabeled, heterogeneous data. By preferentially sampling datapoints that appear related to the target task and making better use of the learning signal, it achieves performance close to what can be expected with the original data.

Compressing deep networks is essential to expand their range of applications to constrained settings. The need for compression, however, often arises long after the model was trained, when the original data might no longer be available. On the other hand, unlabeled data, not necessarily related to the target task, is usually plentiful, especially in image classification tasks. In this work, we propose a scheme to leverage such samples to distill the knowledge learned by a large teacher network to a smaller student. The proposed technique relies on (i) preferentially sampling datapoints that appear related, and (ii) taking better advantage of the learning signal. We show that the former speeds up the student's convergence, while the latter boosts its performance, achieving performance close to what can be expected with the original data.
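The two ingredients named in the abstract can be illustrated with a minimal sketch. The abstract does not say how "relatedness" is measured or how the learning signal is improved, so everything below is an assumption. The teacher's top-class probability stands in as a hypothetical relatedness score. The loss is the standard temperature-scaled KL divergence from classic knowledge distillation, not necessarily the paper's objective. The function names `relatedness_weights` and `sample_batch` are illustrative only.

```python
import math
import random


def log_softmax(logits, temperature=1.0):
    """Numerically stable log-softmax of temperature-scaled logits."""
    scaled = [z / temperature for z in logits]
    peak = max(scaled)
    log_norm = peak + math.log(sum(math.exp(z - peak) for z in scaled))
    return [z - log_norm for z in scaled]


def distillation_loss(teacher_logits, student_logits, temperature=2.0):
    """KL(teacher || student) on softened distributions (standard KD loss)."""
    log_p = log_softmax(teacher_logits, temperature)
    log_q = log_softmax(student_logits, temperature)
    return sum(math.exp(lp) * (lp - lq) for lp, lq in zip(log_p, log_q))


def relatedness_weights(teacher_logits_per_sample):
    """Sampling probabilities for the unlabeled pool.

    ASSUMPTION: a sample "appears related" when the teacher is confident
    about it, so the score is the teacher's top-class probability.
    """
    scores = [math.exp(max(log_softmax(logits)))
              for logits in teacher_logits_per_sample]
    total = sum(scores)
    return [s / total for s in scores]


def sample_batch(weights, batch_size, rng):
    """Draw pool indices with replacement, favouring high-weight samples."""
    return rng.choices(range(len(weights)), weights=weights, k=batch_size)


if __name__ == "__main__":
    # Toy teacher logits for three unlabeled images (made-up numbers).
    pool_teacher_logits = [
        [8.0, 0.0, 0.0],   # teacher is confident: treated as related
        [0.1, 0.0, -0.1],  # teacher is unsure: treated as unrelated
        [0.0, 5.0, 0.0],
    ]
    weights = relatedness_weights(pool_teacher_logits)
    batch = sample_batch(weights, batch_size=4, rng=random.Random(0))
    print("sampling weights:", weights)
    print("sampled indices:", batch)
    # An untrained student with flat logits incurs a positive loss.
    print("loss:", distillation_loss(pool_teacher_logits[0], [0.0, 0.0, 0.0]))
```

In a real pipeline the logits would come from forward passes of the teacher and student networks, and the loss would be averaged over each sampled batch before a gradient step on the student.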
