CV, LG, ML · Mar 21, 2017

Knowledge distillation using unlabeled mismatched images

arXiv:1703.07131v1 · 17 citations
Originality: Synthesis-oriented
AI Analysis

This paper addresses data scarcity and data mismatch in knowledge distillation for image classification, but it is incremental, as it builds on existing KD methods.

The paper tackles knowledge distillation for image classification by using unlabeled, mismatched images as the stimulus. It shows that stimulus complexity is key to good performance, with examples on MNIST and CIFAR teachers.

Current approaches to Knowledge Distillation (KD) either use the training data directly or sample from the training data distribution. In this paper, we demonstrate the effectiveness of 'mismatched' unlabeled stimulus for performing KD on image classification networks. For illustration, we consider scenarios where there is a complete absence of training data, or where mismatched stimulus has to be used to augment a small amount of training data. We demonstrate that stimulus complexity is a key factor in achieving good distillation performance. Our examples include the use of various datasets for stimulating MNIST and CIFAR teachers.
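To make the idea concrete, here is a minimal sketch of distillation driven only by unlabeled stimulus. It assumes the common temperature-softened soft-target loss (KL divergence between teacher and student outputs), which the abstract does not confirm as the paper's exact objective. The names `softmax`, `distillation_loss`, `stimulus`, `W_teacher`, and `W_student`, the temperature of 4.0, and the random linear "networks" are all hypothetical stand-ins chosen for illustration.

```python
import numpy as np

def softmax(logits, temperature=1.0):
    # Temperature-softened softmax, stabilized by subtracting the row max.
    z = logits / temperature
    z = z - z.max(axis=1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)

def distillation_loss(student_logits, teacher_logits, temperature=4.0):
    # Mean KL(teacher || student) over the batch. No labels are needed:
    # the teacher's soft outputs on the stimulus are the only supervision.
    eps = 1e-12
    log_p_teacher = np.log(softmax(teacher_logits, temperature) + eps)
    log_p_student = np.log(softmax(student_logits, temperature) + eps)
    p_teacher = np.exp(log_p_teacher)
    kl = (p_teacher * (log_p_teacher - log_p_student)).sum(axis=1)
    return float(kl.mean())

rng = np.random.default_rng(0)

# Assumption: random vectors stand in for unlabeled, mismatched images
# (for example, images from a dataset the teacher was never trained on).
stimulus = rng.normal(size=(8, 784))

# Assumption: linear maps stand in for the teacher and student networks.
W_teacher = rng.normal(scale=0.1, size=(784, 10))
W_student = rng.normal(scale=0.1, size=(784, 10))

teacher_logits = stimulus @ W_teacher
student_logits = stimulus @ W_student

loss_mismatch = distillation_loss(student_logits, teacher_logits)
loss_self = distillation_loss(teacher_logits, teacher_logits)
print(loss_mismatch, loss_self)
```

In a real setup the student would be trained by gradient descent on this loss over batches of stimulus images. The loss is zero when the student reproduces the teacher's outputs exactly and positive otherwise.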

Foundations

The foundational work for this paper's niche, ranked by how specifically the neighbourhood builds on it — not by global fame.
