AI · CV · Sep 20, 2024

Simple Unsupervised Knowledge Distillation With Space Similarity

arXiv:2409.13939v1 · 6 citations · h-index: 7
Originality: Incremental advance
AI Analysis

This paper addresses the challenge of training smaller networks without labels in self-supervised learning. The contribution is incremental, as it builds on existing unsupervised knowledge distillation approaches.

The paper tackles the problem of extending self-supervised learning to smaller architectures by proposing an unsupervised knowledge distillation method that directly models the teacher's embedding manifold, achieving strong performance on various benchmarks.

According to recent studies, self-supervised learning (SSL) does not readily extend to smaller architectures. One direction to mitigate this shortcoming while simultaneously training a smaller network without labels is to adopt unsupervised knowledge distillation (UKD). Existing UKD approaches handcraft preservation-worthy inter/intra-sample relationships between the teacher and its student. However, this may overlook other key relationships present in the teacher's mapping. In this paper, instead of heuristically constructing preservation-worthy relationships between samples, we directly motivate the student to model the teacher's embedding manifold. If the mapped manifold is similar, all inter/intra-sample relationships are indirectly conserved. We first demonstrate that prior methods cannot preserve the teacher's latent manifold due to their sole reliance on $L_2$-normalised embedding features. Subsequently, we propose a simple objective to capture the information lost due to normalisation. Our proposed loss component, termed space similarity, motivates each dimension of a student's feature space to be similar to the corresponding dimension of its teacher. We perform extensive experiments demonstrating strong performance of our proposed approach on various benchmarks.
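The abstract does not give the exact form of the loss, so the following is only a minimal sketch under stated assumptions. It assumes that both terms are cosine similarities: a per-sample term over rows of the embedding matrix (equivalent to comparing $L_2$-normalised features) and a per-dimension "space similarity" term over columns of the same matrix. The function names and the `weight` parameter are hypothetical and are not taken from the paper. The toy data illustrates the abstract's argument: a student that rescales each sample's embedding by a different positive factor matches the teacher perfectly after per-sample normalisation, yet its per-dimension similarity to the teacher drops.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def feature_similarity(student, teacher):
    """Mean cosine similarity over matching rows (one row per sample)."""
    return sum(cosine(s, t) for s, t in zip(student, teacher)) / len(student)


def space_similarity(student, teacher):
    """Mean cosine similarity over matching columns (one column per dimension).

    Assumption: "similar" is measured with cosine similarity across the batch.
    """
    s_cols = list(zip(*student))
    t_cols = list(zip(*teacher))
    return sum(cosine(s, t) for s, t in zip(s_cols, t_cols)) / len(s_cols)


def distillation_loss(student, teacher, weight=1.0):
    """Hypothetical combined objective: both terms reach 0 at a perfect match."""
    feature_term = 1.0 - feature_similarity(student, teacher)
    space_term = 1.0 - space_similarity(student, teacher)
    return feature_term + weight * space_term


# Toy batch: 3 samples, 2 embedding dimensions.
teacher = [[1.0, 2.0], [3.0, 1.0], [2.0, 2.0]]

# A student that rescales every sample by a different positive factor.
scales = [1.0, 0.2, 5.0]
student = [[c * x for x in row] for c, row in zip(scales, teacher)]

print("feature similarity:", feature_similarity(student, teacher))
print("space similarity:  ", space_similarity(student, teacher))
print("combined loss:     ", distillation_loss(student, teacher))
```

In this sketch the per-sample term cannot distinguish the rescaled student from the teacher, while the per-dimension term can, which is the kind of normalisation-induced information loss the abstract refers to.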

