CV | Nov 19, 2024

KDC-MAE: Knowledge Distilled Contrastive Mask Auto-Encoder

arXiv:2411.12270v1 | h-index: 2 | WACV
Originality: Synthesis-oriented
AI Analysis

This work addresses the challenge of coordinating different SSL objectives for researchers in self-supervised learning, though it appears incremental as it combines existing methods.

The paper integrates three major self-supervised learning frameworks (contrastive learning, self-distillation, and masked data modeling) into a joint architecture called KDC-MAE, and reports improved learning performance across multiple modalities and tasks.

In this work, we extend this line of thought and show a way forward for the self-supervised learning (SSL) paradigm by combining contrastive learning, self-distillation (knowledge distillation), and masked data modelling, the three major SSL frameworks, to learn a joint and coordinated representation. The proposed technique learns through the collaborative power of these different SSL objectives. To learn them jointly, we propose a new SSL architecture, KDC-MAE, a complementary masking strategy to learn the modular correspondence, and a weighted way to combine the objectives in a coordinated manner. Experimental results indicate that the contrastive masking correspondence, together with the KD learning objective, leads to better learning for multiple modalities over multiple tasks.
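
As a rough illustration of the two ideas named in the abstract, the sketch below shows one possible reading of "complementary masking" (two masks over the same patches where each patch is hidden in exactly one of them) and a plain weighted sum of three loss terms. This is not the authors' implementation: the function names, the 0.5 mask ratio, the loss values, and the weights are all hypothetical placeholders chosen for the example.

```python
import random


def complementary_masks(num_patches, mask_ratio=0.5, seed=0):
    """Return two boolean masks (True = masked) that are exact complements.

    Assumption: "complementary" is read here as "every patch is masked in
    exactly one of the two views"; the paper may define it differently.
    """
    rng = random.Random(seed)
    num_masked = int(num_patches * mask_ratio)
    masked_idx = set(rng.sample(range(num_patches), num_masked))
    mask_a = [i in masked_idx for i in range(num_patches)]
    mask_b = [not m for m in mask_a]  # the complement of mask_a
    return mask_a, mask_b


def combine_losses(losses, weights):
    """Weighted sum of named loss terms (weights are hypothetical)."""
    return sum(weights[name] * value for name, value in losses.items())


mask_a, mask_b = complementary_masks(num_patches=16, mask_ratio=0.5)

# Placeholder scalar losses standing in for the three SSL objectives.
losses = {"reconstruction": 0.8, "contrastive": 1.2, "distillation": 0.5}
weights = {"reconstruction": 1.0, "contrastive": 0.1, "distillation": 0.5}
total = combine_losses(losses, weights)
```

In a real training loop the three loss values would come from the model's forward pass on the two masked views, and the weights would be tuned as hyperparameters.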
