CL | Oct 20, 2023

DistillCSE: Distilled Contrastive Learning for Sentence Embeddings

arXiv:2310.13499v2 | 132 citations | h-index: 6
Originality: Incremental advance
AI Analysis

This work aims to improve sentence embeddings for natural language processing tasks. The advance is incremental: it builds on existing contrastive learning and knowledge distillation methods.

The paper proposes DistillCSE, a framework that combines contrastive learning with knowledge distillation to improve sentence embeddings. In its standard form the framework overfits, because the teacher model's logits have high variance. The paper addresses this with two remedies, Group-P shuffling and averaging logits from multiple teacher components, and reports state-of-the-art performance on standard benchmarks.

This paper proposes the DistillCSE framework, which performs contrastive learning under the self-training paradigm with knowledge distillation. The potential advantage of DistillCSE is its self-enhancing feature: by using a base model to provide additional supervision signals, a stronger model may be learned through knowledge distillation. However, vanilla DistillCSE, which uses the standard implementation of knowledge distillation, achieves only marginal improvements because of severe overfitting. Further quantitative analysis shows why: under standard knowledge distillation, the teacher model's logits exhibit a relatively large variance, which stems from the nature of contrastive learning. To mitigate the problems caused by this high variance, the paper proposes two simple yet effective solutions for knowledge distillation: a Group-P shuffling strategy that acts as an implicit regularizer, and averaging the logits from multiple teacher components. Experiments on standard benchmarks demonstrate that the proposed DistillCSE outperforms many strong baseline methods and yields new state-of-the-art performance.
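The second remedy in the abstract, averaging logits from multiple teachers before distilling, can be illustrated with a minimal sketch. This is not the authors' implementation. The function names, the toy similarity scores, the temperature value, and the choice of KL divergence as the distillation loss are all assumptions made for illustration. Group-P shuffling is left out because the abstract does not describe its mechanics.

```python
import math


def softmax(logits, temperature=1.0):
    """Numerically stable softmax over a list of logits."""
    scaled = [x / temperature for x in logits]
    peak = max(scaled)
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def average_teacher_logits(teacher_logits):
    """Element-wise mean of the logits produced by several teachers.

    teacher_logits: one list of logits per teacher, all the same length.
    Averaging is meant to reduce the variance of the supervision signal.
    """
    count = len(teacher_logits)
    return [sum(column) / count for column in zip(*teacher_logits)]


def distillation_loss(student_logits, teacher_logits, temperature=1.0):
    """KL(teacher || student), with the teacher side averaged first.

    KL divergence is an assumed choice of distillation objective here.
    """
    p = softmax(average_teacher_logits(teacher_logits), temperature)
    q = softmax(student_logits, temperature)
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)


# Hypothetical similarity scores of one anchor sentence against three
# in-batch candidates, as scored by two teachers and one student.
teachers = [[0.9, 0.2, 0.1], [0.7, 0.4, 0.1]]
student_close = [0.8, 0.3, 0.1]   # matches the averaged teachers
student_far = [0.1, 0.3, 0.8]     # ranks the candidates in reverse

averaged = average_teacher_logits(teachers)
loss_close = distillation_loss(student_close, teachers, temperature=0.05)
loss_far = distillation_loss(student_far, teachers, temperature=0.05)
```

In this toy setup the two teachers disagree on the first two candidates, and their mean gives a smoother target than either teacher alone. A student that matches the averaged scores incurs a loss near zero, while one that reverses the ranking is penalized heavily.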

Code Implementations: 1 repo
