CL | LG | Jul 14, 2023

Composition-contrastive Learning for Sentence Embeddings

arXiv:2307.07380v1230 citations | h-index: 29 | Has Code
Originality: Incremental advance
AI Analysis

This work addresses efficient sentence representation learning for search applications. Its approach is an incremental advance on existing contrastive learning methods.

The paper learns sentence embeddings by maximizing alignment between texts and compositions of their phrasal constituents. On semantic textual similarity tasks, it improves over baselines and reaches results comparable to state-of-the-art methods, without auxiliary training objectives or additional network parameters.

Vector representations of natural language are ubiquitous in search applications. Recently, various methods based on contrastive learning have been proposed to learn textual representations from unlabelled data by maximizing alignment between minimally-perturbed embeddings of the same text and encouraging a uniform distribution of embeddings across a broader corpus. In contrast, we propose maximizing alignment between texts and a composition of their phrasal constituents. We consider several realizations of this objective and elaborate on the impact on representations in each case. Experimental results on semantic textual similarity tasks show improvements over baselines that are comparable with state-of-the-art approaches. Moreover, this work is the first to do so without incurring costs in auxiliary training objectives or additional network parameters.
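To make the objective described above more concrete, here is a minimal sketch in Python (NumPy) of a contrastive loss whose positive for each sentence is a composition of its phrase embeddings. It is an illustration under stated assumptions, not the paper's implementation. The abstract says several realizations are considered but does not specify them, so averaging (`compose_mean`) is a hypothetical choice. The InfoNCE-style loss with in-batch negatives, the temperature of 0.05, and the random vectors standing in for encoder outputs are also assumptions.

```python
import numpy as np

def l2_normalize(x, eps=1e-12):
    """Scale each row to unit length so dot products become cosine similarities."""
    return x / np.maximum(np.linalg.norm(x, axis=-1, keepdims=True), eps)

def compose_mean(phrase_embs):
    """Hypothetical composition: average the phrase embeddings of one sentence."""
    return np.mean(phrase_embs, axis=0)

def info_nce(anchors, positives, temperature=0.05):
    """InfoNCE-style loss (assumed form): row i of `positives` is the positive
    for row i of `anchors`; all other rows act as in-batch negatives."""
    a = l2_normalize(anchors)
    p = l2_normalize(positives)
    logits = a @ p.T / temperature                       # (batch, batch) similarities
    logits = logits - logits.max(axis=1, keepdims=True)  # numerical stability
    log_probs = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    return float(-np.mean(np.diag(log_probs)))

# Toy data: random vectors stand in for encoder outputs (assumption).
rng = np.random.default_rng(0)
sentences = rng.normal(size=(4, 8))

# Each sentence gets two "phrase" embeddings: noisy copies of the sentence vector.
composed = np.stack([
    compose_mean(s + 0.1 * rng.normal(size=(2, 8))) for s in sentences
])

loss_aligned = info_nce(sentences, composed)         # matching pairs on the diagonal
loss_shuffled = info_nce(sentences, composed[::-1])  # deliberately mismatched pairs
print(loss_aligned, loss_shuffled)
```

In this toy setup the loss is lower when each sentence is paired with the composition of its own phrases than when the pairs are mismatched, which is the alignment behaviour the objective rewards. The sketch adds no auxiliary objective and no extra parameters, in keeping with the abstract's claim, though a real system would obtain the embeddings from a trained encoder.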

Code Implementations: 1 repo
Foundations

The foundational work for this paper's niche, ranked by how specifically the neighbourhood builds on it — not by global fame.
