CL, Oct 8, 2022

InfoCSE: Information-aggregated Contrastive Learning of Sentence Embeddings

Xing Wu, Chaochen Gao, Zijia Lin, Jizhong Han, Zhongyuan Wang, Songlin Hu

arXiv:2210.06432v324.7296 citations, h-index: 32, Has Code

Originality: Incremental advance

AI Analysis

This work targets unsupervised sentence representation learning in NLP and offers an incremental improvement over existing methods such as SimCSE.

The paper argues that the constraint used in contrastive learning of sentence embeddings is weak. It proposes InfoCSE, which adds a masked language model task so that the sentence representation aggregates denser sentence information. On semantic text similarity benchmarks, InfoCSE improves the average Spearman correlation over SimCSE by 2.60% on BERT-base and 1.77% on BERT-large, which the authors report as state of the art among unsupervised methods.
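The Spearman correlation behind these numbers is a rank correlation between the similarity scores a model predicts for sentence pairs and the human-annotated gold scores, usually reported multiplied by 100. The block below is a minimal, dependency-free sketch of that metric; the function names `average_ranks`, `pearson`, and `spearman` are illustrative and do not come from the InfoCSE code base, and the sketch assumes neither score list is constant.

```python
import math


def average_ranks(values):
    """Return 1-based ranks, giving tied values the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        # Extend the run while the next sorted value ties with the current one.
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        shared_rank = (start + end) / 2 + 1
        for k in range(start, end + 1):
            ranks[order[k]] = shared_rank
        start = end + 1
    return ranks


def pearson(x, y):
    """Pearson correlation of two equal-length, non-constant sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    std_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    std_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (std_x * std_y)


def spearman(predicted, gold):
    """Spearman correlation: Pearson correlation computed on the ranks."""
    return pearson(average_ranks(predicted), average_ranks(gold))
```

Because only ranks matter, any monotonic rescaling of the predicted similarities leaves the score unchanged, which is why cosine similarities can be compared directly with gold ratings on a different scale.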

Contrastive learning has been extensively studied for sentence embedding learning. It assumes that the embeddings of different views of the same sentence lie closer together than those of different sentences. The constraint brought by this assumption is weak, and a good sentence representation should also be able to reconstruct the original sentence fragments. Therefore, this paper proposes an information-aggregated contrastive learning framework for learning unsupervised sentence embeddings, termed InfoCSE. InfoCSE forces the representation at the [CLS] position to aggregate denser sentence information by introducing an additional masked language model (MLM) task and a well-designed network. We evaluate the proposed InfoCSE on several benchmark datasets for the semantic text similarity (STS) task. Experimental results show that InfoCSE outperforms SimCSE by an average Spearman correlation of 2.60% on BERT-base and 1.77% on BERT-large, achieving state-of-the-art results among unsupervised sentence representation learning methods. Our code is available at https://github.com/caskcsg/sentemb/tree/main/InfoCSE.
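To make the contrastive assumption concrete, here is a minimal sketch of an in-batch contrastive (InfoNCE-style) loss of the kind SimCSE uses, written in plain Python on toy vectors. It is not the authors' implementation: the names `info_nce_loss` and `combined_loss`, the temperature default, and in particular the weighted sum with `mlm_weight` are illustrative assumptions, since the abstract does not say how the MLM term is combined with the contrastive term. Embeddings are assumed to be non-zero.

```python
import math


def cosine(u, v):
    """Cosine similarity of two non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def info_nce_loss(view_a, view_b, temperature=0.05):
    """Mean in-batch contrastive loss.

    view_a[i] and view_b[i] embed two views of sentence i (the positive pair);
    every other row of view_b serves as a negative for view_a[i].
    The temperature default is an assumed value for illustration.
    """
    losses = []
    for i, anchor in enumerate(view_a):
        logits = [cosine(anchor, other) / temperature for other in view_b]
        # Numerically stable log-sum-exp over the whole batch.
        top = max(logits)
        log_denominator = top + math.log(sum(math.exp(l - top) for l in logits))
        # Cross-entropy with the matching index i as the target class.
        losses.append(log_denominator - logits[i])
    return sum(losses) / len(losses)


def combined_loss(contrastive, mlm, mlm_weight=1.0):
    """Hypothetical joint objective: contrastive term plus a weighted MLM term."""
    return contrastive + mlm_weight * mlm
```

The loss is small when each sentence's two views are more similar to each other than to the other sentences in the batch, and grows when positives and negatives are confused. That is the only constraint the contrastive term imposes, which is the weakness the auxiliary MLM task is meant to address.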
