CL · May 20, 2020

Pretraining with Contrastive Sentence Objectives Improves Discourse Performance of Language Models

arXiv:2005.10389v11035 citations
Originality: Incremental advance
AI Analysis

This work addresses the limited attention paid to discourse-level representations in unsupervised representation learning of text. Its method improves performance across several NLP tasks, though it is an incremental advance that builds on existing pretraining techniques.

The authors tackled the problem of improving discourse-level representations in language models by proposing CONPONO, an inter-sentence pretraining objective that models discourse coherence and sentence distance, resulting in up to 13% absolute improvement on the DiscoEval benchmark and 2%-6% gains on non-discourse tasks.

Recent models for unsupervised representation learning of text have employed a number of techniques to improve contextual word representations but have put little focus on discourse-level representations. We propose CONPONO, an inter-sentence objective for pretraining language models that models discourse coherence and the distance between sentences. Given an anchor sentence, our model is trained to predict the text k sentences away using a sampled-softmax objective where the candidates consist of neighboring sentences and sentences randomly sampled from the corpus. On the discourse representation benchmark DiscoEval, our model improves over the previous state-of-the-art by up to 13% and on average 4% absolute across 7 tasks. Our model is the same size as BERT-Base, but outperforms the much larger BERT-Large model and other more recent approaches that incorporate discourse. We also show that CONPONO yields gains of 2%-6% absolute even for tasks that do not explicitly evaluate discourse: textual entailment (RTE), common sense reasoning (COPA) and reading comprehension (ReCoRD).
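
The sampled-softmax objective described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function names (`build_candidates`, `sampled_softmax_loss`), the `window` and `num_random` parameters, and the use of a plain dot product as the scoring function are all assumptions made for illustration, and sentence encodings are represented as plain lists of floats rather than outputs of a real encoder.

```python
import math
import random


def build_candidates(doc, anchor_idx, k, corpus, window=2, num_random=3, rng=None):
    """Assemble the candidate set for one anchor sentence (hypothetical helper).

    The true target is the sentence k positions away from the anchor.
    Negatives are the other sentences within `window` of the anchor plus
    `num_random` sentences drawn at random from `corpus`.
    Returns (candidates, target_index); the target is always placed first.
    """
    rng = rng or random.Random(0)
    target_pos = anchor_idx + k
    if k == 0 or not 0 <= target_pos < len(doc):
        raise ValueError("target must be a different sentence inside the document")
    lo = max(0, anchor_idx - window)
    hi = min(len(doc), anchor_idx + window + 1)
    neighbours = [doc[i] for i in range(lo, hi) if i not in (anchor_idx, target_pos)]
    randoms = [rng.choice(corpus) for _ in range(num_random)]
    return [doc[target_pos]] + neighbours + randoms, 0


def sampled_softmax_loss(anchor_vec, candidate_vecs, target_index):
    """Negative log-probability of the target under a softmax over the candidates.

    Scores are dot products between the anchor encoding and each candidate
    encoding (an assumed scoring function for this sketch).
    """
    scores = [sum(a * c for a, c in zip(anchor_vec, cand)) for cand in candidate_vecs]
    top = max(scores)  # subtract the max for numerical stability
    log_z = top + math.log(sum(math.exp(s - top) for s in scores))
    return log_z - scores[target_index]
```

In this sketch, training would minimise `sampled_softmax_loss` over many anchors and several values of k, so that the encoder learns to rank the true sentence at distance k above both nearby and random sentences. The loss shrinks as the target's score rises relative to the other candidates.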

Code Implementations: 1 repo