CL · Jun 15, 2021

Question Answering Infused Pre-training of General-Purpose Contextualized Representations

arXiv:2106.08190v2651 citations
Originality: Highly original
AI Analysis

This work addresses the need for pre-trained models whose representations transfer to non-QA tasks with little or no fine-tuning.

The paper tackles the problem of learning general-purpose contextual representations by proposing a QA-infused pre-training objective. It reports large improvements over RoBERTa-large and previous state-of-the-art results on zero-shot and few-shot paraphrase detection, few-shot named entity recognition, and zero-shot sentiment analysis.

We propose a pre-training objective based on question answering (QA) for learning general-purpose contextual representations, motivated by the intuition that the representation of a phrase in a passage should encode all questions that the phrase can answer in context. To this end, we train a bi-encoder QA model, which independently encodes passages and questions, to match the predictions of a more accurate cross-encoder model on 80 million synthesized QA pairs. By encoding QA-relevant information, the bi-encoder's token-level representations are useful for non-QA downstream tasks without extensive (or in some cases, any) fine-tuning. We show large improvements over both RoBERTa-large and previous state-of-the-art results on zero-shot and few-shot paraphrase detection on four datasets, few-shot named entity recognition on two datasets, and zero-shot sentiment analysis on three datasets.
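The distillation idea in the abstract can be illustrated with a minimal sketch. It assumes that the bi-encoder scores each passage token as an answer start by a dot product with the question vector, and that "matching the predictions" of the cross-encoder means minimizing a KL divergence between the two answer-start distributions. Neither detail is stated in the abstract, and all function names and toy numbers below are hypothetical.

```python
import math

def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def bi_encoder_start_logits(token_vecs, question_vec):
    """Score every passage token as a candidate answer start.

    Assumption: passage tokens and the question are encoded independently,
    and the score is a plain dot product between the two encodings.
    """
    return [sum(t * q for t, q in zip(tok, question_vec)) for tok in token_vecs]

def distillation_loss(student_logits, teacher_logits):
    """KL(teacher || student) over answer-start positions.

    Assumption: the bi-encoder (student) is trained to match the
    cross-encoder (teacher) distribution; the exact loss is not given
    in the abstract.
    """
    p = softmax(teacher_logits)
    q = softmax(student_logits)
    return sum(pi * (math.log(pi) - math.log(qi)) for pi, qi in zip(p, q) if pi > 0)

# Toy inputs: three passage tokens with 2-dimensional encodings (made up).
token_vecs = [[0.1, 0.3], [0.9, -0.2], [0.4, 0.8]]
question_vec = [1.0, 0.5]
teacher_logits = [0.2, 2.0, 0.5]  # hypothetical cross-encoder scores

student_logits = bi_encoder_start_logits(token_vecs, question_vec)
loss = distillation_loss(student_logits, teacher_logits)
print(student_logits, loss)
```

Because the passage tokens are encoded without seeing the question, the token vectors can be computed once and reused for other tasks, which is the property the abstract relies on for zero-shot and few-shot transfer.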

Code Implementations: 1 repo