QuASE: Question-Answer Driven Sentence Encoding
This work explores an alternative source of supervision for NLP, which could benefit a range of downstream tasks, although the contribution appears incremental because it builds on existing contextual language models such as BERT.
The paper asks whether question-answering (QA) data can improve non-QA NLP tasks such as named entity recognition. It proposes the QuASE framework, which learns sentence encodings from QA data; the resulting encodings serve as an easy-to-use plugin for downstream tasks.
Question-answering (QA) data often encodes essential information in many facets. This paper studies a natural question: Can we get supervision from QA data for other tasks (typically, non-QA ones)? For example, {\em can we use QAMR (Michael et al., 2017) to improve named entity recognition?} We suggest that simply further pre-training BERT is often not the best option, and propose the {\em question-answer driven sentence encoding (QuASE)} framework. QuASE learns representations from QA data, using BERT or other state-of-the-art contextual language models. In particular, we observe the need to distinguish between two types of sentence encodings, depending on whether the target task takes single- or multi-sentence input; in both cases, the resulting encoding is shown to be an easy-to-use plugin for many downstream tasks. This work may point to an alternative way to supervise NLP tasks.