CL, LG | Dec 8, 2020

Distilling Knowledge from Reader to Retriever for Question Answering

arXiv:2012.04584v2 | 318 citations
AI Analysis

This work addresses the challenge of obtaining supervised data for training retriever models, which is a significant bottleneck for researchers and developers working on open-domain question answering systems.

This paper proposes a knowledge distillation technique for training retriever models for question answering without annotated query-document pairs. It uses the attention scores of a reader model, which answers the question from the retrieved documents, as synthetic labels for the retriever, and reports state-of-the-art results on question answering.

The task of information retrieval is an important component of many natural language processing systems, such as open domain question answering. While traditional methods were based on hand-crafted features, continuous representations based on neural networks recently obtained competitive results. A challenge of using such methods is to obtain supervised data to train the retriever model, corresponding to pairs of query and support documents. In this paper, we propose a technique to learn retriever models for downstream tasks, inspired by knowledge distillation, and which does not require annotated pairs of query and documents. Our approach leverages attention scores of a reader model, used to solve the task based on retrieved documents, to obtain synthetic labels for the retriever. We evaluate our method on question answering, obtaining state-of-the-art results.
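
The idea in the abstract, using reader attention as a training signal for the retriever, can be illustrated with a small sketch. This is not the authors' implementation: it assumes the retriever scores a passage by the dot product of query and passage vectors, that the reader's attention has already been aggregated to one score per passage, and that the two are compared with a KL divergence. The function names (`softmax`, `dot`, `distillation_loss`) and the toy vectors are hypothetical.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def dot(u, v):
    """Dot product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def distillation_loss(query_vec, passage_vecs, reader_attention):
    """KL(target || retriever) over the retrieved passages.

    Assumption: reader_attention holds one aggregated attention score
    per passage; its softmax serves as the synthetic label distribution.
    """
    retriever_scores = [dot(query_vec, p) for p in passage_vecs]
    retriever_probs = softmax(retriever_scores)
    target_probs = softmax(reader_attention)
    return sum(
        t * math.log(t / r)
        for t, r in zip(target_probs, retriever_probs)
        if t > 0
    )


# Toy example: two passages, two-dimensional embeddings.
query = [1.0, 0.0]
passages = [[2.0, 0.0], [0.0, 1.0]]

# Reader attends mostly to the passage the retriever already prefers.
loss_agree = distillation_loss(query, passages, [2.0, 0.0])
# Reader attends mostly to the passage the retriever ranks lower.
loss_disagree = distillation_loss(query, passages, [0.0, 2.0])
print(loss_agree, loss_disagree)
```

In this sketch the loss is near zero when the retriever's ranking matches the reader's attention and grows when they disagree, so minimizing it would pull the retriever toward the passages the reader found useful, with no annotated query-document pairs involved.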

Code Implementations: 4 repos