Will this Question be Answered? Question Filtering via Answer Model Distillation for Efficient Question Answering
The approach improves efficiency for QA system users by enabling preemptive filtering of questions, though the contribution is incremental in that it builds on existing Transformer models.
The paper tackles the problem of improving the efficiency of Question Answering systems by filtering out questions that are unlikely to be answered. It does so by approximating answer confidence scores from the question text alone, which lets the filter trade computation for Recall, for example reducing computation by ~60% with only a ~3-4% loss in Recall.
In this paper we propose a novel approach towards improving the efficiency of Question Answering (QA) systems by filtering out questions that will not be answered by them. This is based on an interesting new finding: the answer confidence scores of state-of-the-art QA systems can be approximated well by models solely using the input question text. This enables preemptive filtering of questions that are not answered by the system due to their answer confidence scores being lower than the system threshold. Specifically, we learn Transformer-based question models by distilling Transformer-based answering models. Our experiments on three popular QA datasets and one industrial QA benchmark demonstrate the ability of our question models to approximate the Precision/Recall curves of the target QA system well. These question models, when used as filters, can effectively trade off lower computation cost of QA systems for lower Recall, e.g., reducing computation by ~60%, while only losing ~3-4% of Recall.
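The preemptive filtering described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the names `answer_with_filter`, `question_model`, `qa_system`, `filter_threshold`, and `system_threshold` are hypothetical, and the use of a separate filter threshold on the question model's score is an assumption made for illustration. The idea shown is only that a cheap question-only score decides whether the expensive QA system is invoked at all.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional, Tuple


@dataclass
class FilterStats:
    total: int = 0
    skipped: int = 0  # questions rejected before the QA system was called

    @property
    def compute_saved(self) -> float:
        # Fraction of questions for which the QA system was never run.
        return self.skipped / self.total if self.total else 0.0


def answer_with_filter(
    questions: List[str],
    question_model: Callable[[str], float],            # hypothetical: question text -> predicted confidence
    qa_system: Callable[[str], Tuple[str, float]],     # hypothetical: question -> (answer, confidence)
    filter_threshold: float,                           # assumption: cutoff on the predicted confidence
    system_threshold: float,                           # the QA system's own answer threshold
) -> Tuple[List[Optional[str]], FilterStats]:
    stats = FilterStats()
    answers: List[Optional[str]] = []
    for question in questions:
        stats.total += 1
        # Cheap step: estimate the answer confidence from the question alone.
        if question_model(question) < filter_threshold:
            stats.skipped += 1
            answers.append(None)  # filtered out, QA system not invoked
            continue
        # Expensive step: run the full QA system only on surviving questions.
        answer, confidence = qa_system(question)
        answers.append(answer if confidence >= system_threshold else None)
    return answers, stats
```

Raising `filter_threshold` skips more questions and so saves more computation, at the cost of occasionally dropping a question the QA system would have answered; this is the computation/Recall trade-off the abstract reports.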