On Monotonic Aggregation for Open-domain QA
This work addresses a critical issue for speech-based retrieval systems by improving reliability in multi-source QA, though the contribution is incremental because it builds on existing methods.
The paper tackles the problem of monotonicity in open-domain question answering, the property that adding sources should not decrease accuracy. It proposes the Judge-Specialist framework, which outperforms state-of-the-art multi-source QA methods on Natural Questions while ensuring monotonicity.
Question answering (QA) is a critical task for speech-based retrieval from knowledge sources, because it returns only the answers without requiring users to read supporting documents. Specifically, open-domain QA aims to answer user questions over unrestricted knowledge sources. Ideally, adding a source should not decrease accuracy, but we find that this property (denoted as "monotonicity") does not hold for current state-of-the-art methods. We identify the cause and, based on it, propose the Judge-Specialist framework. Our framework consists of (1) specialist retrievers/readers that cover individual sources, and (2) a judge, a dedicated language model that selects the final answer. Our experiments show that our framework not only ensures monotonicity, but also outperforms state-of-the-art multi-source QA methods on Natural Questions. Additionally, we show that our models robustly preserve monotonicity against noise from speech recognition. We publicly release our code and settings.
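The monotonicity property and the two-stage structure described above can be illustrated with a minimal sketch. This is not the released code: every name below (`answer`, `accuracy`, `monotonicity_violations`, the toy specialists and judges) is hypothetical. The two toy judges are simple rules standing in for the dedicated language model used in the paper, and exact-match accuracy on a two-question toy dataset is an assumption made only for illustration.

```python
from itertools import combinations
from typing import Callable, Dict, Iterable, List, Tuple

# Hypothetical interfaces, assumed for illustration only.
Specialist = Callable[[str], str]            # question -> candidate answer from one source
Judge = Callable[[str, Dict[str, str]], str]  # question, {source: candidate} -> final answer
Dataset = List[Tuple[str, str]]               # (question, gold answer) pairs


def answer(question: str, sources: Iterable[str],
           specialists: Dict[str, Specialist], judge: Judge) -> str:
    """Run one specialist per source, then let the judge pick the final answer."""
    candidates = {name: specialists[name](question) for name in sorted(sources)}
    return judge(question, candidates)


def accuracy(dataset: Dataset, sources: Iterable[str],
             specialists: Dict[str, Specialist], judge: Judge) -> float:
    """Exact-match accuracy when only `sources` are available."""
    sources = list(sources)
    correct = sum(answer(q, sources, specialists, judge) == gold for q, gold in dataset)
    return correct / len(dataset)


def monotonicity_violations(dataset: Dataset, all_sources: Iterable[str],
                            specialists: Dict[str, Specialist], judge: Judge):
    """List every (subset, added_source, acc_before, acc_after) where adding a source hurt."""
    names = sorted(all_sources)
    violations = []
    for size in range(1, len(names)):
        for subset in combinations(names, size):
            before = accuracy(dataset, subset, specialists, judge)
            for extra in names:
                if extra in subset:
                    continue
                after = accuracy(dataset, subset + (extra,), specialists, judge)
                if after < before:
                    violations.append((subset, extra, before, after))
    return violations


# Toy data: the "text" specialist is right on both questions, "table" on one.
DATASET: Dataset = [("capital of France?", "paris"), ("year of first Moon landing?", "1969")]
TEXT_ANSWERS = {"capital of France?": "paris", "year of first Moon landing?": "1969"}
TABLE_ANSWERS = {"capital of France?": "lyon", "year of first Moon landing?": "1969"}
SPECIALISTS: Dict[str, Specialist] = {
    "text": lambda q: TEXT_ANSWERS.get(q, ""),
    "table": lambda q: TABLE_ANSWERS.get(q, ""),
}


def always_table_judge(question: str, candidates: Dict[str, str]) -> str:
    """Toy judge that blindly trusts the table specialist whenever it is present."""
    if "table" in candidates:
        return candidates["table"]
    return next(iter(candidates.values()))


def prefer_text_judge(question: str, candidates: Dict[str, str]) -> str:
    """Toy judge that prefers a non-empty text answer, else the first non-empty candidate."""
    if candidates.get("text"):
        return candidates["text"]
    return next((c for c in candidates.values() if c), "")
```

With these toy inputs, `monotonicity_violations(DATASET, ["text", "table"], SPECIALISTS, always_table_judge)` reports one violation (adding the table source to the text source lowers accuracy), while the same call with `prefer_text_judge` reports none. The check enumerates all source subsets, so it is only practical for a small number of sources.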