Uncertainty Quantification in Retrieval Augmented Question Answering
This work addresses the problem of unreliable evidence in retrieval-augmented QA systems. The contribution is incremental: it builds on existing methods to improve uncertainty quantification.
The paper assesses how useful retrieved passages are in retrieval-augmented question answering and proposes to quantify a QA model's uncertainty by predicting passage utility. The results show that a lightweight neural model efficiently approximates or outperforms more expensive sampling-based methods at predicting answer correctness.
Retrieval-augmented Question Answering (QA) helps QA models overcome knowledge gaps by incorporating retrieved evidence, typically a set of passages, alongside the question at test time. Previous studies show that this approach improves QA performance and reduces hallucinations, but they do not assess whether the retrieved passages are actually useful for answering correctly. In this work, we propose to quantify the uncertainty of a QA model by estimating the utility of the passages it is provided with. We train a lightweight neural model to predict passage utility for a target QA model and show that, while simple information-theoretic metrics can predict answer correctness to a certain extent, our approach efficiently approximates or outperforms more expensive sampling-based methods. Code and data are available at https://github.com/lauhaide/ragu.
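The abstract does not say which "simple information theoretic metrics" are used. As an illustration only, the sketch below computes three common candidates from a single generated answer: mean negative log-likelihood, perplexity, and the Shannon entropy of a next-token distribution. The function names and the choice of metrics are assumptions made for this example and are not taken from the paper or its repository.

```python
import math
from typing import Sequence


def mean_negative_log_likelihood(token_logprobs: Sequence[float]) -> float:
    """Average negative log-probability (in nats) of the generated answer tokens.

    Hypothetical helper: higher values mean the model was less confident.
    """
    if not token_logprobs:
        raise ValueError("token_logprobs must not be empty")
    return -sum(token_logprobs) / len(token_logprobs)


def perplexity(token_logprobs: Sequence[float]) -> float:
    """Exponentiated mean negative log-likelihood of the answer."""
    return math.exp(mean_negative_log_likelihood(token_logprobs))


def predictive_entropy(probs: Sequence[float]) -> float:
    """Shannon entropy (in nats) of one next-token probability distribution.

    Zero-probability entries contribute nothing and are skipped.
    """
    return -sum(p * math.log(p) for p in probs if p > 0.0)
```

Each score needs only one forward pass of the QA model, which is what makes such metrics cheap compared with sampling-based estimates that require generating several answers per question.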