Context Filtering with Reward Modeling in Question Answering
This work aims to improve the efficiency of QA models in low-resource settings by filtering non-essential content out of the retrieved context. The contribution appears incremental, since it builds on existing summarization and reward modeling techniques.
The paper addresses irrelevant information in retrieved contexts, which hinders question answering performance. It introduces a context filtering approach that uses reward modeling to summarize the crucial content, resulting in a 6.8-fold increase in EM Per Token (EPT), a metric of token efficiency.
Question Answering (QA) in NLP is the task of finding the answer to a query within a relevant context returned by a retrieval system. However, retrieved contexts mix relevant and irrelevant information, which can limit performance gains in QA tasks. To address this, we introduce a context filtering approach that removes non-essential details and summarizes the crucial content through Reward Modeling. The summarization model is trained to keep vital information and omit the extraneous. We offer a framework for developing efficient QA models that discerns useful information from dataset pairs, bypassing the need for costly human evaluation. Furthermore, we show that our approach significantly outperforms the baseline, with a 6.8-fold increase in EM Per Token (EPT), a metric we propose to measure token efficiency, indicating a notable efficiency gain for low-resource settings.
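The abstract names EM Per Token (EPT) but does not give its formula. As a minimal sketch, assume EPT is the mean Exact Match score divided by the mean number of context tokens given to the QA model. That definition, the whitespace tokenizer, the SQuAD-style answer normalization, and the function names below are all illustrative assumptions, not the authors' implementation.

```python
import re
import string


def normalize(text):
    """SQuAD-style normalization: lowercase, drop punctuation and articles."""
    text = text.lower()
    text = "".join(ch for ch in text if ch not in string.punctuation)
    text = re.sub(r"\b(a|an|the)\b", " ", text)
    return " ".join(text.split())


def exact_match(prediction, gold_answers):
    """Return 1.0 if the prediction matches any gold answer after normalization."""
    return float(any(normalize(prediction) == normalize(g) for g in gold_answers))


def em_per_token(predictions, gold_answers, contexts,
                 count_tokens=lambda s: len(s.split())):
    """Assumed EPT: mean EM divided by mean context length in tokens.

    `count_tokens` is a hypothetical hook; a real setup would likely use
    the QA model's own tokenizer instead of whitespace splitting.
    """
    n = len(predictions)
    if n == 0 or n != len(gold_answers) or n != len(contexts):
        raise ValueError("inputs must be non-empty and of equal length")
    total_tokens = sum(count_tokens(c) for c in contexts)
    if total_tokens == 0:
        raise ValueError("contexts contain no tokens")
    mean_em = sum(exact_match(p, g) for p, g in zip(predictions, gold_answers)) / n
    mean_tokens = total_tokens / n
    return mean_em / mean_tokens


# Toy comparison: same predictions, full versus filtered contexts.
preds = ["Paris", "1969"]
golds = [["Paris"], ["1968"]]
full_contexts = ["a b c d", "e f"]
filtered_contexts = ["a b", "e"]

ept_full = em_per_token(preds, golds, full_contexts)
ept_filtered = em_per_token(preds, golds, filtered_contexts)
improvement = ept_filtered / ept_full
```

Under this assumed definition, a filter that halves the context length without changing EM doubles EPT, so a fold increase in EPT can come from shorter contexts, higher accuracy, or both.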