Memory-Aware and Uncertainty-Guided Retrieval for Multi-Hop Question Answering
This work addresses retrieval inefficiencies in multi-hop QA, though its contribution is incremental because it builds on existing RAG methods.
The paper addresses two limitations of existing Retrieval-Augmented Generation methods for multi-hop question answering: fixed retrieval steps and ineffective reuse of previously retrieved knowledge. It proposes the MIND framework, which triggers retrieval dynamically and filters memory, and reports a 5.2% accuracy gain on the HotpotQA benchmark.
Multi-hop question answering (QA) requires models to retrieve and reason over multiple pieces of evidence. While Retrieval-Augmented Generation (RAG) has made progress in this area, existing methods often suffer from two key limitations: (1) fixed or overly frequent retrieval steps, and (2) ineffective use of previously retrieved knowledge. We propose MIND (Memory-Informed and INteractive Dynamic RAG), a framework that addresses these challenges through: (i) prompt-based entity extraction to identify reasoning-relevant elements, (ii) dynamic retrieval triggering based on token-level entropy and attention signals, and (iii) memory-aware filtering, which stores high-confidence facts across reasoning steps to enable consistent multi-hop generation.
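To make components (ii) and (iii) concrete, the following is a minimal sketch and not the authors' implementation. It assumes the model exposes a next-token probability distribution for each generated token and that each candidate fact carries a scalar confidence score. The function names, the entropy threshold of 1.0 nats, and the confidence cutoff of 0.8 are all hypothetical choices made for illustration. The attention signal mentioned in the abstract is omitted, so only the entropy half of the trigger is shown.

```python
import math
from typing import Dict, List, Sequence


def token_entropy(probs: Sequence[float]) -> float:
    """Shannon entropy (in nats) of one next-token distribution."""
    return -sum(p * math.log(p) for p in probs if p > 0.0)


def should_retrieve(step_probs: List[Sequence[float]], threshold: float = 1.0) -> bool:
    """Hypothetical trigger: retrieve if any token's entropy exceeds the threshold."""
    return any(token_entropy(p) > threshold for p in step_probs)


def update_memory(
    memory: Dict[str, float],
    facts: Dict[str, float],
    min_conf: float = 0.8,
) -> Dict[str, float]:
    """Hypothetical filter: keep facts with confidence >= min_conf.

    Returns a new dict; if a fact is already stored, the higher
    confidence is retained.
    """
    merged = dict(memory)
    for fact, conf in facts.items():
        if conf >= min_conf:
            merged[fact] = max(conf, merged.get(fact, 0.0))
    return merged


if __name__ == "__main__":
    confident = [[0.97, 0.01, 0.01, 0.01]]   # peaked distribution, low entropy
    uncertain = [[0.25, 0.25, 0.25, 0.25]]   # uniform distribution, high entropy
    print(should_retrieve(confident), should_retrieve(uncertain))

    memory = update_memory({}, {"fact A": 0.93, "fact B": 0.41})
    print(sorted(memory))
```

Under these assumptions, a peaked distribution does not trigger retrieval while a uniform one does, and only the high-confidence fact is carried into the next reasoning step. In practice, both thresholds would need tuning on held-out data.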