CL · AI · Oct 23, 2025

GlobalRAG: Enhancing Global Reasoning in Multi-hop Question Answering via Reinforcement Learning

arXiv:2510.20548v2 · 2 citations · h-index: 13
Originality: Incremental advance
AI Analysis

This work addresses the challenge of improving multi-hop QA for applications that require complex reasoning, though it is incremental in that it builds on existing reinforcement learning methods for RAG.

The paper tackles limited global reasoning in multi-hop question answering by proposing GlobalRAG, a reinforcement learning framework that decomposes questions into subgoals and coordinates retrieval with reasoning. It reports average improvements of 14.2% in both EM and F1 while using only 8k training data.
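For readers unfamiliar with the two metrics, the following is a minimal sketch of exact match (EM) and token-level F1 as they are commonly computed for extractive QA. It assumes SQuAD-style answer normalization (lowercasing, stripping punctuation and English articles); the summary above does not say which normalization the paper uses, and the function names are illustrative.

```python
import re
import string
from collections import Counter


def normalize_answer(text: str) -> str:
    """Lowercase, drop punctuation and English articles, collapse whitespace."""
    text = text.lower()
    text = "".join(ch for ch in text if ch not in string.punctuation)
    text = re.sub(r"\b(a|an|the)\b", " ", text)
    return " ".join(text.split())


def exact_match(prediction: str, gold: str) -> float:
    """Return 1.0 if the normalized strings are identical, else 0.0."""
    return float(normalize_answer(prediction) == normalize_answer(gold))


def token_f1(prediction: str, gold: str) -> float:
    """Harmonic mean of token precision and recall after normalization."""
    pred_tokens = normalize_answer(prediction).split()
    gold_tokens = normalize_answer(gold).split()
    if not pred_tokens or not gold_tokens:
        # Both empty counts as a match; exactly one empty counts as a miss.
        return float(pred_tokens == gold_tokens)
    # Multiset intersection counts each shared token at most as often
    # as it appears in both answers.
    overlap = sum((Counter(pred_tokens) & Counter(gold_tokens)).values())
    if overlap == 0:
        return 0.0
    precision = overlap / len(pred_tokens)
    recall = overlap / len(gold_tokens)
    return 2 * precision * recall / (precision + recall)
```

EM rewards only a fully correct answer string, while F1 gives partial credit for overlapping tokens, which is why the two are usually reported together.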

Reinforcement learning has recently shown promise in improving retrieval-augmented generation (RAG). Despite these advances, its effectiveness in multi-hop question answering (QA) remains constrained by two fundamental limitations: (i) the absence of global planning to structure multi-step reasoning, and (ii) unfaithful execution, which hinders effective query formulation and consistent use of retrieved evidence. We propose GlobalRAG, a reinforcement learning framework designed to enhance global reasoning in multi-hop QA. GlobalRAG decomposes questions into subgoals, coordinates retrieval with reasoning, and refines evidence iteratively. To guide this process, we introduce a Planning Quality Reward and a SubGoal Completion Reward, which encourage coherent planning and reliable subgoal execution. In addition, a progressive weight annealing strategy balances process-oriented and outcome-based objectives. Extensive experiments on both in-domain and out-of-domain benchmarks demonstrate that GlobalRAG significantly outperforms strong baselines while using only 8k training data (42% of the training data used by strong baselines), achieving average improvements of 14.2% in both EM and F1.
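The abstract does not give the annealing schedule or the exact way the rewards are combined, so the following is only an illustrative sketch of what balancing process-oriented and outcome-based objectives could look like. The cosine schedule, the equal averaging of the two process rewards, and every name here (`process_weight`, `combined_reward`, `w_start`, `w_end`) are assumptions, not the paper's method.

```python
import math


def process_weight(step: int, total_steps: int,
                   w_start: float = 1.0, w_end: float = 0.0) -> float:
    """Cosine-anneal the process-reward weight from w_start to w_end.

    Assumed schedule for illustration; total_steps must be positive.
    """
    progress = min(max(step / total_steps, 0.0), 1.0)
    return w_end + 0.5 * (w_start - w_end) * (1.0 + math.cos(math.pi * progress))


def combined_reward(planning_reward: float, subgoal_reward: float,
                    outcome_reward: float, step: int, total_steps: int) -> float:
    """Blend process rewards with the outcome reward for one rollout.

    Early in training the process terms dominate; late in training the
    outcome term (e.g. answer correctness) dominates.
    """
    w = process_weight(step, total_steps)
    # Assumption: the two process rewards are simply averaged.
    process = 0.5 * (planning_reward + subgoal_reward)
    return w * process + (1.0 - w) * outcome_reward
```

Under these assumptions, a rollout with a good plan but a wrong final answer is still rewarded early on, and the same rollout earns progressively less as the weight shifts toward the outcome term.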

