CLIR · Mar 23, 2025

GINGER: Grounded Information Nugget-Based Generation of Responses

arXiv:2503.18174v1 · 17 citations · h-index: 47 · SIGIR
Originality: Incremental advance
AI Analysis

This paper addresses factual accuracy and source attribution in RAG systems, which matters to users who rely on AI-generated responses. The contribution appears incremental: it builds on existing RAG methods, and its novelty lies in the modular approach.

The paper tackles challenges in retrieval-augmented generation (RAG) related to factual correctness and source attribution by proposing GINGER, a modular pipeline that uses information nuggets for grounded response generation, achieving state-of-the-art performance on the TREC RAG'24 dataset.

Retrieval-augmented generation (RAG) faces challenges related to factual correctness, source attribution, and response completeness. To address them, we propose a modular pipeline for grounded response generation that operates on information nuggets: minimal, atomic units of relevant information extracted from retrieved documents. The multistage pipeline encompasses nugget detection, clustering, ranking, top cluster summarization, and fluency enhancement. It guarantees grounding in specific facts, facilitates source attribution, and ensures maximum information inclusion within length constraints. Extensive experiments on the TREC RAG'24 dataset evaluated with the AutoNuggetizer framework demonstrate that GINGER achieves state-of-the-art performance on this benchmark.
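
The five stages named in the abstract can be illustrated with a toy sketch. This is not the authors' implementation: the paper's stages are replaced here by simple lexical heuristics (term overlap for detection and ranking, Jaccard similarity for clustering, shortest-nugget selection for summarization, plain concatenation for fluency enhancement). All names (`Nugget`, `detect_nuggets`, `cluster_nuggets`, `rank_clusters`, `summarize_cluster`, `generate_response`, `top_k`, `threshold`) are hypothetical and chosen only for illustration.

```python
import re
from dataclasses import dataclass


@dataclass(frozen=True)
class Nugget:
    text: str    # the atomic piece of information
    doc_id: str  # source document, kept for attribution


def _tokens(text):
    """Lowercased alphanumeric tokens as a set."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def detect_nuggets(query, docs):
    """Stage 1 (toy stand-in): keep sentences sharing a term with the query."""
    q = _tokens(query)
    nuggets = []
    for doc_id, body in docs.items():
        for sent in re.split(r"(?<=[.!?])\s+", body.strip()):
            if sent and q & _tokens(sent):
                nuggets.append(Nugget(sent, doc_id))
    return nuggets


def cluster_nuggets(nuggets, threshold=0.5):
    """Stage 2 (toy stand-in): greedy clustering by Jaccard overlap
    with the first nugget of each existing cluster."""
    clusters = []
    for n in nuggets:
        a = _tokens(n.text)
        for c in clusters:
            b = _tokens(c[0].text)
            if len(a & b) / len(a | b) >= threshold:
                c.append(n)
                break
        else:
            clusters.append([n])
    return clusters


def rank_clusters(query, clusters):
    """Stage 3 (toy stand-in): sort by query-term overlap, then cluster size."""
    q = _tokens(query)

    def score(c):
        joined = " ".join(n.text for n in c)
        return (len(q & _tokens(joined)), len(c))

    return sorted(clusters, key=score, reverse=True)


def summarize_cluster(cluster):
    """Stage 4 (toy stand-in): use the shortest nugget as the summary
    and collect every source document in the cluster."""
    text = min(cluster, key=lambda n: len(n.text)).text
    sources = sorted({n.doc_id for n in cluster})
    return text, sources


def generate_response(query, docs, top_k=2):
    """Run all stages; top_k acts as a crude length constraint."""
    ranked = rank_clusters(query, cluster_nuggets(detect_nuggets(query, docs)))
    parts = [f"{text} [{', '.join(srcs)}]"
             for text, srcs in map(summarize_cluster, ranked[:top_k])]
    # Stage 5 (fluency enhancement) is reduced to plain concatenation here.
    return " ".join(parts)


docs = {
    "d1": "Ginger root reduces nausea. It is sold in many shops.",
    "d2": "Ginger root reduces nausea in pregnancy. Bananas are yellow.",
    "d3": "Ginger tea may ease a sore throat.",
}
print(generate_response("ginger nausea throat", docs))
```

Because each summary carries the document identifiers of every nugget in its cluster, attribution falls out of the data structure rather than being reconstructed after generation, which is the property the abstract emphasizes. Keeping only the top-ranked clusters is one simple way to respect a length budget while favoring the most relevant information.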

Code Implementations: 1 repo
