IR · LG · Jan 25, 2025

CG-RAG: Research Question Answering by Citation Graph Retrieval-Augmented LLMs

arXiv:2501.15067v1 · 14 citations · h-index: 10 · SIGIR
Originality: Incremental advance
AI Analysis

The paper improves research question answering for users of scientific literature by enhancing both retrieval and generation, though it appears incremental because it builds on existing RAG and graph methods.

The paper addresses limitations of current Retrieval-Augmented Generation (RAG) methods for research question answering. It introduces CG-RAG, a framework that integrates sparse and dense retrieval signals within graph structures, and reports that it significantly outperforms RAG pipelines built on state-of-the-art retrieval approaches in both retrieval accuracy and generation quality across multiple domains.

Research question answering requires accurate retrieval and contextual understanding of scientific literature. However, current Retrieval-Augmented Generation (RAG) methods often struggle to balance complex document relationships with precise information retrieval. In this paper, we introduce Contextualized Graph Retrieval-Augmented Generation (CG-RAG), a novel framework that integrates sparse and dense retrieval signals within graph structures to enhance retrieval efficiency and subsequently improve generation quality for research question answering. First, we propose a contextual graph representation for citation graphs, effectively capturing both explicit and implicit connections within and across documents. Next, we introduce Lexical-Semantic Graph Retrieval (LeSeGR), which seamlessly integrates sparse and dense retrieval signals with graph encoding. It bridges the gap between lexical precision and semantic understanding in citation graph retrieval, demonstrating generalizability to existing graph retrieval and hybrid retrieval methods. Finally, we present a context-aware generation strategy that utilizes the retrieved graph-structured information to generate precise and contextually enriched responses using large language models (LLMs). Extensive experiments on research question answering benchmarks across multiple domains demonstrate that our CG-RAG framework significantly outperforms RAG methods combined with various state-of-the-art retrieval approaches, delivering superior retrieval accuracy and generation quality.
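To make the idea of combining sparse and dense signals over a citation graph more concrete, here is a minimal sketch. It is not the paper's LeSeGR method: it assumes a plain weighted sum of a term-overlap score and a cosine similarity, followed by one step of neighbour averaging over citation links. The function names, the `alpha` and `beta` weights, the toy texts, and the hand-written 2-d vectors are all hypothetical placeholders for illustration.

```python
import math


def lexical_score(query, text):
    """Sparse signal: fraction of distinct query terms found in the text."""
    q_terms = set(query.lower().split())
    d_terms = set(text.lower().split())
    return len(q_terms & d_terms) / len(q_terms) if q_terms else 0.0


def cosine(u, v):
    """Dense signal: cosine similarity between two vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def hybrid_graph_rank(query, query_vec, nodes, edges, alpha=0.5, beta=0.3):
    """Rank nodes by a hybrid score smoothed over citation neighbours.

    alpha weights sparse vs. dense; beta weights neighbour influence.
    Both are illustrative assumptions, not values from the paper.
    """
    # Per-node hybrid score: weighted sum of sparse and dense signals.
    base = {
        nid: alpha * lexical_score(query, n["text"])
        + (1 - alpha) * cosine(query_vec, n["vec"])
        for nid, n in nodes.items()
    }

    # Treat citation links as undirected for this sketch.
    neighbours = {nid: set() for nid in nodes}
    for src, dst in edges:
        neighbours[src].add(dst)
        neighbours[dst].add(src)

    # One smoothing step: mix each score with the mean of its neighbours.
    final = {}
    for nid, score in base.items():
        nbrs = neighbours[nid]
        if nbrs:
            nbr_mean = sum(base[m] for m in nbrs) / len(nbrs)
            final[nid] = (1 - beta) * score + beta * nbr_mean
        else:
            final[nid] = score  # isolated nodes keep their own score
    return sorted(final.items(), key=lambda kv: kv[1], reverse=True)


# Toy data (hypothetical): three text chunks, one citation link A -> B.
nodes = {
    "A": {"text": "graph retrieval for citation networks", "vec": [1.0, 0.0]},
    "B": {"text": "dense passage retrieval", "vec": [0.8, 0.6]},
    "C": {"text": "protein folding benchmarks", "vec": [0.0, 1.0]},
}
edges = [("A", "B")]

ranked = hybrid_graph_rank("citation graph retrieval", [1.0, 0.0], nodes, edges)
for node_id, score in ranked:
    print(node_id, round(score, 3))
```

In this toy setup, chunk B matches the query only partially on its own but gains score from its citation link to the strongly matching chunk A, which is the kind of effect graph-aware retrieval aims for. A real system would replace the term-overlap score with something like BM25 and the hand-written vectors with learned embeddings.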

