CL | Apr 24

Bridging the Long-Tail Gap: Robust Retrieval-Augmented Relation Completion via Multi-Stage Paraphrase Infusion

Fahmida Alam, Mihai Surdeanu, Ellen Riloff

arXiv:2604.22261 | 26.5 | h-index: 1

AI Analysis

For practitioners needing robust relation completion on sparse data, this method offers a training-free, computationally efficient improvement over existing RAG approaches.

The paper tackles relation completion (RC) with large language models, especially for rare relations. The proposed RC-RAG framework integrates relation paraphrases into retrieval, summarization, and generation without fine-tuning. In long-tail settings, the best-performing LLM gains 40.6 Exact Match points over its standalone performance and beats two strong RAG baselines by 16.0 and 13.8 points.

Large language models (LLMs) struggle with relation completion (RC), both with and without retrieval-augmented generation (RAG), particularly when the required information is rare or sparsely represented. To address this, we propose a novel multi-stage paraphrase-guided relation-completion framework, RC-RAG, that systematically incorporates relation paraphrases across multiple stages. In particular, RC-RAG: (a) integrates paraphrases into retrieval to expand lexical coverage of the relation, (b) uses paraphrases to generate relation-aware summaries, and (c) leverages paraphrases during generation to guide reasoning for relation completion. Importantly, our method does not require any model fine-tuning. Experiments with five LLMs on two benchmark datasets show that RC-RAG consistently outperforms several RAG baselines. In long-tail settings, the best-performing LLM augmented with RC-RAG improves by 40.6 Exact Match (EM) points over its standalone performance and surpasses two strong RAG baselines by 16.0 and 13.8 EM points, respectively, while maintaining low computational overhead.
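The three stages named in the abstract (paraphrase-expanded retrieval, relation-aware summarization, paraphrase-guided generation) can be illustrated with a minimal Python sketch. This is not the authors' implementation: the paraphrase table, the token-overlap scorer, the prompt wording, and every function name below are hypothetical stand-ins chosen for illustration, and a real system would use a proper retriever and an LLM call in place of the prompt builders.

```python
import re
from typing import Dict, List, Set

# Hypothetical paraphrase table (assumption): the abstract does not say
# how paraphrases are obtained, so they are hard-coded here.
PARAPHRASES: Dict[str, List[str]] = {
    "place of birth": ["was born in", "birthplace", "native of"],
}


def tokenize(text: str) -> Set[str]:
    """Lowercase the text and return its set of alphanumeric tokens."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def expand_queries(subject: str, relation: str,
                   paraphrases: Dict[str, List[str]]) -> List[str]:
    """Stage (a): one query per relation surface form, original first."""
    phrases = [relation] + paraphrases.get(relation, [])
    return [f"{subject} {phrase}" for phrase in phrases]


def retrieve(subject: str, relation: str, corpus: List[str],
             paraphrases: Dict[str, List[str]], k: int = 1) -> List[str]:
    """Rank passages by their best token overlap with any expanded query.

    Token overlap is a toy stand-in for a real lexical or dense retriever.
    """
    queries = [tokenize(q) for q in expand_queries(subject, relation, paraphrases)]
    scored = []
    for passage in corpus:
        passage_tokens = tokenize(passage)
        best = max(len(q & passage_tokens) for q in queries)
        scored.append((best, passage))
    # list.sort is stable, so ties keep their corpus order.
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [passage for score, passage in scored[:k] if score > 0]


def build_summary_prompt(subject: str, relation: str, passages: List[str],
                         paraphrases: Dict[str, List[str]]) -> str:
    """Stage (b): ask for a summary focused on the relation and its paraphrases."""
    forms = ", ".join([relation] + paraphrases.get(relation, []))
    joined = "\n".join(f"- {p}" for p in passages)
    return (f"Summarize what the passages say about {subject} with respect to "
            f"the relation expressed as: {forms}.\nPassages:\n{joined}")


def build_answer_prompt(subject: str, relation: str, summary: str,
                        paraphrases: Dict[str, List[str]]) -> str:
    """Stage (c): restate the relation via its paraphrases when asking for the object."""
    forms = ", ".join([relation] + paraphrases.get(relation, []))
    return (f"Summary: {summary}\n"
            f"The relation may be phrased as: {forms}.\n"
            f"Complete the triple ({subject}, {relation}, ?). Answer with the object only.")


if __name__ == "__main__":
    corpus = [
        "Ada Lovelace wrote notes on the Analytical Engine.",
        "Ada Lovelace was born in London in 1815.",
        "London is the capital of England.",
    ]
    top = retrieve("Ada Lovelace", "place of birth", corpus, PARAPHRASES, k=1)
    print(build_summary_prompt("Ada Lovelace", "place of birth", top, PARAPHRASES))
```

In this toy corpus the birth passage never contains the literal phrase "place of birth", so a query built from the relation name alone ties it with the unrelated passage about the Analytical Engine. Adding the paraphrase "was born in" breaks the tie in favor of the relevant passage, which is the lexical-coverage effect the abstract attributes to stage (a).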

