CL | May 13, 2025

IterKey: Iterative Keyword Generation with LLMs for Enhanced Retrieval Augmented Generation

arXiv:2505.08450v2 | 5 citations | h-index: 14
Originality: Incremental advance
AI Analysis

The paper addresses the problem of balancing accuracy and interpretability in RAG for real-world applications, offering an incremental improvement over existing sparse retrieval methods.

The paper tackles the trade-off between accuracy and interpretability in Retrieval-Augmented Generation (RAG) by introducing IterKey, an LLM-driven iterative keyword generation framework that enhances sparse retrieval, achieving 5% to 20% accuracy improvements over BM25-based RAG baselines across four QA tasks.

Retrieval-Augmented Generation (RAG) has emerged as a way to complement the in-context knowledge of Large Language Models (LLMs) by integrating external documents. However, real-world applications demand not only accuracy but also interpretability. While dense retrieval methods provide high accuracy, they lack interpretability; conversely, sparse retrieval methods offer transparency but often fail to capture the full intent of queries due to their reliance on keyword matching. To address these issues, we introduce IterKey, an LLM-driven iterative keyword generation framework that enhances RAG via sparse retrieval. IterKey consists of three LLM-driven stages: generating keywords for retrieval, generating answers based on retrieved documents, and validating the answers. If validation fails, the process iteratively repeats with refined keywords. Across four QA tasks, experimental results show that IterKey achieves 5% to 20% accuracy improvements over BM25-based RAG and simple baselines. Its performance is comparable to dense retrieval-based RAG and prior iterative query refinement methods using dense models. In summary, IterKey is a novel BM25-based approach leveraging LLMs to iteratively refine RAG, effectively balancing accuracy with interpretability.
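The three-stage loop described in the abstract can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the `BM25Index` class, the `iterkey` function, the callables `generate_keywords`, `generate_answer`, and `validate` (stand-ins for LLM calls), the `max_iters` cap, and the choice to return the last answer when validation never succeeds are all assumptions made for the example.

```python
import math
import re
from collections import Counter
from typing import Callable, List, Tuple


def tokenize(text: str) -> List[str]:
    """Lowercase and split on non-alphanumeric characters."""
    return re.findall(r"[a-z0-9]+", text.lower())


class BM25Index:
    """Small in-memory BM25 (Okapi) scorer; hypothetical helper for this sketch."""

    def __init__(self, docs: List[str], k1: float = 1.5, b: float = 0.75):
        self.docs, self.k1, self.b = docs, k1, b
        self.tokens = [tokenize(d) for d in docs]
        self.tf = [Counter(t) for t in self.tokens]
        self.avgdl = (sum(len(t) for t in self.tokens) / len(docs)) if docs else 1.0
        self.avgdl = self.avgdl or 1.0  # avoid division by zero for empty docs
        df = Counter(term for t in self.tokens for term in set(t))
        n = len(docs)
        # The +1 inside the log keeps every IDF value positive.
        self.idf = {t: math.log(1 + (n - f + 0.5) / (f + 0.5)) for t, f in df.items()}

    def top_k(self, keywords: List[str], k: int = 2) -> List[str]:
        """Return up to k documents with a positive BM25 score, best first."""
        query = tokenize(" ".join(keywords))
        scores = []
        for tf, toks in zip(self.tf, self.tokens):
            norm = self.k1 * (1 - self.b + self.b * len(toks) / self.avgdl)
            scores.append(sum(
                self.idf[t] * tf[t] * (self.k1 + 1) / (tf[t] + norm)
                for t in query if t in tf
            ))
        ranked = sorted(range(len(self.docs)), key=lambda i: scores[i], reverse=True)
        return [self.docs[i] for i in ranked[:k] if scores[i] > 0]


def iterkey(
    question: str,
    index: BM25Index,
    generate_keywords: Callable[[str, List[str]], List[str]],
    generate_answer: Callable[[str, List[str]], str],
    validate: Callable[[str, List[str], str], bool],
    max_iters: int = 5,  # assumed cap; not taken from the source
    top_k: int = 2,
) -> Tuple[str, List[str]]:
    """Repeat keyword generation, retrieval, answering, and validation."""
    keywords: List[str] = []
    answer = ""
    for _ in range(max_iters):
        # Stage 1: generate (or refine) keywords; previous keywords are passed in.
        keywords = generate_keywords(question, keywords)
        docs = index.top_k(keywords, k=top_k)
        # Stage 2: answer from the retrieved documents.
        answer = generate_answer(question, docs)
        # Stage 3: validate; stop on success, otherwise refine and retry.
        if validate(question, docs, answer):
            break
    # Assumption: fall back to the last answer if validation never passes.
    return answer, keywords
```

Because the retrieval step is driven entirely by an explicit keyword list, each iteration leaves an inspectable trace of which terms were tried, which is the interpretability property the abstract attributes to sparse retrieval. In practice the three callables would wrap prompts to an LLM; any function with the matching signature can be substituted.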

