CL · Nov 9, 2024

KBM: Delineating Knowledge Boundary for Adaptive Retrieval in Large Language Models

Zhen Zhang, Xinyu Wang, Yong Jiang, Zile Qiao, Zhuo Chen, Guangyu Li, Feiteng Mu, Mengting Hu, Pengjun Xie, Fei Huang

arXiv:2411.06207v22.73 citations · h-index: 29 · EMNLP

Originality: Incremental advance

AI Analysis

The work addresses efficiency and cost issues in LLM applications by triggering retrieval adaptively, though the advance is incremental because it builds on existing RAG methods.

The paper tackles unnecessary retrieval in Retrieval-Augmented Generation (RAG) for Large Language Models. It proposes a Knowledge Boundary Model (KBM) that determines when retrieval is needed, which significantly lowers the proportion of questions that require retrieval to reach optimal end-to-end performance across 11 English and Chinese datasets.

Large Language Models (LLMs) often struggle with dynamically changing knowledge and with unknown static information. Retrieval-Augmented Generation (RAG) is employed to tackle these challenges and significantly improves LLM performance. However, we find that not all questions need to trigger RAG. By retrieving only the knowledge unknown to the LLM and letting the LLM answer the rest on its own, we can effectively reduce both time and computational costs. In our work, we propose a Knowledge Boundary Model (KBM) that expresses whether a given question is known or unknown to the LLM and determines whether RAG needs to be triggered. Experiments conducted on 11 English and Chinese datasets show that the KBM effectively delineates the knowledge boundary, significantly decreasing the proportion of retrievals required for optimal end-to-end performance. Furthermore, we evaluate the effectiveness of KBM in three complex scenarios (dynamic knowledge, long-tail static knowledge, and multi-hop problems) as well as its functionality as an external LLM plug-in.
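The gating idea in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the names `answer_with_adaptive_retrieval`, `retrieval_ratio`, and the `toy_*` stand-ins are hypothetical, and the known/unknown judge is reduced here to a lookup in a fixed set, whereas the paper uses a trained Knowledge Boundary Model.

```python
from typing import Callable, List, Tuple


def answer_with_adaptive_retrieval(
    question: str,
    is_known: Callable[[str], bool],
    retrieve: Callable[[str], List[str]],
    generate: Callable[[str, List[str]], str],
) -> Tuple[str, bool]:
    """Answer a question, calling the retriever only when the judge says 'unknown'.

    Returns the answer and a flag telling whether retrieval was triggered.
    """
    if is_known(question):
        # Judge says the LLM already knows this: skip retrieval entirely.
        return generate(question, []), False
    passages = retrieve(question)
    return generate(question, passages), True


def retrieval_ratio(questions: List[str], is_known: Callable[[str], bool]) -> float:
    """Fraction of questions for which retrieval would be triggered."""
    if not questions:
        return 0.0
    triggered = sum(1 for q in questions if not is_known(q))
    return triggered / len(questions)


# --- Toy stand-ins (assumptions for illustration only) ---
KNOWN_QUESTIONS = {"What is the capital of France?"}


def toy_is_known(question: str) -> bool:
    # Stand-in for a knowledge boundary judge.
    return question in KNOWN_QUESTIONS


def toy_retrieve(question: str) -> List[str]:
    # Stand-in for a search or vector-store call.
    return ["passage about: " + question]


def toy_generate(question: str, passages: List[str]) -> str:
    # Stand-in for an LLM call; reports how many passages it was given.
    return "answer to '{}' using {} passage(s)".format(question, len(passages))


if __name__ == "__main__":
    qs = ["What is the capital of France?", "Who won yesterday's match?"]
    for q in qs:
        print(answer_with_adaptive_retrieval(q, toy_is_known, toy_retrieve, toy_generate))
    print("retrieval ratio:", retrieval_ratio(qs, toy_is_known))
```

In this sketch the cost saving comes solely from skipping `retrieve` for questions the judge marks as known; how well that works in practice depends on the accuracy of the judge, which is the part the paper's KBM addresses.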
