CL AI Jun 20, 2024

A Learn-Then-Reason Model Towards Generalization in Knowledge Base Question Answering

Lingxi Zhang, Jing Zhang, Yanling Wang, Cuiping Li, Hong Chen

arXiv:2406.14763v1

Originality: Highly original

AI Analysis

This work addresses the limited ability of existing KBQA models to generalize to new knowledge, offering a flexible end-to-end way for users to query large knowledge bases with natural-language questions.

The paper aims to improve generalization in Knowledge Base Question Answering (KBQA) by proposing KBLLaMA, a learn-then-reason framework that injects new knowledge into a large language model. It reports state-of-the-art performance, with gains of up to 3.8% on GrailQA and 9.8% on the Bio-chemical benchmark.

Large-scale knowledge bases (KBs) like Freebase and Wikidata house millions of structured facts. Knowledge Base Question Answering (KBQA) provides a user-friendly way to access these valuable KBs by asking natural-language questions. To improve the generalization capabilities of KBQA models, extensive research has embraced a retrieve-then-reason framework that retrieves relevant evidence for logical expression generation. These multi-stage efforts prioritize acquiring external sources but overlook the incorporation of new knowledge into the model parameters. In effect, even advanced language models and retrievers have knowledge boundaries, which limits the generalization capabilities of previous KBQA models. Therefore, this paper develops KBLLaMA, which follows a learn-then-reason framework to inject new KB knowledge into a large language model for flexible end-to-end KBQA. At the core of KBLLaMA, we study (1) how to organize new knowledge about KBQA and (2) how to facilitate the learning of the organized knowledge. Extensive experiments on various KBQA generalization tasks showcase the state-of-the-art performance of KBLLaMA. In particular, on the general benchmark GrailQA and the domain-specific benchmark Bio-chemical, KBLLaMA achieves performance gains of up to 3.8% and 9.8%, respectively, over the baselines.
