CL · AI · DB · Jan 31, 2025

KBQA-o1: Agentic Knowledge Base Question Answering with Monte Carlo Tree Search

MIT
arXiv:2501.18922v4 · 27 citations · h-index: 34 · Has Code · ICML
Originality: Highly original
AI Analysis

This work addresses the challenge of weak KB awareness and data efficiency in KBQA for AI systems, representing a strong specific gain rather than a foundational advancement.

The paper tackles Knowledge Base Question Answering (KBQA) with limited annotated data by proposing KBQA-o1, an agentic method using Monte Carlo Tree Search (MCTS). It raises the Llama-3.1-8B model's GrailQA F1 to 78.5%, compared with 48.5% for the previous state-of-the-art low-resource method, which uses GPT-3.5-turbo.

Knowledge Base Question Answering (KBQA) aims to answer natural language questions with a large-scale structured knowledge base (KB). Despite advancements with large language models (LLMs), KBQA still faces three challenges: weak KB awareness, an imbalance between effectiveness and efficiency, and high reliance on annotated data. To address these challenges, we propose KBQA-o1, a novel agentic KBQA method with Monte Carlo Tree Search (MCTS). It introduces a ReAct-based agent process for stepwise logical form generation with KB environment exploration. Moreover, it employs MCTS, a heuristic search method driven by policy and reward models, to balance the performance of agentic exploration against the size of its search space. With heuristic exploration, KBQA-o1 generates high-quality annotations for further improvement by incremental fine-tuning. Experimental results show that KBQA-o1 outperforms previous low-resource KBQA methods with limited annotated data, boosting the Llama-3.1-8B model's GrailQA F1 performance to 78.5%, compared to 48.5% for the previous state-of-the-art method with GPT-3.5-turbo. Our code is publicly available.
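
To make the search idea in the abstract concrete, here is a minimal Python sketch of generic MCTS (selection, expansion, simulation, backpropagation) over stepwise sequences. It is not the paper's implementation. Everything in it is an assumption for illustration: the names `Node`, `uct_select`, and `mcts`, the toy two-action environment, the UCT selection rule, and the random rollout with a hand-written reward that stand in for the learned policy and reward models.

```python
import math
import random


class Node:
    """One partial sequence of steps (hypothetical stand-in for a partial logical form)."""

    def __init__(self, state, parent=None):
        self.state = state        # tuple of steps taken so far
        self.parent = parent
        self.children = {}        # action -> Node
        self.visits = 0
        self.value_sum = 0.0

    def value(self):
        return self.value_sum / self.visits if self.visits else 0.0


def uct_select(node, c):
    """Pick the child with the best mean value plus exploration bonus (UCT)."""
    def score(child):
        if child.visits == 0:
            return float("inf")   # always try unvisited children first
        return child.value() + c * math.sqrt(math.log(node.visits) / child.visits)
    return max(node.children.values(), key=score)


def mcts(root_state, actions_fn, is_terminal, reward_fn, iterations=500, c=1.0, seed=0):
    rng = random.Random(seed)
    root = Node(root_state)
    for _ in range(iterations):
        node = root
        # 1. Selection: descend through already-expanded nodes.
        while node.children:
            node = uct_select(node, c)
        # 2. Expansion: add all children of a non-terminal leaf, then pick one.
        if not is_terminal(node.state):
            for action in actions_fn(node.state):
                node.children[action] = Node(node.state + (action,), parent=node)
            node = rng.choice(list(node.children.values()))
        # 3. Simulation: random rollout (a learned policy would go here).
        state = node.state
        while not is_terminal(state):
            state = state + (rng.choice(actions_fn(state)),)
        reward = reward_fn(state)  # a learned reward model would go here
        # 4. Backpropagation: update statistics along the path to the root.
        while node is not None:
            node.visits += 1
            node.value_sum += reward
            node = node.parent
    # Read off the most-visited path as the final answer.
    node = root
    while node.children:
        node = max(node.children.values(), key=lambda child: child.visits)
    return node.state


# Toy environment (assumption): build a 3-step sequence from two actions.
TARGET = ("a", "b", "a")

def actions_fn(state):
    return ["a", "b"]

def is_terminal(state):
    return len(state) == len(TARGET)

def reward_fn(state):
    # Fraction of positions that match the target sequence.
    return sum(s == t for s, t in zip(state, TARGET)) / len(TARGET)

best = mcts((), actions_fn, is_terminal, reward_fn)
print(best)
```

In an agentic KBQA setting, `actions_fn` would roughly correspond to candidate next steps grounded in the KB, and the exploration constant `c` trades off search breadth against exploiting steps that already score well.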

Code Implementations: 1 repo
