CL, AI | Oct 27, 2025

OraPlan-SQL: A Planning-Centric Framework for Complex Bilingual NL2SQL Reasoning

arXiv:2510.23870v1 | 1 citation | h-index: 5
Originality: Incremental advance
AI Analysis

OraPlan-SQL addresses the challenge of generating accurate, reliable SQL from natural language queries in bilingual (English and Chinese) settings, for database users and developers.

The paper tackles the problem of complex bilingual natural language to SQL conversion by introducing OraPlan-SQL, which achieved first place in the Archer NL2SQL Evaluation Challenge 2025 with execution accuracies of 55.0% in English and 56.7% in Chinese, exceeding the second-best system by over 6%.

We present OraPlan-SQL, our system for the Archer NL2SQL Evaluation Challenge 2025, a bilingual benchmark requiring complex reasoning such as arithmetic, commonsense, and hypothetical inference. OraPlan-SQL ranked first, exceeding the second-best system by more than 6% in execution accuracy (EX), with 55.0% in English and 56.7% in Chinese, while maintaining over 99% SQL validity (VA). Our system follows an agentic framework with two components: a Planner agent that generates stepwise natural language plans, and a SQL agent that converts these plans into executable SQL. Since the SQL agent reliably adheres to the plan, our refinements focus on the planner. Unlike prior methods that rely on multiple sub-agents for planning and suffer from orchestration overhead, we introduce a feedback-guided meta-prompting strategy to refine a single planner. Failure cases from a held-out set are clustered with human input, and an LLM distills them into corrective guidelines that are integrated into the planner's system prompt, improving generalization without added complexity. For the multilingual scenario, to address transliteration and entity mismatch issues, we incorporate entity-linking guidelines that generate alternative surface forms for entities and explicitly include them in the plan. Finally, we enhance reliability through plan diversification: multiple candidate plans are generated for each query, the SQL agent produces a query for each plan, and the final output is selected via majority voting over their execution results.
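The final selection step, majority voting over execution results, can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes an in-memory SQLite database as a stand-in for the benchmark databases, hard-coded candidate queries in place of the SQL agent's output, and hypothetical helper names (`execute`, `majority_vote`). Skipping candidates that fail to execute and breaking ties in favor of the earliest candidate are also assumptions; the abstract does not specify either.

```python
import sqlite3
from collections import Counter


def execute(conn, sql):
    """Run one candidate query.

    Returns a hashable, row-order-insensitive result, or None if the
    query fails to execute (assumption: failed candidates get no vote).
    """
    try:
        rows = conn.execute(sql).fetchall()
    except sqlite3.Error:
        return None
    return tuple(sorted(rows, key=repr))


def majority_vote(conn, candidates):
    """Return the first candidate whose result matches the most common result."""
    executed = [(sql, execute(conn, sql)) for sql in candidates]
    valid = [(sql, res) for sql, res in executed if res is not None]
    if not valid:
        return None
    # most_common keeps first-seen order among equal counts, so ties
    # resolve to the earliest candidate.
    winner, _ = Counter(res for _, res in valid).most_common(1)[0]
    return next(sql for sql, res in valid if res == winner)


# Toy database (hypothetical schema, for illustration only).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (id INTEGER, amount REAL)")
conn.executemany(
    "INSERT INTO orders VALUES (?, ?)",
    [(1, 10.0), (2, 20.0), (3, 30.0)],
)

# Stand-ins for the queries a SQL agent might produce from diverse plans.
candidates = [
    "SELECT SUM(amount) FROM orders",
    "SELECT SUM(amount) FROM orders WHERE id > 0",
    "SELECT AVG(amount) FROM orders",
    "SELECT SUM(amont) FROM orders",  # invalid column: fails, gets no vote
]

chosen = majority_vote(conn, candidates)
print(chosen)
```

Voting on execution results rather than on SQL text lets syntactically different but semantically equivalent queries reinforce each other, as the first two candidates do here.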
