PV-SQL: Synergizing Database Probing and Rule-based Verification for Text-to-SQL Agents
PV-SQL is an agentic text-to-SQL framework that targets failures on complex queries and outperforms existing baselines.
Using database probing and rule-based verification, PV-SQL improves execution accuracy by 5% and valid efficiency score by 20.8% over the best text-to-SQL baseline on the BIRD benchmarks.
Text-to-SQL systems often struggle with deep contextual understanding, particularly for complex queries with subtle requirements. We present PV-SQL, an agentic framework that addresses these failures through two complementary components: Probe and Verify. The Probe component iteratively generates probing queries to retrieve concrete records from the database, resolving ambiguities in value formats, column semantics, and inter-table relationships to build richer contextual understanding. The Verify component employs a rule-based method to extract verifiable conditions and construct an executable checklist, enabling iterative SQL refinement that effectively reduces missing constraints. Experiments on the BIRD benchmarks show that PV-SQL outperforms the best text-to-SQL baseline by 5% in execution accuracy and 20.8% in valid efficiency score while consuming fewer tokens.
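To make the Probe and Verify ideas concrete, the following is a minimal Python sketch, not the authors' implementation. It assumes a SQLite database and replaces the LLM-driven steps with fixed stand-ins: `probe_distinct_values` issues one hard-coded probing query, and `RULES` is a hypothetical three-entry rule set. The `schools` table, its columns, the rule patterns, and all function names are illustrative assumptions that do not come from the paper.

```python
import re
import sqlite3


def probe_distinct_values(conn, table, column, limit=5):
    """Probe (sketch): fetch a few concrete values to reveal the value format."""
    # Identifiers are assumed to come from the schema, not from user input.
    sql = f'SELECT DISTINCT "{column}" FROM "{table}" LIMIT ?'
    return [row[0] for row in conn.execute(sql, (limit,))]


# Hypothetical rules: (trigger pattern in the question, description, SQL check).
RULES = [
    (re.compile(r"\bhow many\b", re.I), "uses COUNT",
     lambda sql: "COUNT(" in sql.upper()),
    (re.compile(r"\b(highest|most|top|largest)\b", re.I), "takes a maximum",
     lambda sql: "MAX(" in sql.upper()
     or re.search(r"ORDER BY .* DESC", sql, re.I | re.S) is not None),
    (re.compile(r"\b(distinct|different|unique)\b", re.I), "uses DISTINCT",
     lambda sql: "DISTINCT" in sql.upper()),
]


def build_checklist(question):
    """Verify (sketch): keep only the rules whose trigger appears in the question."""
    return [(desc, check) for pattern, desc, check in RULES if pattern.search(question)]


def verify(question, sql):
    """Return the descriptions of checklist items that the candidate SQL fails."""
    return [desc for desc, check in build_checklist(question) if not check(sql)]


# Toy database standing in for a benchmark database (illustrative data only).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE schools (name TEXT, county TEXT, charter TEXT)")
conn.executemany(
    "INSERT INTO schools VALUES (?, ?, ?)",
    [("A", "Alameda", "Y"), ("B", "Alameda", "N"), ("C", "Fresno", "Y")],
)

question = "How many different counties have charter schools?"

# Probing shows that the charter flag is stored as 'Y'/'N', not 1/0 or 'Yes'/'No'.
charter_values = probe_distinct_values(conn, "schools", "charter")

draft = "SELECT county FROM schools WHERE charter = 'Y'"
fixed = "SELECT COUNT(DISTINCT county) FROM schools WHERE charter = 'Y'"

print(sorted(charter_values))
print(verify(question, draft))  # failed checks would drive the next refinement
print(verify(question, fixed))
```

In this sketch the failed checklist items would be fed back to the SQL generator for another refinement round. String matching on the SQL text is a deliberate simplification; a checklist that parses or executes the query would be more robust.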