CL · May 8

PolySQL: Scaling Text-to-SQL Evaluation Across SQL Dialects via Automated Backend Isomorphism

arXiv:2605.07796 · 73.9
Predicted impact: top 85% in CL · last 90 days
Originality: Incremental advance
AI Analysis

For researchers evaluating Text-to-SQL systems, PolySQL provides a scalable, automated method to assess dialect robustness, revealing that SQLite performance is an unreliable proxy for other dialects.

PolySQL introduces a dual-execution method for cross-dialect Text-to-SQL evaluation that eliminates query transpilation, achieving 100% query coverage and higher evaluation fidelity than transpilation-based evaluation. The study reports a 10.1% average accuracy drop from SQLite to other dialects and attributes the degradation mainly to logical rather than syntactic errors (61% vs. 8%).

SQL dialects vary in syntax, types, and functions across database engines. Text-to-SQL benchmarks, however, predominantly support only SQLite. This creates a critical evaluation gap: cross-dialect evaluation reveals weak per-query agreement (measured by Cohen's κ), showing that SQLite performance is an unreliable proxy for other dialects. Yet such evaluation remains prohibitively difficult: existing approaches either require expensive manual query transpilation or rely on tools that often fail on complex SQL. To close this gap, we introduce PolySQL, a novel dual-execution method that eliminates the need for query transpilation by comparing normalized execution results. Notably, our approach achieves higher evaluation fidelity than query transpilation with 100% query coverage. PolySQL comprises three datasets, enabling the first large-scale cross-dialect study. Our study reveals a 10.1% average accuracy drop from SQLite to other dialects and identifies a significant dialect difficulty hierarchy. We find this degradation stems from logical rather than syntactic errors (61% vs. 8%). We release our framework code and leaderboard to enable rigorous dialect-robust evaluation.
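
The abstract does not spell out how execution results are normalized, so the following is only a minimal sketch of the general idea of comparing normalized execution results across two backends. The helper names (`normalize_value`, `normalize_rows`, `results_match`) and the specific normalization rules (numeric unification, whitespace stripping, order-insensitive comparison) are assumptions made for illustration, not PolySQL's actual implementation. Two in-memory SQLite databases stand in for two different engines.

```python
import sqlite3
from collections import Counter
from decimal import Decimal


def normalize_value(v, ndigits=6):
    """Map a backend-specific cell value to a common form (assumed rules)."""
    if v is None:
        return None
    if isinstance(v, bool):          # some drivers return booleans, others 0/1
        return int(v)
    if isinstance(v, int):
        return v
    if isinstance(v, (float, Decimal)):
        f = float(v)
        # Treat 5.0 and 5 as the same value; round other floats.
        return int(f) if f.is_integer() else round(f, ndigits)
    if isinstance(v, bytes):
        return v.decode("utf-8", errors="replace")
    return str(v).strip()            # e.g. padded CHAR columns


def normalize_rows(rows, ordered=False):
    """Normalize every cell; use a multiset unless row order matters."""
    norm = [tuple(normalize_value(v) for v in row) for row in rows]
    return norm if ordered else Counter(norm)


def results_match(pred_rows, gold_rows, ordered=False):
    """Compare two result sets after normalization."""
    return normalize_rows(pred_rows, ordered) == normalize_rows(gold_rows, ordered)


# Stand-in for the reference backend: prices stored as integers.
reference_db = sqlite3.connect(":memory:")
reference_db.execute("CREATE TABLE items (name TEXT, price INTEGER)")
reference_db.executemany("INSERT INTO items VALUES (?, ?)",
                         [("pen", 2), ("ink", 5)])

# Stand-in for a second engine: same data, but different types,
# trailing padding, and a different row order.
target_db = sqlite3.connect(":memory:")
target_db.execute("CREATE TABLE items (name TEXT, price REAL)")
target_db.executemany("INSERT INTO items VALUES (?, ?)",
                      [("ink ", 5.0), ("pen", 2.0)])

gold = reference_db.execute("SELECT name, price FROM items").fetchall()
pred = target_db.execute("SELECT name, price FROM items").fetchall()

unordered_ok = results_match(pred, gold)               # multiset comparison
ordered_ok = results_match(pred, gold, ordered=True)   # order-sensitive comparison
```

In this sketch the gold query and the predicted query each run on their own backend, and only the result sets are compared, so neither query has to be translated into the other dialect. Order-sensitive comparison would presumably be reserved for queries whose gold SQL contains an `ORDER BY`.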
