AI | CL | Mar 11, 2025

SQLCritic: Correcting Text-to-SQL Generation via Clause-wise Critic

arXiv:2503.07996v4 | 1 citation | h-index: 9
Originality: Incremental advance
AI Analysis

This work addresses the challenge of generating accurate SQL queries from natural language for database users, representing an incremental improvement over existing refinement methods.

The paper targets the limited effectiveness of LLM-based Text-to-SQL refinement methods, which often introduce new errors and fail to correct semantic inaccuracies. It introduces SQLCritic, a clause-wise critique model that the authors report significantly improves SQL accuracy on the BIRD and Spider datasets.

Existing refinement methods in LLM-based Text-to-SQL systems exhibit limited effectiveness. They often introduce new errors during the self-correction process and fail to detect and correct semantic inaccuracies. To address these gaps, we first introduce a clause-wise critique generation task along with a benchmark, SQLCriticBench, which performs fine-grained error localization, covering both syntax and semantic errors, at the clause level. Furthermore, we introduce a variant of DPO for training our SQLCritic model, in which the $\beta$ coefficient is adapted according to the clause-level inconsistencies between the preferred and dispreferred critiques. We also propose an automatic training dataset curation pipeline that annotates clause-wise critiques at scale in a cost-effective way. Experiments demonstrate that the SQLCritic model significantly improves SQL accuracy on the BIRD and Spider datasets, and the results on SQLCriticBench further reveal its superior critique capabilities compared to existing models.
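
To make the adaptive-$\beta$ idea in the abstract more concrete, the following is a minimal Python sketch under stated assumptions. The abstract does not give the paper's formula, so everything specific here is hypothetical: the clause list `CLAUSES`, the representation of a critique as a per-clause verdict dictionary, the helper names, and the linear mapping from inconsistency to $\beta$ (including its direction and the `beta_base` and `scale` parameters). Only the standard DPO loss, $-\log \sigma(\beta \cdot \text{margin})$, is taken as given.

```python
import math

# Hypothetical clause inventory; the paper's actual clause set is not given in the abstract.
CLAUSES = ("SELECT", "FROM", "WHERE", "GROUP BY", "HAVING", "ORDER BY", "LIMIT")


def clause_inconsistency(preferred: dict, dispreferred: dict) -> float:
    """Fraction of clauses on which two critiques give different verdicts.

    Each critique is modelled (as an assumption) as a dict mapping a clause
    name to a verdict string; clauses missing from a dict default to "ok".
    """
    differing = sum(
        1 for clause in CLAUSES
        if preferred.get(clause, "ok") != dispreferred.get(clause, "ok")
    )
    return differing / len(CLAUSES)


def adaptive_beta(inconsistency: float, beta_base: float = 0.1, scale: float = 1.0) -> float:
    """Assumed linear mapping: more clause-level disagreement gives a larger beta."""
    return beta_base * (1.0 + scale * inconsistency)


def dpo_loss(logp_w: float, logp_l: float,
             ref_logp_w: float, ref_logp_l: float, beta: float) -> float:
    """Standard DPO loss for one preference pair: -log(sigmoid(beta * margin)).

    logp_w / logp_l are policy log-probabilities of the preferred and
    dispreferred critiques; ref_* are the same under the frozen reference model.
    """
    margin = (logp_w - ref_logp_w) - (logp_l - ref_logp_l)
    x = beta * margin
    # Numerically stable form of log(1 + exp(-x)).
    return max(-x, 0.0) + math.log1p(math.exp(-abs(x)))


if __name__ == "__main__":
    preferred = {"WHERE": "wrong filter column"}   # flags the WHERE clause
    dispreferred = {}                              # claims every clause is fine
    beta = adaptive_beta(clause_inconsistency(preferred, dispreferred))
    print(dpo_loss(-12.0, -15.0, -13.0, -14.0, beta))
```

In this sketch the per-pair $\beta$ replaces the single global coefficient used in standard DPO, so pairs whose critiques disagree on more clauses are weighted differently from pairs that disagree on few. Whether SQLCritic scales $\beta$ up or down with inconsistency, and by what function, should be checked against the paper itself.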
