SpotIt+: Verification-based Text-to-SQL Evaluation with Database Constraints

Rocky Klopfenstein, Yang He, Andrew Tremante, Yuepeng Wang, Nina Narodytska, Haoze Wu

arXiv:2603.04334v1 | 2.31 citations | h-index: 26 | Has Code

Originality: Incremental advance

AI Analysis

SpotIt+ helps developers evaluate Text-to-SQL systems more accurately by identifying subtle errors that standard test-based evaluation misses.

This paper introduces SpotIt+, a tool for evaluating Text-to-SQL systems by finding database instances where a generated SQL query differs from the ground truth. It incorporates a constraint-mining pipeline to generate more realistic differentiating databases, which helps uncover discrepancies missed by standard test-based evaluation on the BIRD dataset.

We present SpotIt+, an open-source tool for evaluating Text-to-SQL systems via bounded equivalence verification. Given a generated SQL query and the ground truth, SpotIt+ actively searches for database instances that differentiate the two queries. To ensure that the generated counterexamples reflect practically relevant discrepancies, we introduce a constraint-mining pipeline that combines rule-based specification mining over example databases with LLM-based validation. Experimental results on the BIRD dataset show that the mined constraints enable SpotIt+ to generate more realistic differentiating databases, while preserving its ability to efficiently uncover numerous discrepancies between generated and gold SQL queries that are missed by standard test-based evaluation.
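The idea of searching for a differentiating database within a bound, optionally restricted by constraints, can be illustrated with a minimal sketch. This is not SpotIt+'s implementation: the abstract does not describe the search procedure in detail, so the brute-force enumeration below is only a stand-in. The table `emp`, the helpers `run` and `find_counterexample`, the value domain, and the non-negativity constraint `realistic` are all hypothetical and chosen purely for illustration.

```python
import itertools
import sqlite3


def run(query, rows):
    """Run `query` on a fresh in-memory table emp(age, salary) holding `rows`."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE emp (age INTEGER, salary INTEGER)")
    conn.executemany("INSERT INTO emp VALUES (?, ?)", rows)
    # Sort so that results are compared as bags, ignoring row order.
    result = sorted(conn.execute(query).fetchall())
    conn.close()
    return result


def find_counterexample(gold, generated, domain, max_rows,
                        constraint=lambda rows: True):
    """Enumerate every table with at most `max_rows` rows over `domain`.

    Returns the first table that satisfies `constraint` and on which the
    two queries disagree, or None if they agree within the bound.
    """
    candidates = list(itertools.product(domain, repeat=2))
    for n in range(max_rows + 1):
        for rows in itertools.combinations_with_replacement(candidates, n):
            if not constraint(rows):
                continue
            if run(gold, rows) != run(generated, rows):
                return list(rows)
    return None


# Hypothetical query pair: the generated query uses > where the gold uses >=.
GOLD = "SELECT salary FROM emp WHERE age >= 30"
GENERATED = "SELECT salary FROM emp WHERE age > 30"
DOMAIN = (-1, 30, 200)


def realistic(rows):
    """Hypothetical mined constraint: ages and salaries are non-negative."""
    return all(age >= 0 and salary >= 0 for age, salary in rows)


unconstrained = find_counterexample(GOLD, GENERATED, DOMAIN, max_rows=2)
constrained = find_counterexample(GOLD, GENERATED, DOMAIN, max_rows=2,
                                  constraint=realistic)
print(unconstrained)  # [(30, -1)]: differentiating, but the salary is negative
print(constrained)    # [(30, 30)]: differentiating and plausible
```

In this toy setting both searches expose the same discrepancy, but only the constrained one returns a database that a developer would accept as plausible. A result of None means only that no difference exists within the chosen bound and domain, not that the queries are equivalent in general.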
