CL, DB
Aug 8, 2025

Confidence Estimation for Text-to-SQL in Large Language Models

arXiv:2508.14056v1
2 citations
h-index: 7
Originality: Incremental advance
AI Analysis

This paper addresses how to assess the reliability of SQL queries that LLMs generate for database applications. The advance is incremental, since it builds on existing confidence estimation techniques.

The paper tackles confidence estimation for text-to-SQL in large language models, where access to model internals is often limited. It finds that consistency-based methods perform best in the black-box setting, that SQL-syntax-aware approaches are advantageous in the white-box setting, and that execution-based grounding improves both.

Confidence estimation for text-to-SQL aims to assess the reliability of model-generated SQL queries without having access to gold answers. We study this problem in the context of large language models (LLMs), where access to model weights and gradients is often constrained. We explore both black-box and white-box confidence estimation strategies, evaluating their effectiveness on cross-domain text-to-SQL benchmarks. Our evaluation highlights the superior performance of consistency-based methods among black-box models and the advantage of SQL-syntax-aware approaches for interpreting LLM logits in white-box settings. Furthermore, we show that execution-based grounding of queries provides a valuable supplementary signal, improving the effectiveness of both approaches.
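
To make the black-box idea concrete, here is a minimal sketch of a consistency-based confidence score grounded in query execution. It is an illustration under stated assumptions, not the paper's implementation: the helper names `execute` and `consistency_confidence`, the toy `singer` table, and the hand-written candidate queries (standing in for several samples drawn from an LLM) are all hypothetical. Candidates are grouped by their execution result on a SQLite database, and the confidence is the share of samples that agree with the most common result.

```python
import sqlite3
from collections import Counter


def execute(conn, sql):
    """Run a query; return an order-insensitive, hashable result, or None on error."""
    try:
        rows = conn.execute(sql).fetchall()
    except sqlite3.Error:
        return None
    return tuple(sorted(rows, key=repr))


def consistency_confidence(conn, candidates):
    """Return (chosen_sql, confidence) by majority vote over execution results.

    Confidence is the fraction of all sampled candidates whose result matches
    the most common result; candidates that fail to execute count against it.
    """
    results = [(sql, execute(conn, sql)) for sql in candidates]
    valid = [(sql, res) for sql, res in results if res is not None]
    if not valid:
        return None, 0.0
    top_result, votes = Counter(res for _, res in valid).most_common(1)[0]
    chosen = next(sql for sql, res in valid if res == top_result)
    return chosen, votes / len(candidates)


# Toy database and hand-written "samples" standing in for LLM outputs (assumed).
conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE singer (id INTEGER PRIMARY KEY, name TEXT, age INTEGER);
    INSERT INTO singer VALUES (1, 'Ana', 31), (2, 'Bo', 45), (3, 'Cy', 28);
    """
)
samples = [
    "SELECT name FROM singer WHERE age > 30",
    "SELECT name FROM singer WHERE age >= 31",
    "SELECT name FROM singer WHERE age > 40",
    "SELECT nam FROM singer WHERE age > 30",  # invalid column, fails to execute
]
best_sql, confidence = consistency_confidence(conn, samples)
print(best_sql, confidence)
```

Grouping by execution result rather than by query string lets syntactically different but equivalent queries vote together, which is one plausible way to combine consistency with execution-based grounding. A real system would also need to handle large result sets, timeouts, and queries that modify data.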
