CL · Dec 17, 2022

Know What I don't Know: Handling Ambiguous and Unanswerable Questions for Text-to-SQL

Bing Wang, Yan Gao, Zhoujun Li, Jian-Guang Lou

arXiv:2212.08902v21.612 citations · h-index: 37 · Has Code

Originality: Incremental advance

AI Analysis

This work addresses a critical limitation of text-to-SQL systems for database users, though it is incremental because it builds on existing parsing methods.

The paper addresses the failure of text-to-SQL parsers on ambiguous and unanswerable user questions. It proposes a counterfactual example generation approach and a DTE (Detecting-Then-Explaining) model for error detection, localization, and explanation; the model achieves the best results among the compared baselines on both real-world and generated examples.

The task of text-to-SQL aims to convert a natural language question into its corresponding SQL query within the context of relational tables. Existing text-to-SQL parsers generate a "plausible" SQL query for an arbitrary user question, thereby failing to correctly handle problematic user questions. To formalize this problem, we conduct a preliminary study on the observed ambiguous and unanswerable cases in text-to-SQL and summarize them into 6 feature categories. Correspondingly, we identify the causes behind each category and propose requirements for handling ambiguous and unanswerable questions. Following this study, we propose a simple yet effective counterfactual example generation approach that automatically produces ambiguous and unanswerable text-to-SQL examples. Furthermore, we propose a weakly supervised DTE (Detecting-Then-Explaining) model for error detection, localization, and explanation. Experimental results show that our model achieves the best result on both real-world examples and generated examples compared with various baselines. We release our data and code at: https://github.com/wbbeyourself/DTE.
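To make the idea of counterfactual example generation concrete, here is a minimal sketch of one way such examples could be derived from an ordinary answerable example by editing the table schema. It is an illustration under stated assumptions, not the authors' implementation: the `Example` class, the functions `make_unanswerable` and `make_ambiguous`, the sample table, and the two edit rules (drop the referenced column, or add a near-duplicate column) are all hypothetical and may differ from what the released code does.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass(frozen=True)
class Example:
    """Hypothetical container for one text-to-SQL example."""
    question: str
    columns: Tuple[str, ...]      # column names of the single table
    sql: Optional[str]            # None when no valid SQL exists
    label: str                    # "answerable", "unanswerable", or "ambiguous"
    span: Optional[str] = None    # question span responsible for the problem


def make_unanswerable(ex: Example, column: str, mention: str) -> Example:
    """Assumed rule: drop the column the question refers to,
    so the question can no longer be grounded in the schema."""
    if column not in ex.columns:
        raise ValueError(f"unknown column: {column}")
    remaining = tuple(c for c in ex.columns if c != column)
    return Example(ex.question, remaining, None, "unanswerable", mention)


def make_ambiguous(ex: Example, column: str, sibling: str, mention: str) -> Example:
    """Assumed rule: add a second column that the same mention could
    plausibly refer to, so the question has more than one reading."""
    idx = ex.columns.index(column)  # raises ValueError if the column is absent
    widened = ex.columns[: idx + 1] + (sibling,) + ex.columns[idx + 1:]
    return Example(ex.question, widened, None, "ambiguous", mention)


# Usage with a made-up table and question.
base = Example(
    question="What is the price of each car?",
    columns=("name", "price", "year"),
    sql="SELECT name, price FROM cars",
    label="answerable",
)
unanswerable = make_unanswerable(base, column="price", mention="price")
ambiguous = make_ambiguous(base, column="price", sibling="sale_price", mention="price")
```

Recording the responsible question span alongside the label is what would let a detector be trained to localize the problem as well as flag it; how the paper obtains that supervision is described only as "weakly supervised" in the abstract.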

