CL · Oct 23, 2020

Did You Ask a Good Question? A Cross-Domain Question Intention Classification Benchmark for Text-to-SQL

Yusen Zhang, Xiangyu Dong, Shuaichen Chang, Tao Yu, Peng Shi, Rui Zhang

arXiv:2010.12634v12.423 citations · Has Code

Originality: Synthesis-oriented

AI Analysis

This work addresses the handling of unanswerable user inputs in real-world text-to-SQL applications, an incremental improvement over existing work that assumes every question is legal.

The authors tackle the problem of distinguishing answerable from unanswerable questions in text-to-SQL systems. They propose TriageSQL, the first cross-domain benchmark for this task; a baseline RoBERTa model achieves a 60% F1 score on the test set.

Neural models have achieved significant results on the text-to-SQL task, but most current work assumes that all input questions are legal and generates a SQL query for any input. In real-world scenarios, however, users can enter arbitrary text that may not be answerable by a SQL query. In this work, we propose TriageSQL, the first cross-domain text-to-SQL question intention classification benchmark, which requires models to distinguish four types of unanswerable questions from answerable questions. The baseline RoBERTa model achieves a 60% F1 score on the test set, demonstrating the need for further improvement on this task. Our dataset is available at https://github.com/chatc/TriageSQL.
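To make the reported metric concrete, the following is a minimal sketch of how an F1 score could be computed for a five-way classifier (one answerable class plus four unanswerable types). The label names are hypothetical placeholders, since the abstract does not name the four unanswerable types, and macro-averaging is an assumption: the abstract does not say which averaging scheme the 60% figure uses.

```python
from collections import Counter

# Hypothetical label names: the abstract only says there are four
# unanswerable types plus an answerable class, without naming them.
LABELS = [
    "answerable",
    "unanswerable_1",
    "unanswerable_2",
    "unanswerable_3",
    "unanswerable_4",
]


def per_class_f1(gold, pred, labels=LABELS):
    """Return {label: F1} from parallel lists of gold and predicted labels."""
    tp, fp, fn = Counter(), Counter(), Counter()
    for g, p in zip(gold, pred):
        if g == p:
            tp[g] += 1
        else:
            fp[p] += 1  # p was predicted but is wrong
            fn[g] += 1  # the gold label g was missed
    scores = {}
    for label in labels:
        denom = 2 * tp[label] + fp[label] + fn[label]
        # F1 = 2TP / (2TP + FP + FN); use 0.0 when the class never occurs
        scores[label] = 2 * tp[label] / denom if denom else 0.0
    return scores


def macro_f1(gold, pred, labels=LABELS):
    """Unweighted mean of per-class F1 (assumed averaging scheme)."""
    scores = per_class_f1(gold, pred, labels)
    return sum(scores.values()) / len(labels)


if __name__ == "__main__":
    gold = ["answerable", "answerable", "unanswerable_1", "unanswerable_2"]
    pred = ["answerable", "unanswerable_1", "unanswerable_1", "unanswerable_2"]
    print(per_class_f1(gold, pred))
    print(macro_f1(gold, pred))
```

Macro-averaging weights every class equally, so a model that ignores a rare unanswerable type is penalized even if overall accuracy looks high. If the benchmark instead reports a micro or weighted average, only `macro_f1` would need to change.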
