CL AI SE | Jan 16, 2025

A Study of In-Context-Learning-Based Text-to-SQL Errors

Jiawei Shen, Chengcheng Wan, Ruoyi Qiao, Jiazhen Zou, Hang Xu, Yuchen Shao, Yueling Zhang, Weikai Miao, Geguang Pu

arXiv:2501.09310v2 | 20.422 citations | h-index: 29 | Has Code

Originality: Incremental advance

AI Analysis

This work addresses correctness and efficiency problems in LLM-based text-to-SQL systems for database querying. Its repair framework is new but represents an incremental improvement over prior methods.

The paper studies errors in text-to-SQL generation that relies on the in-context learning of large language models. It finds such errors to be widespread and proposes MapleRepair, which repairs 13.8% more queries than existing solutions with negligible mis-repairs and 67.4% less overhead.

Large language models (LLMs) have been adopted to perform text-to-SQL tasks, utilizing their in-context learning (ICL) capability to translate natural language questions into structured query language (SQL). However, such a technique faces correctness problems and requires efficient repairing solutions. In this paper, we conduct the first comprehensive study of text-to-SQL errors. Our study covers four representative ICL-based techniques, five basic repairing methods, two benchmarks, and two LLM settings. We find that text-to-SQL errors are widespread and summarize 29 error types in 7 categories. We also find that existing repairing attempts yield limited correctness improvement at the cost of high computational overhead and many mis-repairs. Based on these findings, we propose MapleRepair, a novel text-to-SQL error detection and repairing framework. The evaluation demonstrates that MapleRepair outperforms existing solutions by repairing 13.8% more queries with negligible mis-repairs and 67.4% less overhead.
