CL, AI · Dec 17, 2024

Solid-SQL: Enhanced Schema-linking based In-context Learning for Robust Text-to-SQL

Geling Liu, Yunzhi Tan, Ruichao Zhong, Yuanzhen Xie, Lingchen Zhao, Qian Wang, Bo Hu, Zang Li

arXiv:2412.12522v1 · 16.228 citations · h-index: 34 · COLING

Originality: Incremental advance

AI Analysis

This work addresses robustness issues in text-to-SQL systems for database query applications. It represents an incremental improvement over existing methods.

The paper tackles the problem of robustness in text-to-SQL systems by proposing Solid-SQL, which integrates with large language models to improve accuracy against adversarial perturbations, achieving state-of-the-art SQL execution accuracies of 82.1% on Spider and 58.9% on Bird, with an average 11.6% improvement on perturbed benchmarks.

Recently, large language models (LLMs) have significantly improved the performance of text-to-SQL systems. Nevertheless, many state-of-the-art (SOTA) approaches have overlooked the critical aspect of system robustness. Our experiments reveal that while LLM-driven methods excel on standard datasets, their accuracy is notably compromised when faced with adversarial perturbations. To address this challenge, we propose a robust text-to-SQL solution, called Solid-SQL, designed to integrate with various LLMs. We focus on the pre-processing stage, training a robust schema-linking model enhanced by LLM-based data augmentation. Additionally, we design a two-round, structural similarity-based example retrieval strategy for in-context learning. Our method achieves SOTA SQL execution accuracy levels of 82.1% and 58.9% on the general Spider and Bird benchmarks, respectively. Furthermore, experimental results show that Solid-SQL delivers an average improvement of 11.6% compared to baselines on the perturbed Spider-Syn, Spider-Realistic, and Dr. Spider benchmarks.
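The execution accuracy figures quoted above compare what a predicted query returns against what the reference query returns on the same database. The following is a minimal sketch of that idea, not the official Spider or Bird evaluation script: the `singer` table, the sample query pairs, and the helper names `execution_match` and `execution_accuracy` are all assumptions made for illustration, and rows are compared as unordered multisets, which may differ from how the benchmarks treat ordering.

```python
import sqlite3
from collections import Counter


def execution_match(conn, predicted_sql, gold_sql):
    """Return True if both queries yield the same multiset of rows."""
    try:
        predicted_rows = conn.execute(predicted_sql).fetchall()
    except sqlite3.Error:
        # A predicted query that fails to execute counts as incorrect.
        return False
    gold_rows = conn.execute(gold_sql).fetchall()
    # Counter ignores row order but keeps duplicate rows.
    return Counter(predicted_rows) == Counter(gold_rows)


def execution_accuracy(conn, pairs):
    """Fraction of (predicted, gold) pairs whose results match."""
    matches = sum(execution_match(conn, pred, gold) for pred, gold in pairs)
    return matches / len(pairs)


# Hypothetical toy database, used only for this illustration.
conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE singer (id INTEGER PRIMARY KEY, name TEXT, age INTEGER);
    INSERT INTO singer VALUES (1, 'Ana', 31), (2, 'Bo', 45), (3, 'Cy', 27);
    """
)

pairs = [
    # Different SQL text, same result rows: counted as correct.
    ("SELECT name FROM singer WHERE age > 30",
     "SELECT name FROM singer WHERE age > 30 ORDER BY id"),
    # Both run, but the results differ: counted as incorrect.
    ("SELECT count(*) FROM singer",
     "SELECT count(*) FROM singer WHERE age > 30"),
    # Misspelled column, so the predicted query fails: counted as incorrect.
    ("SELECT nme FROM singer",
     "SELECT name FROM singer"),
]

accuracy = execution_accuracy(conn, pairs)
```

Scoring by result rather than by SQL text means a prediction can be worded differently from the reference and still count as correct, which is why the first pair above passes.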
