CL, AI | Dec 17, 2024

Solid-SQL: Enhanced Schema-linking based In-context Learning for Robust Text-to-SQL

arXiv:2412.12522v1 | 28 citations | h-index: 15 | COLING
Originality: Incremental advance
AI Analysis

This paper addresses robustness issues in text-to-SQL systems for database query applications and represents an incremental improvement over existing methods.

The paper tackles robustness in text-to-SQL systems by proposing Solid-SQL, a method designed to integrate with various large language models and to hold its accuracy under adversarial perturbations. It reports state-of-the-art SQL execution accuracy of 82.1% on Spider and 58.9% on Bird, and an average improvement of 11.6% over baselines on the perturbed Spider-Syn, Spider-Realistic, and Dr. Spider benchmarks.

Recently, large language models (LLMs) have significantly improved the performance of text-to-SQL systems. Nevertheless, many state-of-the-art (SOTA) approaches have overlooked the critical aspect of system robustness. Our experiments reveal that while LLM-driven methods excel on standard datasets, their accuracy is notably compromised when faced with adversarial perturbations. To address this challenge, we propose a robust text-to-SQL solution, called Solid-SQL, designed to integrate with various LLMs. We focus on the pre-processing stage, training a robust schema-linking model enhanced by LLM-based data augmentation. Additionally, we design a two-round, structural similarity-based example retrieval strategy for in-context learning. Our method achieves SOTA SQL execution accuracy levels of 82.1% and 58.9% on the general Spider and Bird benchmarks, respectively. Furthermore, experimental results show that Solid-SQL delivers an average improvement of 11.6% compared to baselines on the perturbed Spider-Syn, Spider-Realistic, and Dr. Spider benchmarks.
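The abstract names a two-round, structural similarity-based example retrieval strategy but does not spell out its mechanics. The sketch below is one possible reading, not the authors' implementation: it assumes that "structure" means a skeleton with schema terms and literal values masked, that similarity is Jaccard overlap of skeleton tokens, and that the second round compares against the skeleton of a draft SQL query. All function names, the keyword list, and the `draft_sql_fn` stub (standing in for an LLM call) are hypothetical.

```python
import re

# Small illustrative keyword list (an assumption, not a full SQL grammar).
SQL_KEYWORDS = {"select", "from", "where", "group", "by", "order", "having",
                "count", "avg", "sum", "max", "min", "join", "on", "and",
                "or", "limit", "desc", "asc", "distinct"}

def question_skeleton(question, schema_terms):
    """Mask linked schema terms and numbers so only the phrasing remains."""
    text = question.lower()
    for term in sorted(schema_terms, key=len, reverse=True):
        text = re.sub(r"\b" + re.escape(term.lower()) + r"\b", "<col>", text)
    text = re.sub(r"\b\d+(?:\.\d+)?\b", "<val>", text)
    return re.findall(r"<col>|<val>|[a-z]+", text)

def sql_skeleton(sql):
    """Keep keywords and punctuation; mask identifiers as '_' and literals as '<val>'."""
    sql = re.sub(r"'[^']*'", " <val> ", sql.lower())
    tokens = re.findall(r"<val>|[a-z_][a-z0-9_.]*|\d+|[^\s\w]", sql)
    out = []
    for t in tokens:
        if t == "<val>" or t.isdigit():
            out.append("<val>")
        elif t in SQL_KEYWORDS:
            out.append(t)
        elif t[0].isalpha() or t[0] == "_":
            out.append("_")
        else:
            out.append(t)
    return out

def jaccard(a, b):
    """Set overlap of two token lists (ignores token order)."""
    sa, sb = set(a), set(b)
    return len(sa & sb) / len(sa | sb) if sa | sb else 0.0

def top_k(query_tokens, pool, key, k):
    """Return the k pool entries whose stored skeleton best matches the query."""
    return sorted(pool, key=lambda ex: jaccard(query_tokens, ex[key]), reverse=True)[:k]

def make_example(question, schema_terms, sql):
    """Precompute both skeletons for a pool entry."""
    return {"question": question, "sql": sql,
            "q_skel": question_skeleton(question, schema_terms),
            "sql_skel": sql_skeleton(sql)}

def two_round_retrieve(question, schema_terms, pool, draft_sql_fn, k1=2, k2=2):
    # Round 1: match on the masked question.
    round1 = top_k(question_skeleton(question, schema_terms), pool, "q_skel", k1)
    # Round 2: draft a query from round-1 examples, then match on its SQL skeleton.
    draft = draft_sql_fn(question, round1)
    return top_k(sql_skeleton(draft), pool, "sql_skel", k2)

pool = [
    make_example("How many singers are older than 30?", ["singers"],
                 "SELECT count(*) FROM singer WHERE age > 30"),
    make_example("List the names of all stadiums.", ["names", "stadiums"],
                 "SELECT name FROM stadium"),
    make_example("How many cars are heavier than 3000?", ["cars"],
                 "SELECT count(*) FROM cars WHERE weight > 3000"),
]

# Stub for the LLM draft step: reuse the SQL of the best round-1 example.
picked = two_round_retrieve("How many students are older than 20?", ["students"],
                            pool, lambda q, exs: exs[0]["sql"])
```

In this toy pool, both counting questions share the same masked structure as the query, so they are the ones retrieved, while the listing question is not. In the setting the abstract describes, the `schema_terms` argument would presumably come from the trained schema-linking model; here it is supplied by hand.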

