CL · Feb 17, 2025

SAFE-SQL: Self-Augmented In-Context Learning with Fine-grained Example Selection for Text-to-SQL

Jimin Lee, Ingeol Baek, Byeongjeong Kim, Hyunkyung Bae, Hwanhee Lee

arXiv:2502.11438v3 · 6 citations · h-index: 4 · EMNLP

Originality: Incremental advance

AI Analysis

SAFE-SQL addresses a bottleneck in text-to-SQL for practical applications where example retrieval fails, offering an incremental improvement.

The paper tackles the problem of text-to-SQL conversion in real-world scenarios where similar training examples are unavailable, proposing SAFE-SQL to generate and filter self-augmented examples, which achieves higher execution accuracy than previous zero-shot and few-shot frameworks.

Text-to-SQL aims to convert natural language questions into executable SQL queries. While previous approaches, such as skeleton-masked selection, have demonstrated strong performance by retrieving similar training examples to guide large language models (LLMs), they struggle in real-world scenarios where such examples are unavailable. To overcome this limitation, we propose Self-Augmentation in-context learning with Fine-grained Example selection for Text-to-SQL (SAFE-SQL), a novel framework that improves SQL generation by generating and filtering self-augmented examples. SAFE-SQL first prompts an LLM to generate multiple Text-to-SQL examples relevant to the test input. Then SAFE-SQL filters these examples through three relevance assessments, constructing high-quality in-context learning examples. Using self-generated examples, SAFE-SQL surpasses previous zero-shot and few-shot Text-to-SQL frameworks, achieving higher execution accuracy. Notably, our approach provides additional performance gains in extra hard and unseen scenarios, where conventional methods often fail.
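The generate-then-filter flow described in the abstract can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the abstract does not name the three relevance assessments or say how they are combined, so the three scorers below (`token_overlap`, `looks_like_select`, `has_from_clause`), the mean-score rule, and the `threshold` value are all hypothetical placeholders. The hard-coded `candidates` list stands in for the examples an LLM would generate.

```python
from dataclasses import dataclass
from typing import Callable, List, Sequence


@dataclass(frozen=True)
class Example:
    question: str
    sql: str


# A scorer maps (test question, candidate example) to a score in [0, 1].
Scorer = Callable[[str, Example], float]


def token_overlap(test_question: str, ex: Example) -> float:
    """Placeholder scorer: Jaccard overlap of lowercase question tokens."""
    a = set(test_question.lower().split())
    b = set(ex.question.lower().split())
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def looks_like_select(test_question: str, ex: Example) -> float:
    """Placeholder scorer: 1.0 if the SQL starts with SELECT, else 0.0."""
    return 1.0 if ex.sql.strip().upper().startswith("SELECT") else 0.0


def has_from_clause(test_question: str, ex: Example) -> float:
    """Placeholder scorer: 1.0 if the SQL contains a FROM keyword, else 0.0."""
    return 1.0 if " FROM " in f" {ex.sql.upper()} " else 0.0


def filter_examples(
    test_question: str,
    candidates: Sequence[Example],
    scorers: Sequence[Scorer],
    threshold: float = 0.5,  # assumed cutoff, not taken from the paper
) -> List[Example]:
    """Keep candidates whose mean score across all scorers meets the threshold."""
    kept = []
    for ex in candidates:
        scores = [score(test_question, ex) for score in scorers]
        if sum(scores) / len(scores) >= threshold:
            kept.append(ex)
    return kept


def build_prompt(test_question: str, examples: Sequence[Example]) -> str:
    """Assemble a few-shot prompt from the kept examples plus the test question."""
    parts = [f"Question: {ex.question}\nSQL: {ex.sql}" for ex in examples]
    parts.append(f"Question: {test_question}\nSQL:")
    return "\n\n".join(parts)


# Stand-in for LLM-generated candidates (hard-coded for illustration).
test_question = "how many singers are from France"
candidates = [
    Example("how many singers are there", "SELECT COUNT(*) FROM singer"),
    Example("list all stadium names", "DROP TABLE stadium"),
]
scorers = [token_overlap, looks_like_select, has_from_clause]

kept = filter_examples(test_question, candidates, scorers)
prompt = build_prompt(test_question, kept)
print(prompt)
```

Passing the scorers as plain callables keeps the filter independent of how each assessment is computed, so an LLM-based judge could presumably replace any of the placeholders without changing `filter_examples`.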
