CL AI DB

Nov 24, 2025

Skeletons Matter: Dynamic Data Augmentation for Text-to-Query

Yuchen Ji, Bo Xu, Jie Shi, Jiaqing Liang, Deqing Yang, Yu Mao, Hai Chen, Yanghua Xiao

arXiv:2511.18934v1

1 citation

Has Code

Originality: Highly original

AI Analysis

This work addresses the problem of unifying semantic parsing tasks for researchers and practitioners, though it is incremental as it builds on existing LLM advancements.

The paper tackles the limited generalizability of text-to-query methods across different query languages by proposing a dynamic data augmentation framework that diagnoses model weaknesses in handling query skeletons, achieving state-of-the-art performance on four benchmarks with only a small amount of synthesized data.

The task of translating natural language questions into query languages has long been a central focus in semantic parsing. Recent advancements in Large Language Models (LLMs) have significantly accelerated progress in this field. However, existing studies typically focus on a single query language, resulting in methods with limited generalizability across different languages. In this paper, we formally define the Text-to-Query task paradigm, unifying semantic parsing tasks across various query languages. We identify query skeletons as a shared optimization target of Text-to-Query tasks, and propose a general dynamic data augmentation framework that explicitly diagnoses model-specific weaknesses in handling these skeletons to synthesize targeted training data. Experiments on four Text-to-Query benchmarks demonstrate that our method achieves state-of-the-art performance using only a small amount of synthesized data, highlighting the efficiency and generality of our approach and laying a solid foundation for unified research on Text-to-Query tasks. We release our code at https://github.com/jjjycaptain/Skeletron.
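To make the idea of a query skeleton concrete, here is a minimal sketch for SQL only. It is not the authors' implementation: the abstract does not specify how skeletons are extracted or how weaknesses are diagnosed, so the keyword list, the `_` placeholder, the helper names `sql_skeleton` and `weak_skeletons`, and the error-rate threshold are all illustrative assumptions. The sketch masks schema items and literals so that queries with the same structure share one skeleton, then groups evaluation results by skeleton to find the structures a model most often gets wrong.

```python
import re
from collections import defaultdict

# Assumed, deliberately small keyword list; a real system would use a SQL parser.
SQL_KEYWORDS = {
    "select", "from", "where", "group", "by", "order", "having", "limit",
    "join", "on", "as", "and", "or", "not", "in", "like", "between",
    "distinct", "count", "sum", "avg", "min", "max", "asc", "desc",
    "union", "intersect", "except",
}

# Quoted strings, numbers, identifiers (optionally dotted), or single punctuation marks.
TOKEN_RE = re.compile(r"'[^']*'|\"[^\"]*\"|\d+(?:\.\d+)?|[A-Za-z_][\w.]*|[^\s\w]")


def sql_skeleton(query):
    """Keep keywords and operators; replace identifiers and literals with '_'."""
    out = []
    for tok in TOKEN_RE.findall(query):
        if tok.lower() in SQL_KEYWORDS:
            out.append(tok.upper())
        elif tok[0] in "'\"_" or tok[0].isalnum():
            out.append("_")  # table/column name, string, or number
        else:
            out.append(tok)  # operator or punctuation
    return " ".join(out)


def weak_skeletons(results, threshold=0.5):
    """Map each skeleton to its error rate, keeping those at or above threshold.

    `results` is an iterable of (gold_query, model_was_correct) pairs.
    """
    totals = defaultdict(int)
    errors = defaultdict(int)
    for gold_query, correct in results:
        skeleton = sql_skeleton(gold_query)
        totals[skeleton] += 1
        if not correct:
            errors[skeleton] += 1
    rates = {s: errors[s] / totals[s] for s in totals}
    return {s: r for s, r in rates.items() if r >= threshold}


results = [
    ("SELECT name FROM singer WHERE age > 30", True),
    ("select title from album where year > 1999", False),
    ("SELECT count(*) FROM singer", True),
]
weak = weak_skeletons(results)
```

Under these assumptions, the skeletons flagged in `weak` would be the natural targets for synthesizing additional training examples; other query languages would need their own tokenizer and keyword set.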
