Natural Language Interfaces for Tabular Data Querying and Visualization: A Survey
It addresses the problem of making data interaction more intuitive for users by summarizing existing research, but it is incremental as it is a survey rather than new findings.
This survey provides a comprehensive overview of natural language interfaces for querying and visualizing tabular data, focusing on semantic parsing and recent advancements in Text-to-SQL and Text-to-Vis problems, including the impact of large language models.
The emergence of natural language processing has revolutionized the way users interact with tabular data, enabling a shift from traditional query languages and manual plotting to more intuitive, language-based interfaces. The rise of large language models (LLMs) such as ChatGPT and its successors has further advanced this field, opening new avenues for natural language processing techniques. This survey presents a comprehensive overview of natural language interfaces for tabular data querying and visualization, which allow users to interact with data using natural language queries. We introduce the fundamental concepts and techniques underlying these interfaces with a particular emphasis on semantic parsing, the key technology facilitating the translation from natural language to SQL queries or data visualization commands. We then delve into the recent advancements in Text-to-SQL and Text-to-Vis problems from the perspectives of datasets, methodologies, metrics, and system designs. This includes a deep dive into the influence of LLMs, highlighting their strengths, limitations, and potential for future improvements. Through this survey, we aim to provide a roadmap for researchers and practitioners interested in developing and applying natural language interfaces for data interaction in the era of large language models.