Ruiying Geng

CL
h-index20
14papers
4,491citations
Novelty53%
AI Score32

14 Papers

CLAug 29, 2022
A Survey on Text-to-SQL Parsing: Concepts, Methods, and Future Directions

Bowen Qin, Binyuan Hui, Lihan Wang et al.

Text-to-SQL parsing is an essential and challenging task. The goal of text-to-SQL parsing is to convert a natural language (NL) question to its corresponding structured query language (SQL) based on the evidences provided by relational databases. Early text-to-SQL parsing systems from the database community achieved a noticeable progress with the cost of heavy human engineering and user interactions with the systems. In recent years, deep neural networks have significantly advanced this task by neural generation models, which automatically learn a mapping function from an input NL question to an output SQL query. Subsequently, the large pre-trained language models have taken the state-of-the-art of the text-to-SQL parsing task to a new level. In this survey, we present a comprehensive review on deep learning approaches for text-to-SQL parsing. First, we introduce the text-to-SQL parsing corpora which can be categorized as single-turn and multi-turn. Second, we provide a systematical overview of pre-trained language models and existing methods for text-to-SQL parsing. Third, we present readers with the challenges faced by text-to-SQL parsing and explore some potential future directions in this field.

CLMar 14, 2022
S$^2$SQL: Injecting Syntax to Question-Schema Interaction Graph Encoder for Text-to-SQL Parsers

Binyuan Hui, Ruiying Geng, Lihan Wang et al.

The task of converting a natural language question into an executable SQL query, known as text-to-SQL, is an important branch of semantic parsing. The state-of-the-art graph-based encoder has been successfully used in this task but does not model the question syntax well. In this paper, we propose S$^2$SQL, injecting Syntax to question-Schema graph encoder for Text-to-SQL parsers, which effectively leverages the syntactic dependency information of questions in text-to-SQL to improve the performance. We also employ the decoupling constraint to induce diverse relational edge embedding, which further improves the network's performance. Experiments on the Spider and robustness setting Spider-Syn demonstrate that the proposed approach outperforms all existing methods when pre-training models are used, resulting in a performance ranks first on the Spider leaderboard.

CLJun 20, 2023
CATS: A Pragmatic Chinese Answer-to-Sequence Dataset with Large Scale and High Quality

Liang Li, Ruiying Geng, Chengyang Fang et al.

There are three problems existing in the popular data-to-text datasets. First, the large-scale datasets either contain noise or lack real application scenarios. Second, the datasets close to real applications are relatively small in size. Last, current datasets bias in the English language while leaving other languages underexplored. To alleviate these limitations, in this paper, we present CATS, a pragmatic Chinese answer-to-sequence dataset with large scale and high quality. The dataset aims to generate textual descriptions for the answer in the practical TableQA system. Further, to bridge the structural gap between the input SQL and table and establish better semantic alignments, we propose a Unified Graph Transformation approach to establish a joint encoding space for the two hybrid knowledge resources and convert this task to a graph-to-text problem. The experiment results demonstrate the effectiveness of our proposed method. Further analysis on CATS attests to both the high quality and challenges of the dataset.

CLFeb 10, 2023
Plan-then-Seam: Towards Efficient Table-to-Text Generation

Liang Li, Ruiying Geng, Chengyang Fang et al.

Table-to-text generation aims at automatically generating text to help people conveniently obtain salient information in tables. Recent works explicitly decompose the generation process into content planning and surface generation stages, employing two autoregressive networks for them respectively. However, they are computationally expensive due to the non-parallelizable nature of autoregressive decoding and the redundant parameters of two networks. In this paper, we propose the first totally non-autoregressive table-to-text model (Plan-then-Seam, PTS) that produces its outputs in parallel with one single network. PTS firstly writes and calibrates one plan of the content to be generated with a novel rethinking pointer predictor, and then takes the plan as the context for seaming to decode the description. These two steps share parameters and perform iteratively to capture token inter-dependency while keeping parallel decoding. Experiments on two public benchmarks show that PTS achieves 3.0~5.6 times speedup for inference time, reducing 50% parameters, while maintaining as least comparable performance against strong two-stage table-to-text competitors.

CLSep 15, 2022
Graph-to-Text Generation with Dynamic Structure Pruning

Liang Li, Ruiying Geng, Bowen Li et al.

Most graph-to-text works are built on the encoder-decoder framework with cross-attention mechanism. Recent studies have shown that explicitly modeling the input graph structure can significantly improve the performance. However, the vanilla structural encoder cannot capture all specialized information in a single forward pass for all decoding steps, resulting in inaccurate semantic representations. Meanwhile, the input graph is flatted as an unordered sequence in the cross attention, ignoring the original graph structure. As a result, the obtained input graph context vector in the decoder may be flawed. To address these issues, we propose a Structure-Aware Cross-Attention (SACA) mechanism to re-encode the input graph representation conditioning on the newly generated context at each decoding step in a structure aware manner. We further adapt SACA and introduce its variant Dynamic Graph Pruning (DGP) mechanism to dynamically drop irrelevant nodes in the decoding process. We achieve new state-of-the-art results on two graph-to-text datasets, LDC2020T02 and ENT-DESC, with only minor increase on computational cost.

CLJan 2, 2024Code
Unifying Structured Data as Graph for Data-to-Text Pre-Training

Shujie Li, Liang Li, Ruiying Geng et al.

Data-to-text (D2T) generation aims to transform structured data into natural language text. Data-to-text pre-training has proved to be powerful in enhancing D2T generation and yields impressive performances. However, previous pre-training methods either oversimplified structured data into a sequence without considering input structures or designed training objectives tailored for a specific data structure (e.g., table or knowledge graph). In this paper, we unify different types of structured data (i.e., table, key-value data, knowledge graph) into the graph format and cast different data-to-text generation tasks as graph-to-text generation. To effectively exploit the structural information of the input graph, we propose a structure-enhanced pre-training method for D2T generation by designing a structure-enhanced Transformer. Concretely, we devise a position matrix for the Transformer, encoding relative positional information of connected nodes in the input graph. In addition, we propose a new attention matrix to incorporate graph structures into the original Transformer by taking the available explicit connectivity structure into account. Extensive experiments on six benchmark datasets show the effectiveness of our model. Our source codes are available at https://github.com/AlibabaResearch/DAMO-ConvAI/tree/main/unid2t.

CLMay 4, 2023
Can LLM Already Serve as A Database Interface? A BIg Bench for Large-Scale Database Grounded Text-to-SQLs

Jinyang Li, Binyuan Hui, Ge Qu et al.

Text-to-SQL parsing, which aims at converting natural language instructions into executable SQLs, has gained increasing attention in recent years. In particular, Codex and ChatGPT have shown impressive results in this task. However, most of the prevalent benchmarks, i.e., Spider, and WikiSQL, focus on database schema with few rows of database contents leaving the gap between academic study and real-world applications. To mitigate this gap, we present Bird, a big benchmark for large-scale database grounded in text-to-SQL tasks, containing 12,751 pairs of text-to-SQL data and 95 databases with a total size of 33.4 GB, spanning 37 professional domains. Our emphasis on database values highlights the new challenges of dirty database contents, external knowledge between NL questions and database contents, and SQL efficiency, particularly in the context of massive databases. To solve these problems, text-to-SQL models must feature database value comprehension in addition to semantic parsing. The experimental results demonstrate the significance of database values in generating accurate text-to-SQLs for big databases. Furthermore, even the most effective text-to-SQL models, i.e. ChatGPT, only achieves 40.08% in execution accuracy, which is still far from the human result of 92.96%, proving that challenges still stand. Besides, we also provide an efficiency analysis to offer insights into generating text-to-efficient-SQLs that are beneficial to industries. We believe that BIRD will contribute to advancing real-world applications of text-to-SQL research. The leaderboard and source code are available: https://bird-bench.github.io/.

CLNov 18, 2021
Linking-Enhanced Pre-Training for Table Semantic Parsing

Bowen Qin, Lihan Wang, Binyuan Hui et al.

Recently pre-training models have significantly improved the performance of various NLP tasks by leveraging large-scale text corpora to improve the contextual representation ability of the neural network. The large pre-training language model has also been applied in the area of table semantic parsing. However, existing pre-training approaches have not carefully explored explicit interaction relationships between a question and the corresponding database schema, which is a key ingredient for uncovering their semantic and structural correspondence. Furthermore, the question-aware representation learning in the schema grounding context has received less attention in pre-training objective.To alleviate these issues, this paper designs two novel pre-training objectives to impose the desired inductive bias into the learned representations for table pre-training. We further propose a schema-aware curriculum learning approach to mitigate the impact of noise and learn effectively from the pre-training data in an easy-to-hard manner. We evaluate our pre-trained framework by fine-tuning it on two benchmarks, Spider and SQUALL. The results demonstrate the effectiveness of our pre-training objective and curriculum compared to a variety of baselines.

AIApr 27, 2021
Relational Learning with Gated and Attentive Neighbor Aggregator for Few-Shot Knowledge Graph Completion

Guanglin Niu, Yang Li, Chengguang Tang et al.

Aiming at expanding few-shot relations' coverage in knowledge graphs (KGs), few-shot knowledge graph completion (FKGC) has recently gained more research interests. Some existing models employ a few-shot relation's multi-hop neighbor information to enhance its semantic representation. However, noise neighbor information might be amplified when the neighborhood is excessively sparse and no neighbor is available to represent the few-shot relation. Moreover, modeling and inferring complex relations of one-to-many (1-N), many-to-one (N-1), and many-to-many (N-N) by previous knowledge graph completion approaches requires high model complexity and a large amount of training instances. Thus, inferring complex relations in the few-shot scenario is difficult for FKGC models due to limited training instances. In this paper, we propose a few-shot relational learning with global-local framework to address the above issues. At the global stage, a novel gated and attentive neighbor aggregator is built for accurately integrating the semantics of a few-shot relation's neighborhood, which helps filtering the noise neighbors even if a KG contains extremely sparse neighborhoods. For the local stage, a meta-learning based TransH (MTransH) method is designed to model complex relations and train our model in a few-shot learning fashion. Extensive experiments show that our model outperforms the state-of-the-art FKGC approaches on the frequently-used benchmark datasets NELL-One and Wiki-One. Compared with the strong baseline model MetaR, our model achieves 5-shot FKGC performance improvements of 8.0% on NELL-One and 2.8% on Wiki-One by the metric Hits@10.

CLMar 7, 2021
Improving Text-to-SQL with Schema Dependency Learning

Binyuan Hui, Xiang Shi, Ruiying Geng et al.

Text-to-SQL aims to map natural language questions to SQL queries. The sketch-based method combined with execution-guided (EG) decoding strategy has shown a strong performance on the WikiSQL benchmark. However, execution-guided decoding relies on database execution, which significantly slows down the inference process and is hence unsatisfactory for many real-world applications. In this paper, we present the Schema Dependency guided multi-task Text-to-SQL model (SDSQL) to guide the network to effectively capture the interactions between questions and schemas. The proposed model outperforms all existing methods in both the settings with or without EG. We show the schema dependency learning partially cover the benefit from EG and alleviates the need for it. SDSQL without EG significantly reduces time consumption during inference, sacrificing only a small amount of performance and provides more flexibility for downstream applications.

CLJan 5, 2021
Dynamic Hybrid Relation Network for Cross-Domain Context-Dependent Semantic Parsing

Binyuan Hui, Ruiying Geng, Qiyu Ren et al.

Semantic parsing has long been a fundamental problem in natural language processing. Recently, cross-domain context-dependent semantic parsing has become a new focus of research. Central to the problem is the challenge of leveraging contextual information of both natural language utterance and database schemas in the interaction history. In this paper, we present a dynamic graph framework that is capable of effectively modelling contextual utterances, tokens, database schemas, and their complicated interaction as the conversation proceeds. The framework employs a dynamic memory decay mechanism that incorporates inductive bias to integrate enriched contextual relation representation, which is further enhanced with a powerful reranking model. At the time of writing, we demonstrate that the proposed framework outperforms all existing models by large margins, achieving new state-of-the-art performance on two large-scale benchmarks, the SParC and CoSQL datasets. Specifically, the model attains a 55.8% question-match and 30.8% interaction-match accuracy on SParC, and a 46.8% question-match and 17.0% interaction-match accuracy on CoSQL.

CLMay 12, 2020
Dynamic Memory Induction Networks for Few-Shot Text Classification

Ruiying Geng, Binhua Li, Yongbin Li et al.

This paper proposes Dynamic Memory Induction Networks (DMIN) for few-shot text classification. The model utilizes dynamic routing to provide more flexibility to memory-based few-shot learning in order to better adapt the support sets, which is a critical capacity of few-shot classification models. Based on that, we further develop induction models with query information, aiming to enhance the generalization ability of meta-learning. The proposed model achieves new state-of-the-art results on the miniRCV1 and ODIC dataset, improving the best performance (accuracy) by 2~4%. Detailed analysis is further performed to show the effectiveness of each component.

CLOct 21, 2019
Semantic Graph Convolutional Network for Implicit Discourse Relation Classification

Yingxue Zhang, Ping Jian, Fandong Meng et al.

Implicit discourse relation classification is of great importance for discourse parsing, but remains a challenging problem due to the absence of explicit discourse connectives communicating these relations. Modeling the semantic interactions between the two arguments of a relation has proven useful for detecting implicit discourse relations. However, most previous approaches model such semantic interactions from a shallow interactive level, which is inadequate on capturing enough semantic information. In this paper, we propose a novel and effective Semantic Graph Convolutional Network (SGCN) to enhance the modeling of inter-argument semantics on a deeper interaction level for implicit discourse relation classification. We first build an interaction graph over representations of the two arguments, and then automatically extract in-depth semantic interactive information through graph convolution. Experimental results on the English corpus PDTB and the Chinese corpus CDTB both demonstrate the superiority of our model to previous state-of-the-art systems.

CLFeb 27, 2019
Induction Networks for Few-Shot Text Classification

Ruiying Geng, Binhua Li, Yongbin Li et al.

Text classification tends to struggle when data is deficient or when it needs to adapt to unseen classes. In such challenging scenarios, recent studies have used meta-learning to simulate the few-shot task, in which new queries are compared to a small support set at the sample-wise level. However, this sample-wise comparison may be severely disturbed by the various expressions in the same class. Therefore, we should be able to learn a general representation of each class in the support set and then compare it to new queries. In this paper, we propose a novel Induction Network to learn such a generalized class-wise representation, by innovatively leveraging the dynamic routing algorithm in meta-learning. In this way, we find the model is able to induce and generalize better. We evaluate the proposed model on a well-studied sentiment classification dataset (English) and a real-world dialogue intent classification dataset (Chinese). Experiment results show that on both datasets, the proposed model significantly outperforms the existing state-of-the-art approaches, proving the effectiveness of class-wise generalization in few-shot text classification.