CL · Jun 28, 2022

Proton: Probing Schema Linking Information from Pre-trained Language Models for Text-to-SQL Parsing

arXiv:2206.14017v2 · 56 citations · h-index: 41 · Has Code
Originality: Highly original
AI Analysis

This paper addresses the challenge of generalizing text-to-SQL parsers to new databases by improving schema linking, a step that database query systems depend on. The contribution is an incremental advance over existing graph-based methods.

The paper tackles schema linking for text-to-SQL parsing with a framework that probes relational structures from pre-trained language models using Poincaré distance. The probed relations capture semantic correspondences between question mentions and schema items even when their surface forms differ. The authors report state-of-the-art performance on three benchmarks, using a probing procedure that is unsupervised and adds no parameters.

The importance of building text-to-SQL parsers which can be applied to new databases has long been acknowledged, and a critical step to achieve this goal is schema linking, i.e., properly recognizing mentions of unseen columns or tables when generating SQLs. In this work, we propose a novel framework to elicit relational structures from large-scale pre-trained language models (PLMs) via a probing procedure based on Poincaré distance metric, and use the induced relations to augment current graph-based parsers for better schema linking. Compared with commonly-used rule-based methods for schema linking, we found that probing relations can robustly capture semantic correspondences, even when surface forms of mentions and entities differ. Moreover, our probing procedure is entirely unsupervised and requires no additional parameters. Extensive experiments show that our framework sets new state-of-the-art performance on three benchmarks. We empirically verify that our probing procedure can indeed find desired relational structures through qualitative analysis. Our code can be found at https://github.com/AlibabaResearch/DAMO-ConvAI.
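To make the probing idea concrete, here is a minimal Python sketch. The Poincaré distance formula is standard. Everything else is an assumption made for illustration and is not taken from the abstract: the masking-based probe, the function names `probe_relations` and `toy_embed`, and the `[MASK]` token are hypothetical. In practice the `embed` callable would wrap a PLM and map its schema-item representations into the unit ball.

```python
import math


def poincare_distance(u, v, eps=1e-9):
    """Poincaré distance between two points strictly inside the unit ball."""
    sq_u = sum(x * x for x in u)
    sq_v = sum(x * x for x in v)
    if sq_u >= 1.0 or sq_v >= 1.0:
        raise ValueError("points must lie strictly inside the unit ball")
    sq_diff = sum((a - b) ** 2 for a, b in zip(u, v))
    # Guard against division by a vanishing denominator near the boundary.
    denom = max((1.0 - sq_u) * (1.0 - sq_v), eps)
    return math.acosh(1.0 + 2.0 * sq_diff / denom)


def probe_relations(embed, question_tokens, schema_items, mask_token="[MASK]"):
    """Assumed probe: mask one question token at a time and measure how far
    each schema item's embedding moves. Returns scores[token][item]."""
    base = embed(question_tokens, schema_items)
    scores = []
    for i in range(len(question_tokens)):
        masked = question_tokens[:i] + [mask_token] + question_tokens[i + 1:]
        moved = embed(masked, schema_items)
        scores.append([poincare_distance(b, m) for b, m in zip(base, moved)])
    return scores


def toy_embed(tokens, items):
    """Hypothetical stand-in for a PLM encoder: a schema item moves away
    from the origin only when its name appears among the question tokens."""
    return [[0.5 if item in tokens else 0.0, 0.0] for item in items]


tokens = ["show", "singer", "names"]
items = ["singer", "song"]
scores = probe_relations(toy_embed, tokens, items)
```

With this toy encoder, only the pair ("singer" token, "singer" table) receives a non-zero score, since masking any other token leaves both item embeddings unchanged. A real encoder would yield graded scores, and a threshold or ranking step (not shown) would turn them into relations for a graph-based parser.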

Code Implementations: 2 repos