CL | Sep 19, 2023

Enhancing Open-Domain Table Question Answering via Syntax- and Structure-aware Dense Retrieval

Nengzheng Jin, Dongfang Li, Junying Chen, Joanna Siebert, Qingcai Chen

arXiv:2309.10506v1 | 20.7125 citations | h-index: 42 | Has Code

Originality: Incremental advance

AI Analysis

This work addresses retrieval challenges in table-based QA, which matter to users who need accurate data extraction, and represents an incremental improvement over existing methods.

The paper tackles the problem of information loss in open-domain table question answering by proposing a syntax- and structure-aware dense retrieval method, achieving state-of-the-art results on the NQ-tables dataset and outperforming baselines on a new Text-to-SQL dataset.

Open-domain table question answering aims to answer a question by retrieving and extracting information from a large collection of tables. Existing studies of open-domain table QA either directly adopt text retrieval methods or consider the table structure only in the encoding layer for table retrieval, which may cause syntactical and structural information loss during table scoring. To address this issue, we propose a syntax- and structure-aware retrieval method for the open-domain table QA task. It provides syntactical representations for the question and uses structural header and value representations for the tables to avoid the loss of fine-grained syntactical and structural information. Then, a syntactical-to-structural aggregator computes the matching score between the question and a candidate table by mimicking the human retrieval process. Experimental results show that our method achieves state-of-the-art results on the NQ-tables dataset and substantially outperforms strong baselines on a newly curated open-domain Text-to-SQL dataset.
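
The abstract does not spell out how the aggregator works, so the following is only a rough sketch of the general idea of scoring a table from fine-grained pieces rather than from one pooled vector. It assumes a max-similarity aggregation in the style of late-interaction retrieval, which is a stand-in and not necessarily the authors' aggregator. The names `table_score` and `rank_tables`, the toy 3-dimensional vectors, and the `headers`/`values` dictionary layout are all hypothetical.

```python
import math


def dot(u, v):
    """Dot product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def cosine(u, v):
    """Cosine similarity; returns 0.0 if either vector is all zeros."""
    nu, nv = math.sqrt(dot(u, u)), math.sqrt(dot(v, v))
    if nu == 0.0 or nv == 0.0:
        return 0.0
    return dot(u, v) / (nu * nv)


def table_score(question_units, header_vecs, value_vecs):
    """Assumed aggregation: each question unit (e.g. a phrase vector)
    picks its best-matching header or value vector, and the per-unit
    maxima are averaged into one question-table score."""
    cells = list(header_vecs) + list(value_vecs)
    if not question_units or not cells:
        return 0.0
    best = [max(cosine(q, c) for c in cells) for q in question_units]
    return sum(best) / len(best)


def rank_tables(question_units, tables):
    """Score every candidate table and sort from best to worst."""
    scored = [
        (name, table_score(question_units, t["headers"], t["values"]))
        for name, t in tables.items()
    ]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


# Toy data: two question units and two candidate tables (hypothetical).
question_units = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]]
tables = {
    "films": {"headers": [[1.0, 0.0, 0.0]], "values": [[0.0, 1.0, 0.0]]},
    "cities": {"headers": [[0.0, 0.0, 1.0]], "values": [[0.0, 0.0, 1.0]]},
}
ranking = rank_tables(question_units, tables)
print(ranking)
```

In this toy setup one question unit aligns with a header of the `films` table and the other with one of its values, so that table ranks first. A single pooled table vector would blur the header match and the value match together, which is the kind of information loss the abstract describes.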
