LG Feb 22, 2024

OpenTab: Advancing Large Language Models as Open-domain Table Reasoners

Kezhi Kong, Jiani Zhang, Zhengyuan Shen, Balasubramaniam Srinivasan, Chuan Lei, Christos Faloutsos, Huzefa Rangwala, George Karypis

arXiv:2402.14361v2 25.443 citations h-index: 99 Has Code ICLR

Originality: Incremental advance

AI Analysis

This paper addresses the challenge of enabling LLMs to reason over open-domain table data. It is an incremental improvement over existing retrieval-based methods.

The paper tackles the difficulty large language models have with structured table data by proposing OpenTab, a framework that retrieves relevant tables and generates SQL programs to parse them efficiently. It reports up to 21.5% higher accuracy than baselines across open- and closed-domain settings.

Large Language Models (LLMs) trained on large volumes of data excel at various natural language tasks, but they cannot handle tasks requiring knowledge they were not trained on. One solution is to use a retriever that fetches relevant information to expand the LLM's knowledge scope. However, existing text-oriented retrieval-based LLMs are not ideal for structured table data because of diverse data modalities and large table sizes. In this work, we propose OpenTab, an open-domain table reasoning framework powered by LLMs. OpenTab uses a table retriever to fetch relevant tables and then generates SQL programs to parse the retrieved tables efficiently. Using the intermediate data derived from the SQL executions, it conducts grounded inference to produce accurate responses. Extensive experimental evaluation shows that OpenTab significantly outperforms baselines in both open- and closed-domain settings, achieving up to 21.5% higher accuracy. We further run ablation studies to validate the efficacy of the proposed system designs.
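The retrieve, generate SQL, execute, then infer flow described in the abstract can be illustrated with a toy sketch. This is not the authors' implementation: the table corpus, the word-overlap retriever, and every function name here (`retrieve_table`, `generate_sql`, `execute_sql`, `answer`) are assumptions made up for illustration, and the SQL generation step is a hard-coded stand-in for what would be an LLM call in a real system.

```python
import sqlite3

# Toy corpus: table name -> (column names, rows). All data is made up.
TABLES = {
    "medals": (["country", "gold"],
               [("Norway", 16), ("Germany", 12), ("China", 9)]),
    "cities": (["city", "population"],
               [("Oslo", 700000), ("Berlin", 3600000)]),
}


def tokenize(text):
    """Lowercase the text and split it into a set of words."""
    return set(text.lower().replace("?", " ").replace(",", " ").split())


def retrieve_table(question):
    """Pick the table whose name, header, and cells share the most words
    with the question (a crude stand-in for a real table retriever)."""
    q_words = tokenize(question)

    def score(name):
        cols, rows = TABLES[name]
        words = {name} | {c.lower() for c in cols}
        words |= {str(v).lower() for row in rows for v in row}
        return len(q_words & words)

    return max(TABLES, key=score)


def generate_sql(question, table):
    """Hypothetical placeholder for LLM-based SQL generation.
    It returns a fixed query that only suits the demo question below."""
    return f"SELECT country, gold FROM {table} ORDER BY gold DESC LIMIT 1"


def execute_sql(table, sql):
    """Load the retrieved table into in-memory SQLite and run the query."""
    cols, rows = TABLES[table]
    conn = sqlite3.connect(":memory:")
    conn.execute(f"CREATE TABLE {table} ({', '.join(cols)})")
    placeholders = ", ".join("?" for _ in cols)
    conn.executemany(f"INSERT INTO {table} VALUES ({placeholders})", rows)
    result = conn.execute(sql).fetchall()
    conn.close()
    return result


def answer(question):
    """Run the pipeline: retrieve, generate SQL, execute, read off an answer."""
    table = retrieve_table(question)
    sql = generate_sql(question, table)
    rows = execute_sql(table, sql)
    # A real system would pass `rows` back to an LLM for grounded inference;
    # this sketch simply takes the first cell of the first result row.
    return table, rows, rows[0][0]


table, rows, final = answer("Which country won the most gold medals?")
print(final)  # Norway
```

The point of the intermediate SQL step is that only the small query result, rather than the full table, needs to reach the final reasoning stage, which is how the abstract's concern about large table sizes is addressed in this sketch.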
