OpenTab: Advancing Large Language Models as Open-domain Table Reasoners
This work addresses the challenge of enabling LLMs to reason over open-domain table data, offering an incremental improvement over existing retrieval-based methods.
The paper tackles the problem that large language models struggle with structured table data. It proposes OpenTab, a framework that retrieves relevant tables and generates SQL programs to parse them efficiently, achieving up to 21.5% higher accuracy than baselines in open- and closed-domain settings.
Large Language Models (LLMs) trained on large volumes of data excel at various natural language tasks, but they cannot handle tasks requiring knowledge they were not trained on. One solution is to use a retriever that fetches relevant information to expand the LLM's knowledge scope. However, existing text-oriented retrieval-based LLMs are not ideal for structured table data because of diversified data modalities and large table sizes. In this work, we propose OpenTab, an open-domain table reasoning framework powered by LLMs. OpenTab uses a table retriever to fetch relevant tables and then generates SQL programs to parse the retrieved tables efficiently. Using the intermediate data derived from the SQL executions, it conducts grounded inference to produce accurate responses. Extensive experimental evaluation shows that OpenTab significantly outperforms baselines in both open- and closed-domain settings, achieving up to 21.5% higher accuracy. We further run ablation studies to validate the efficacy of the proposed system designs.
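The retrieve, generate SQL, execute, then infer flow described in the abstract can be illustrated with a minimal sketch. This is not the paper's implementation: the keyword-overlap retriever is a toy stand-in for a real table retriever, the two `stub_*` functions are hypothetical placeholders for LLM calls, and the table corpus is invented purely for illustration. All function and variable names here are assumptions.

```python
import re
import sqlite3


def tokenize(text):
    """Lowercase the text and return its set of alphanumeric tokens."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def retrieve_tables(question, corpus, k=1):
    """Toy retriever: rank tables by keyword overlap with the question.

    A real system would use a proper sparse or dense retriever instead.
    """
    q_tokens = tokenize(question)

    def score(table):
        table_text = table["name"] + " " + " ".join(table["columns"])
        return len(q_tokens & tokenize(table_text))

    return sorted(corpus, key=score, reverse=True)[:k]


def load_table(conn, table):
    """Materialize one table in SQLite so that generated SQL can run on it."""
    name = table["name"]
    cols = ", ".join(f'"{c}"' for c in table["columns"])
    marks = ", ".join("?" for _ in table["columns"])
    conn.execute(f'CREATE TABLE "{name}" ({cols})')
    conn.executemany(f'INSERT INTO "{name}" VALUES ({marks})', table["rows"])


def run_pipeline(question, corpus, generate_sql, read_answer):
    """Retrieve a table, run generated SQL on it, answer from the result rows."""
    table = retrieve_tables(question, corpus, k=1)[0]
    conn = sqlite3.connect(":memory:")
    try:
        load_table(conn, table)
        sql = generate_sql(question, table)   # an LLM call in a real system
        rows = conn.execute(sql).fetchall()   # compact intermediate data
    finally:
        conn.close()
    return read_answer(question, rows)        # grounded in the SQL output


# Hypothetical stand-ins for the two LLM steps (hard-coded for this example).
def stub_generate_sql(question, table):
    return "SELECT country FROM medals ORDER BY gold DESC LIMIT 1"


def stub_read_answer(question, rows):
    return str(rows[0][0]) if rows else "unknown"


# Invented toy corpus, for illustration only.
CORPUS = [
    {"name": "medals", "columns": ["country", "gold"],
     "rows": [("Norway", 16), ("Germany", 12), ("China", 9)]},
    {"name": "cities", "columns": ["city", "population"],
     "rows": [("Oslo", 700000), ("Berlin", 3700000)]},
]
QUESTION = "Which country won the most gold medals?"
```

Calling `run_pipeline(QUESTION, CORPUS, stub_generate_sql, stub_read_answer)` selects the `medals` table and answers from the single row the query returns. The point of the design is that the final inference step sees only the small SQL result rather than the full table, which is one way to cope with tables too large to place in a prompt.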