CL · Dec 3, 2025

SQuARE: Structured Query & Adaptive Retrieval Engine For Tabular Formats

arXiv:2512.04292v1 · 1 citation · h-index: 4
Originality: Highly original
AI Analysis

SQuARE addresses reliable spreadsheet querying over messy real-world data. It is a novel method for a known bottleneck rather than a foundational breakthrough.

The paper tackles accurate question answering over real spreadsheets with complex structures such as multirow headers and merged cells. It presents SQuARE, a hybrid retrieval framework that consistently surpasses single-strategy baselines and ChatGPT-4o on both retrieval precision and end-to-end answer accuracy while keeping latency predictable.

Accurate question answering over real spreadsheets remains difficult due to multirow headers, merged cells, and unit annotations that disrupt naive chunking, while rigid SQL views fail on files lacking consistent schemas. We present SQuARE, a hybrid retrieval framework with sheet-level, complexity-aware routing. It computes a continuous score based on header depth and merge density, then routes queries either through structure-preserving chunk retrieval or SQL over an automatically constructed relational representation. When confidence is low, a lightweight agent supervises retrieval, refinement, or combination of results across both paths. This design maintains header hierarchies, time labels, and units, ensuring that returned values are faithful to the original cells and straightforward to verify. Evaluated on multi-header corporate balance sheets, a heavily merged World Bank workbook, and diverse public datasets, SQuARE consistently surpasses single-strategy baselines and ChatGPT-4o on both retrieval precision and end-to-end answer accuracy while keeping latency predictable. By decoupling retrieval from model choice, the system is compatible with emerging tabular foundation models and offers a practical bridge toward more robust table understanding.
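
The complexity-aware routing described in the abstract can be pictured with a small sketch. This is not the authors' implementation: the abstract only says that a continuous score is computed from header depth and merge density. The `SheetStats` container, the weights, the normalisation, the threshold, and the choice to send high-scoring sheets to chunk retrieval are all assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass
class SheetStats:
    """Structural features of one sheet (hypothetical container, not from the paper)."""
    header_rows: int   # number of stacked header rows (header depth)
    merged_cells: int  # cells that belong to a merged region
    total_cells: int   # all non-empty cells in the sheet


def complexity_score(stats: SheetStats, max_header_depth: int = 4,
                     w_header: float = 0.5, w_merge: float = 0.5) -> float:
    """Continuous score in [0, 1]; weights and normalisation are assumed values."""
    # A single header row counts as depth 0; deeper headers are capped and scaled.
    extra_rows = min(max(stats.header_rows - 1, 0), max_header_depth)
    depth = extra_rows / max_header_depth
    # Fraction of cells that sit inside merged regions.
    if stats.total_cells:
        merge_density = min(stats.merged_cells / stats.total_cells, 1.0)
    else:
        merge_density = 0.0
    return w_header * depth + w_merge * merge_density


def route(stats: SheetStats, threshold: float = 0.3) -> str:
    """Pick a retrieval path for the sheet; the threshold is an assumed value."""
    # Assumption: structurally complex sheets go to structure-preserving chunk
    # retrieval, simple flat sheets go to SQL over a relational view.
    return "chunk" if complexity_score(stats) >= threshold else "sql"


flat_sheet = SheetStats(header_rows=1, merged_cells=0, total_cells=200)
merged_sheet = SheetStats(header_rows=3, merged_cells=60, total_cells=200)
print(route(flat_sheet), route(merged_sheet))
```

Under these assumed weights, a flat single-header table falls below the threshold and takes the SQL path, while a sheet with three header rows and 30% merged cells takes the chunk path. The low-confidence agent step that refines or combines results from both paths is not modelled here.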

