CL · May 29

Semantic Triplet Restoration: A Novel Protocol for Hierarchical Table Understanding in Large Language Models

arXiv:2605.31550 · 85.9 · Has Code
Predicted impact top 35% in CL · last 90 days · Originality Incremental advance
AI Analysis

This work addresses the problem of efficiently representing hierarchical table structures for large language models, particularly benefiting models with constrained inference budgets.

This paper introduces Semantic Triplet Restoration (STR), a protocol that converts table cells into atomic facts to improve hierarchical table understanding for large language models. STR matches or improves upon HTML-based baselines across four Chinese and English table-QA benchmarks while reducing input tokens, with greater benefits for smaller language models and longer table contexts.

Table question answering requires models to recover semantic relations encoded implicitly by two-dimensional layout, merged cells, and hierarchical headers. Current pipelines typically use HTML or Markdown as intermediate table representations, but these layout-oriented serializations introduce markup overhead and require large language models to infer header-cell alignments from row and column spans. We propose Semantic Triplet Restoration (STR), a protocol that rewrites each cell as an atomic fact <item path, feature path, value>, where the item path specifies the row-wise entity, the feature path specifies the hierarchical attribute, and the value contains the cell content. We also present TripletQL, a lightweight query-aware router that uses STR to select an appropriate rendering or filtered subset of triplets for each question. Across four Chinese and English table-QA benchmarks, STR matches or improves upon HTML-based baselines while reducing input tokens. The relative benefit grows for smaller language models and longer table contexts, suggesting that explicit semantic representations are especially useful under constrained inference budgets. Code and data are available at https://github.com/Phoenix-ni/STR.git .
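
The triplet rewriting described in the abstract can be illustrated with a small sketch. This is not the authors' implementation: the names `Triplet`, `to_triplets`, and `render`, the ` / ` path separator, the toy table, and the assumption that merged header cells have already been expanded (each label repeated across the columns it spans) are all hypothetical choices made for illustration.

```python
from typing import List, NamedTuple, Tuple


class Triplet(NamedTuple):
    """One atomic fact: <item path, feature path, value> (illustrative type)."""
    item_path: Tuple[str, ...]     # row-wise entity, e.g. ("North",)
    feature_path: Tuple[str, ...]  # hierarchical attribute, e.g. ("Revenue", "2023")
    value: str                     # cell content


def to_triplets(header_rows: List[List[str]],
                body_rows: List[List[str]],
                n_item_cols: int = 1) -> List[Triplet]:
    """Rewrite every non-empty body cell as a Triplet.

    Assumes merged header cells were expanded beforehand, so each header
    row has one label per column.
    """
    n_cols = len(header_rows[0])

    # Stack the header levels top-down for each value column, skipping
    # empty labels and labels repeated from the level above.
    feature_paths = []
    for col in range(n_item_cols, n_cols):
        path: List[str] = []
        for row in header_rows:
            label = row[col]
            if label and (not path or path[-1] != label):
                path.append(label)
        feature_paths.append(tuple(path))

    triplets = []
    for row in body_rows:
        item_path = tuple(cell for cell in row[:n_item_cols] if cell)
        for path, value in zip(feature_paths, row[n_item_cols:]):
            if value != "":
                triplets.append(Triplet(item_path, path, value))
    return triplets


def render(t: Triplet) -> str:
    """Serialize one triplet as a single line of text."""
    return f"<{' / '.join(t.item_path)}, {' / '.join(t.feature_path)}, {t.value}>"


# Toy two-level header: "Revenue" spans the 2023 and 2024 columns.
header_rows = [
    ["Region", "Revenue", "Revenue", "Headcount"],
    ["Region", "2023", "2024", "Headcount"],
]
body_rows = [
    ["North", "10", "12", "40"],
    ["South", "7", "9", "25"],
]

triplets = to_triplets(header_rows, body_rows)
print(render(triplets[0]))  # <North, Revenue / 2023, 10>
```

Each rendered line carries its own row entity and full header path, so a model reading it does not need to resolve row or column spans, which is the property the abstract attributes to STR.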
