LG | AI | Feb 12

TabSieve: Explicit In-Table Evidence Selection for Tabular Prediction

arXiv:2602.11700v1 | 1 citation | h-index: 3
Originality: Highly original
AI Analysis

This paper addresses robust, interpretable few-shot learning on tabular data, a problem relevant to machine learning practitioners. It is an incremental improvement: a novel method applied to a known bottleneck.

The paper tackles inconsistent evidence usage and noise sensitivity in tabular prediction. It proposes TabSieve, a select-then-predict framework that explicitly selects informative rows as evidence before predicting. Average gains over the second-best baseline are 2.92% on classification and 4.45% on regression.

Tabular prediction can benefit from in-table rows as few-shot evidence, yet existing tabular models typically perform instance-wise inference and LLM-based prompting is often brittle. Models do not consistently leverage relevant rows, and noisy context can degrade performance. To address this challenge, we propose TabSieve, a select-then-predict framework that makes evidence usage explicit and auditable. Given a table and a query row, TabSieve first selects a small set of informative rows as evidence and then predicts the missing target conditioned on the selected evidence. To enable this capability, we construct TabSieve-SFT-40K by synthesizing high-quality reasoning trajectories from 331 real tables using a strong teacher model with strict filtering. Furthermore, we introduce TAB-GRPO, a reinforcement learning recipe that jointly optimizes evidence selection and prediction correctness with separate rewards, and stabilizes mixed regression and classification training via dynamic task-advantage balancing. Experiments on a held-out benchmark of 75 classification and 52 regression tables show that TabSieve consistently improves performance across shot budgets, with average gains of 2.92% on classification and 4.45% on regression over the second-best baseline. Further analysis indicates that TabSieve concentrates more attention on the selected evidence, which improves robustness to noisy context.
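The select-then-predict flow described in the abstract can be illustrated with a minimal sketch. This is not the authors' method: TabSieve uses a trained language model for both steps, while the sketch below swaps in a plain nearest-neighbour selector and a vote/mean predictor purely to show the two-stage shape and why the selected rows are auditable. All function names (`select_evidence`, `predict`, `select_then_predict`) and the `(features, target)` row layout are assumptions made for illustration.

```python
from collections import Counter
import math


def select_evidence(rows, query, k=3):
    """Return indices of the k rows closest to the query.

    Hypothetical stand-in for a learned selector: rows are
    (features, target) pairs and closeness is Euclidean distance.
    """
    def dist(features):
        return math.sqrt(sum((a - b) ** 2 for a, b in zip(features, query)))

    ranked = sorted(range(len(rows)), key=lambda i: dist(rows[i][0]))
    return ranked[:k]


def predict(rows, evidence_idx, task="classification"):
    """Predict the target using only the selected evidence rows."""
    targets = [rows[i][1] for i in evidence_idx]
    if task == "classification":
        # Majority vote over the evidence labels.
        return Counter(targets).most_common(1)[0][0]
    # Regression: mean of the evidence targets.
    return sum(targets) / len(targets)


def select_then_predict(rows, query, k=3, task="classification"):
    """Run both stages and return (evidence indices, prediction).

    Returning the indices alongside the prediction is what makes the
    evidence explicit: a reader can inspect which rows were used.
    """
    idx = select_evidence(rows, query, k)
    return idx, predict(rows, idx, task)
```

Called on a small table, `select_then_predict(rows, query, k=3)` returns both the chosen row indices and the prediction, so the evidence behind each output can be checked directly.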

