CL, May 29, 2025

Table-R1: Inference-Time Scaling for Table Reasoning

arXiv:2505.23621v2, 18 citations, h-index: 28, EMNLP
Originality: Incremental advance
AI Analysis

This paper addresses the problem of efficient table reasoning for AI applications, though its contribution is incremental because it builds on existing post-training strategies.

The study tackles table reasoning by developing two post-training strategies that enable inference-time scaling (distillation and RLVR), resulting in a 7B-parameter model, Table-R1-Zero, that matches or exceeds the performance of GPT-4.1 and DeepSeek-R1 on diverse tasks.

In this work, we present the first study to explore inference-time scaling on table reasoning tasks. We develop and evaluate two post-training strategies to enable inference-time scaling: distillation from frontier model reasoning traces and reinforcement learning with verifiable rewards (RLVR). For distillation, we introduce a large-scale dataset of reasoning traces generated by DeepSeek-R1, which we use to fine-tune LLMs into the Table-R1-SFT model. For RLVR, we propose task-specific verifiable reward functions and apply the GRPO algorithm to obtain the Table-R1-Zero model. We evaluate our Table-R1-series models across diverse table reasoning tasks, including short-form QA, fact verification, and free-form QA. Notably, the Table-R1-Zero model matches or exceeds the performance of GPT-4.1 and DeepSeek-R1, while using only a 7B-parameter LLM. It also demonstrates strong generalization to out-of-domain datasets. Extensive ablation and qualitative analyses reveal the benefits of instruction tuning, model architecture choices, and cross-task generalization, as well as emergence of essential table reasoning skills during RL training.
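The abstract mentions task-specific verifiable rewards and GRPO but does not spell them out here, so the following is only a minimal sketch of the general idea, not the authors' implementation. It assumes a hypothetical output format in which the model wraps its final answer in `<answer>...</answer>` tags, uses a simple normalized exact-match check as the reward for short-form QA, and computes the group-relative advantage commonly associated with GRPO (each reward standardized against the other completions sampled for the same prompt). The function names, the tag format, and the normalization rules are all illustrative assumptions.

```python
import re
import statistics


def normalize(text: str) -> str:
    """Lowercase, drop punctuation other than '.', and collapse whitespace."""
    text = text.lower().strip()
    text = re.sub(r"[^\w\s.]", "", text)
    return " ".join(text.split())


def exact_match_reward(completion: str, gold: str) -> float:
    """Hypothetical verifiable reward for short-form QA.

    Returns 1.0 if the text inside <answer>...</answer> (an assumed output
    format) equals the gold answer after normalization, otherwise 0.0.
    A completion with no answer tags also gets 0.0.
    """
    match = re.search(r"<answer>(.*?)</answer>", completion, flags=re.DOTALL)
    if match is None:
        return 0.0
    return 1.0 if normalize(match.group(1)) == normalize(gold) else 0.0


def group_relative_advantages(rewards, eps=1e-6):
    """Standardize rewards within one group of completions for the same prompt.

    Each advantage is (reward - group mean) / (group std + eps). If every
    completion receives the same reward, all advantages are zero.
    """
    mean = statistics.fmean(rewards)
    std = statistics.pstdev(rewards)
    return [(r - mean) / (std + eps) for r in rewards]
```

In use, one would sample several completions per table question, score each with `exact_match_reward`, and pass the resulting list to `group_relative_advantages`; completions that beat the group average receive positive advantages and the rest receive negative ones. Fact verification and free-form QA would need different reward checks, which this sketch does not attempt.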

Code Implementations: 1 repo
