LG · May 19, 2025

Make Still Further Progress: Chain of Thoughts for Tabular Data Leaderboard

arXiv:2505.13421v1 · 6 citations · h-index: 7
Originality: Incremental advance
AI Analysis

This paper addresses the difficulty practitioners face in achieving top performance on tabular data tasks. The contribution is incremental: it builds on existing ensemble and LLM methods.

The paper tackles the variability in tabular model performance by proposing an in-context ensemble framework that uses large language models for dynamic, instance-specific integration of external predictions, showing it outperforms baselines and standard ensembles across datasets.

Tabular data, a fundamental data format in machine learning, is widely used in competitions and real-world applications. The performance of tabular models, such as gradient boosted decision trees and neural networks, can vary significantly across datasets due to differences in feature distributions and task characteristics. Achieving top performance on each dataset often requires specialized expert knowledge. To address this variability, practitioners often aggregate the predictions of multiple models. However, conventional aggregation strategies typically rely on static combination rules and lack instance-level adaptability. In this work, we propose an in-context ensemble framework for tabular prediction that leverages large language models (LLMs) to perform dynamic, instance-specific integration of external model predictions. Without access to raw tabular features or semantic information, our method constructs a context around each test instance using its nearest neighbors and the predictions from a pool of external models. Within this enriched context, we introduce Chain of Tabular Thoughts (CoT$^2$), a prompting strategy that guides LLMs through multi-step, interpretable reasoning, making still further progress toward expert-level decision-making. Experimental results show that our method outperforms well-tuned baselines and standard ensemble techniques across a wide range of tabular datasets.
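The context-construction step described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation. It assumes that neighbors are found by Euclidean distance in the space of external model predictions (the abstract does not say how neighbors are computed), and the function names, model names, prompt wording, and toy data are all hypothetical.

```python
import math

def nearest_neighbors(train_preds, test_pred, k):
    """Return indices of the k training rows closest to test_pred.

    Assumption: distance is Euclidean in prediction space; the abstract
    does not specify the metric or the space in which neighbors are found.
    """
    order = sorted(range(len(train_preds)),
                   key=lambda i: math.dist(train_preds[i], test_pred))
    return order[:k]

def format_preds(model_names, preds):
    """Render one row of external model predictions as text."""
    return ", ".join(f"{m}={p:.2f}" for m, p in zip(model_names, preds))

def build_context(train_preds, train_labels, test_pred, model_names, k=3):
    """Build a text prompt from neighbor predictions and labels only.

    No raw features or column semantics are included, mirroring the
    setting described in the abstract. The wording is hypothetical.
    """
    lines = ["Each row lists external model predictions; labeled rows show the true label."]
    for rank, i in enumerate(nearest_neighbors(train_preds, test_pred, k), start=1):
        lines.append(f"Neighbor {rank}: {format_preds(model_names, train_preds[i])} "
                     f"-> label={train_labels[i]}")
    lines.append(f"Test instance: {format_preds(model_names, test_pred)} -> label=?")
    lines.append("Reason step by step about which models are reliable here, then answer.")
    return "\n".join(lines)

# Toy data: predicted probability of class 1 from two hypothetical models.
model_names = ["gbdt", "mlp"]
train_preds = [[0.9, 0.2], [0.8, 0.3], [0.1, 0.7], [0.2, 0.9]]
train_labels = [1, 1, 0, 0]
test_pred = [0.88, 0.22]

prompt = build_context(train_preds, train_labels, test_pred, model_names, k=2)
print(prompt)
```

The resulting string would then be sent to an LLM. In this toy case the two models disagree on the test instance, and the labeled neighbors show which model tends to be right locally, which is the kind of instance-level signal a static average cannot use.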

