LG, Nov 20, 2023

Incorporating LLM Priors into Tabular Learners

arXiv:2311.11628v1, 3 citations, h-index: 3
Originality: Incremental advance
AI Analysis

This work aims to improve LLM-assisted tabular data classification for data science practitioners. It is an incremental method that combines existing techniques.

The paper tackles the challenge of integrating Large Language Models (LLMs) with traditional tabular data classification. It introduces two strategies: using LLMs to rank categorical variables, and using them to generate priors on correlations between continuous variables and targets. Both are aimed at few-shot scenarios, where validation shows the approach outperforming baseline models.

We present a method to integrate Large Language Models (LLMs) and traditional tabular data classification techniques, addressing LLM challenges such as data serialization sensitivity and biases. We introduce two strategies that use LLMs to rank categorical variables and to generate priors on correlations between continuous variables and targets, enhancing performance in few-shot scenarios. We focus on Logistic Regression, introducing MonotonicLR, which employs a non-linear monotonic function for mapping ordinals to cardinals while preserving LLM-determined orders. Validation against baseline models reveals the superior performance of our approach, especially in low-data scenarios, while the model remains interpretable.
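
The idea of mapping ordinals to cardinals while preserving an LLM-determined order can be illustrated with a minimal sketch. This is not the paper's MonotonicLR implementation: the softplus-and-cumulative-sum parameterization, the function names, the example categories, and the parameter values are all assumptions chosen for illustration. The sketch only shows how unconstrained parameters can yield category values that are non-linearly spaced yet always respect a given ranking.

```python
import numpy as np


def softplus(x):
    # Numerically stable log(1 + exp(x)); the result is always strictly positive.
    return np.logaddexp(0.0, x)


def monotonic_values(raw_steps):
    """Map unconstrained parameters to strictly increasing cardinal values.

    Each parameter controls the gap between two consecutive ranks. Passing it
    through softplus keeps every gap positive, so the cumulative sum is
    strictly increasing. The lowest rank is pinned to 0.
    """
    gaps = softplus(np.asarray(raw_steps, dtype=float))
    return np.concatenate(([0.0], np.cumsum(gaps)))


# Hypothetical LLM output: categories ordered from lowest to highest
# expected association with the positive class (illustrative only).
llm_ranking = ["primary", "secondary", "bachelor", "master"]
rank_of = {category: rank for rank, category in enumerate(llm_ranking)}

# One free parameter per gap between consecutive ranks. In a trained model
# these would be fitted together with the regression weights; here they are
# fixed, arbitrary numbers.
raw_steps = np.array([-2.0, 0.5, 3.0])
values = monotonic_values(raw_steps)

# Encode a categorical column with the order-preserving cardinal values.
column = ["master", "primary", "bachelor", "primary"]
encoded = np.array([values[rank_of[c]] for c in column])
```

Because the gaps are positive by construction, any optimizer can update `raw_steps` freely without ever violating the ranking, and the encoded feature can feed a standard logistic regression.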

