AI | CL | LG | Sep 27, 2024

"Oh LLM, I'm Asking Thee, Please Give Me a Decision Tree": Zero-Shot Decision Tree Induction and Embedding with Large Language Models

arXiv:2409.18594v2 | 5 citations | h-index: 4 | Has Code
Originality: Highly original
AI Analysis

The approach provides a knowledge-driven baseline for low-data tabular machine learning and lets interpretable models draw on the world knowledge of LLMs.

The authors generate interpretable decision trees without any training data by using large language models (LLMs) to draw on prior knowledge. The resulting zero-shot trees sometimes outperform data-driven trees on small tabular datasets, and embeddings derived from them perform better on average than data-driven tree-based embeddings.

Large language models (LLMs) provide powerful means to leverage prior knowledge for predictive modeling when data is limited. In this work, we demonstrate how LLMs can use their compressed world knowledge to generate intrinsically interpretable machine learning models, i.e., decision trees, without any training data. We find that these zero-shot decision trees can even surpass data-driven trees on some small-sized tabular datasets and that embeddings derived from these trees perform better than data-driven tree-based embeddings on average. Our decision tree induction and embedding approaches can therefore serve as new knowledge-driven baselines for data-driven machine learning methods in the low-data regime. Furthermore, they offer ways to harness the rich world knowledge within LLMs for tabular machine learning tasks. Our code and results are available at https://github.com/ml-lab-htw/llm-trees.
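To make the idea concrete, here is a minimal sketch of what a zero-shot tree and a tree-derived embedding could look like. It is an illustration under stated assumptions, not the authors' implementation: the function names `predict_with_nodes` and `embed`, the feature names, and the thresholds are all hypothetical, and the tree is hand-written here to stand in for one an LLM might produce from prior knowledge alone. The embedding is assumed to be the vector of truth values of the tree's inner nodes.

```python
from typing import Dict, List, Tuple


def predict_with_nodes(x: Dict[str, float]) -> Tuple[int, List[int]]:
    """Hypothetical depth-2 tree written without any training data.

    Features and thresholds are illustrative placeholders, not values
    taken from the paper.
    """
    # Truth value of every inner node, evaluated regardless of the path taken.
    nodes = [
        int(x["age"] > 60),           # root split
        int(x["systolic_bp"] > 140),  # split used when the root is true
        int(x["bmi"] > 30),           # split used when the root is false
    ]
    if nodes[0]:
        prediction = 1 if nodes[1] else 0
    else:
        prediction = 1 if nodes[2] else 0
    return prediction, nodes


def embed(rows: List[Dict[str, float]]) -> List[List[int]]:
    """Assumed embedding: one binary vector of inner-node truth values per row."""
    return [predict_with_nodes(row)[1] for row in rows]


if __name__ == "__main__":
    sample = {"age": 70, "systolic_bp": 150, "bmi": 25}
    print(predict_with_nodes(sample))
    print(embed([sample]))
```

Under this assumption, the binary vectors returned by `embed` could be passed to any downstream classifier as features, which is one way a knowledge-driven tree might serve as an embedding in the low-data regime.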
