LG · May 29, 2023

Trompt: Towards a Better Deep Neural Network for Tabular Data

arXiv:2305.18446v2 · 44 citations
Originality: Incremental advance
AI Analysis

The work addresses a key bottleneck in applying deep learning to tabular data in domains such as finance and healthcare, though it is an incremental improvement over existing methods.

The paper addresses the underperformance of deep neural networks relative to tree-based models on tabular data. It proposes Trompt, an architecture inspired by prompt learning, which the authors report outperforms state-of-the-art deep neural networks and performs comparably to tree-based models on a recently published tabular benchmark.

Tabular data is arguably one of the most commonly used data structures in various practical domains, including finance, healthcare and e-commerce. Its inherent heterogeneity allows tabular data to store rich information. However, according to a recently published tabular benchmark, deep neural networks still fall behind tree-based models on tabular datasets. In this paper, we propose Trompt (short for Tabular Prompt), a novel architecture inspired by prompt learning of language models. The essence of prompt learning is to adjust a large pre-trained model through a set of prompts outside the model, without directly modifying the model. Based on this idea, Trompt separates the learning strategy for tabular data into two parts. The first part, analogous to pre-trained models, focuses on learning the intrinsic information of a table. The second part, analogous to prompts, focuses on learning the variations among samples. Trompt is evaluated on the benchmark mentioned above. The experimental results demonstrate that Trompt outperforms state-of-the-art deep neural networks and is comparable to tree-based models.
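
The two-part split described in the abstract can be pictured with a small NumPy sketch. This is a toy illustration under stated assumptions, not the architecture from the paper: the class name `ToyPromptedTable`, the per-column projection, the way prompts are shifted by each sample, and the softmax weighting are all hypothetical choices made only to show one set of parameters shared across the whole table (column embeddings) next to a sample-dependent part (prompts that reweight columns per row).

```python
import numpy as np


def softmax(x, axis=-1):
    # Numerically stable softmax along the given axis.
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)


class ToyPromptedTable:
    """Illustrative only; NOT the Trompt architecture from the paper."""

    def __init__(self, n_columns, n_prompts, dim, seed=0):
        rng = np.random.default_rng(seed)
        # Part 1 (assumption): one embedding per column, shared by every
        # sample, standing in for the "intrinsic information of a table".
        self.column_emb = rng.normal(size=(n_columns, dim))
        # Part 2 (assumption): prompt embeddings that query the columns.
        self.prompt_emb = rng.normal(size=(n_prompts, dim))
        # Hypothetical per-column vector that lifts each scalar feature
        # value into a dim-sized feature embedding.
        self.feature_proj = rng.normal(size=(n_columns, dim))

    def forward(self, x):
        # x: (batch, n_columns) array of numeric features.
        feat = x[:, :, None] * self.feature_proj[None]        # (batch, n_columns, dim)
        # Make the prompts sample-dependent by shifting them with the
        # sample's mean feature embedding (a hypothetical choice).
        prompts = self.prompt_emb[None] + feat.mean(axis=1, keepdims=True)
        # Score every column against every prompt, then normalise so each
        # prompt holds a distribution over columns for this sample.
        scores = prompts @ self.column_emb.T                  # (batch, n_prompts, n_columns)
        importance = softmax(scores, axis=-1)
        # Importance-weighted sum of the sample's feature embeddings.
        out = importance @ feat                               # (batch, n_prompts, dim)
        return out, importance


x = np.arange(20, dtype=float).reshape(4, 5) / 10.0
model = ToyPromptedTable(n_columns=5, n_prompts=3, dim=8)
out, importance = model.forward(x)
print(out.shape, importance.shape)
```

In this sketch the column embeddings never see an individual row, while the importance weights are recomputed for each row, which is one plausible reading of "intrinsic information" versus "variations among samples". Nothing here is trained; a real model would learn all three parameter arrays by gradient descent.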

