TabKANet: Tabular Data Modeling with Kolmogorov-Arnold Network and Transformer
This work addresses tabular data modeling for machine learning practitioners, offering an incremental improvement by combining two existing techniques, Kolmogorov-Arnold Networks and Transformers, in a new way.
The paper tackles the problem of learning from tabular data by proposing TabKANet, a model that combines Kolmogorov-Arnold Networks and Transformers to unify the encoding of numerical and categorical features. Across multiple public datasets, the authors report stable and significantly better performance than other neural network models, and results competitive with Gradient Boosted Decision Trees.
Tabular data is the most common type of data in real-life scenarios. In this study, we propose the TabKANet model for tabular data modeling, which targets the bottlenecks in learning from numerical content. We construct a Kolmogorov-Arnold Network (KAN)-based Numerical Embedding Module and unify the encoding of numerical and categorical features within a Transformer architecture. TabKANet demonstrates stable and significantly superior performance compared to Neural Networks (NNs) across multiple public datasets in binary classification, multi-class classification, and regression tasks. Its performance is comparable to or surpasses that of Gradient Boosted Decision Tree models (GBDTs). Our code is publicly available on GitHub: https://github.com/AI-thpremed/TabKANet.
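The abstract does not spell out the internals of the Numerical Embedding Module, so the following is only a minimal, dependency-free sketch of the idea it describes: numerical features pass through a KAN-style layer (sums of learnable univariate functions), categorical features go through lookup tables, and both end up as tokens of the same width that a Transformer could consume. Several things here are assumptions rather than details from the paper: Gaussian radial basis functions stand in for the B-splines usually used in KANs, the weights are random instead of trained, inputs are assumed to be standardized already, the Transformer itself is omitted, and all names (`KANNumericalEmbedding`, `CategoricalEmbedding`, `encode_row`) are hypothetical and not taken from the TabKANet repository.

```python
import math
import random


class KANNumericalEmbedding:
    """Toy KAN-style layer: n_features scalars -> n_features tokens of size d_embed."""

    def __init__(self, n_features, d_embed, n_basis=5, lo=-2.0, hi=2.0, seed=0):
        rng = random.Random(seed)
        self.n_features = n_features
        self.d_embed = d_embed
        # Fixed, evenly spaced RBF centers on [lo, hi] (assumes standardized inputs).
        step = (hi - lo) / (n_basis - 1)
        self.centers = [lo + k * step for k in range(n_basis)]
        self.width = step
        n_out = n_features * d_embed
        # coef[o][i][k]: weight of basis k in the univariate function phi_{o,i}.
        # In a real model these would be trained; here they are random.
        self.coef = [
            [[rng.gauss(0.0, 0.1) for _ in range(n_basis)] for _ in range(n_features)]
            for _ in range(n_out)
        ]

    def _basis(self, x):
        # Gaussian RBF expansion of a single scalar (assumed stand-in for B-splines).
        return [math.exp(-((x - c) / self.width) ** 2) for c in self.centers]

    def __call__(self, row):
        basis = [self._basis(x) for x in row]  # one basis expansion per feature
        # KAN form: output_o = sum_i phi_{o,i}(x_i)
        flat = [
            sum(
                sum(c * b for c, b in zip(self.coef[o][i], basis[i]))
                for i in range(self.n_features)
            )
            for o in range(self.n_features * self.d_embed)
        ]
        # Reshape the flat output into one d_embed-sized token per numerical feature.
        d = self.d_embed
        return [flat[t * d:(t + 1) * d] for t in range(self.n_features)]


class CategoricalEmbedding:
    """Lookup tables: one d_embed-sized vector per (column, category index)."""

    def __init__(self, cardinalities, d_embed, seed=1):
        rng = random.Random(seed)
        self.tables = [
            [[rng.gauss(0.0, 0.1) for _ in range(d_embed)] for _ in range(card)]
            for card in cardinalities
        ]

    def __call__(self, indices):
        return [table[idx] for table, idx in zip(self.tables, indices)]


def encode_row(num_embed, cat_embed, numeric, categorical):
    """Concatenate numerical and categorical tokens into one token sequence."""
    return num_embed(numeric) + cat_embed(categorical)


num_embed = KANNumericalEmbedding(n_features=3, d_embed=4)
cat_embed = CategoricalEmbedding(cardinalities=[5, 2], d_embed=4)
tokens = encode_row(num_embed, cat_embed, [0.3, -1.2, 0.8], [4, 1])
print(len(tokens), len(tokens[0]))  # prints "5 4": 3 numerical + 2 categorical tokens
```

Because both feature types come out as tokens of the same width, the resulting sequence could be passed to a standard Transformer encoder without further reshaping, which is the sense in which the encoding is "unified" in this sketch.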