TabGNN: Multiplex Graph Neural Network for Tabular Data Prediction
This work addresses a gap in industrial tabular data prediction by incorporating sample relations, though the contribution appears incremental because it builds on existing GNN and tabular methods.
The authors tackle tabular data prediction by modeling sample relations, which existing methods ignore. They propose TabGNN, a multiplex graph neural network framework that consistently improved performance on eleven datasets compared with the AutoFE solution.
Tabular data prediction (TDP) is one of the most popular industrial applications, and various methods have been designed to improve prediction performance. However, existing works mainly focus on feature interactions and ignore sample relations, e.g., users with the same education level might have a similar ability to repay debt. In this work, by explicitly and systematically modeling sample relations, we propose a novel framework, TabGNN, based on the recently popular graph neural networks (GNNs). Specifically, we first construct a multiplex graph to model the multifaceted sample relations, and then design a multiplex graph neural network to learn an enhanced representation for each sample. To integrate TabGNN with the tabular solution in our company, we concatenate the learned embeddings with the original ones, and the result is fed to the prediction models inside the solution. Experiments on eleven TDP datasets from various domains, covering both classification and regression tasks, show that TabGNN consistently improves performance compared with AutoFE, the tabular solution at 4Paradigm.
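The pipeline described in the abstract (build one graph layer per sample relation, derive a relational representation for each sample, then concatenate it with the original features) can be illustrated with a minimal NumPy sketch. This is not the authors' implementation. It assumes each relation comes from a categorical column, so two samples are linked when they share a value, and it replaces the learned multiplex GNN with untrained mean aggregation per layer followed by a uniform average across layers. The function names `build_multiplex_adjacency`, `aggregate_neighbors`, and `enhance` are hypothetical.

```python
import numpy as np


def build_multiplex_adjacency(categorical_columns):
    """Return one adjacency matrix per relation (one layer of the multiplex graph).

    Assumption: two samples are linked in a layer when they share the same
    value in the corresponding categorical column. Self-loops are removed.
    """
    layers = []
    for column in categorical_columns:
        column = np.asarray(column)
        adjacency = (column[:, None] == column[None, :]).astype(float)
        np.fill_diagonal(adjacency, 0.0)
        layers.append(adjacency)
    return layers


def aggregate_neighbors(features, layers):
    """Mean-aggregate neighbor features within each layer, then average the layers.

    Assumption: this untrained aggregation stands in for the learned
    multiplex GNN; a real model would learn weights for both steps.
    """
    per_layer = []
    for adjacency in layers:
        degree = adjacency.sum(axis=1, keepdims=True)
        # Isolated samples have an all-zero neighbor sum, so clamping the
        # degree to 1 leaves them with a zero vector for this layer.
        per_layer.append((adjacency @ features) / np.maximum(degree, 1.0))
    return np.mean(per_layer, axis=0)


def enhance(features, categorical_columns):
    """Concatenate original features with the relational representation."""
    features = np.asarray(features, dtype=float)
    layers = build_multiplex_adjacency(categorical_columns)
    relational = aggregate_neighbors(features, layers)
    return np.concatenate([features, relational], axis=1)


# Toy data: three samples, one numeric feature, two relations.
toy_features = [[1.0], [3.0], [5.0]]
education = ["bs", "bs", "ms"]   # links samples 0 and 1
city = ["a", "b", "b"]           # links samples 1 and 2
enhanced = enhance(toy_features, [education, city])
```

The `enhanced` array has twice as many columns as the input and could be passed to any downstream tabular model in place of the raw features, which mirrors the concatenation step described above.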