Graph-based Tabular Deep Learning Should Learn Feature Interactions, Not Just Make Predictions
This paper addresses the challenge of modeling complex feature interactions in tabular data, with the aim of improving interpretability and trustworthiness in machine learning applications. The contribution is incremental, since it builds on existing graph-based methods.
The paper argues that graph-based tabular deep learning methods should focus on learning feature interactions rather than just prediction, showing that enforcing true interaction structures improves predictive performance on synthetic datasets.
Despite recent progress, deep learning methods for tabular data still struggle to compete with traditional tree-based models. A key challenge lies in modeling complex, dataset-specific feature interactions that are central to tabular data. Graph-based tabular deep learning (GTDL) methods aim to address this by representing features and their interactions as graphs. However, existing methods predominantly optimize predictive accuracy, neglecting accurate modeling of the graph structure. This position paper argues that GTDL should move beyond prediction-centric objectives and prioritize the explicit learning and evaluation of feature interactions. Using synthetic datasets with known ground-truth graph structures, we show that existing GTDL methods fail to recover meaningful feature interactions. Moreover, enforcing the true interaction structure improves predictive performance. This highlights the need for GTDL methods to prioritize quantitative evaluation and accurate structural learning. We call for a shift toward structure-aware modeling as a foundation for building GTDL systems that are not only accurate but also interpretable, trustworthy, and grounded in domain understanding.
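The evaluation idea described above (synthetic data with a known interaction graph, then a quantitative check of how well a learned graph matches it) can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration and is not taken from the paper: the function names `make_synthetic`, `product_correlation_scores`, and `edge_recovery` are hypothetical, the target is a plain sum of pairwise products, and the correlation-based scorer is only a stand-in for the adjacency or attention matrix a GTDL model would produce.

```python
import random
from itertools import combinations


def make_synthetic(n_samples, n_features, true_edges, seed=0):
    """Draw standard-normal features; the target is the sum of x_i * x_j
    over the ground-truth interacting pairs (an illustrative choice)."""
    rng = random.Random(seed)
    X = [[rng.gauss(0.0, 1.0) for _ in range(n_features)]
         for _ in range(n_samples)]
    y = [sum(row[i] * row[j] for i, j in true_edges) for row in X]
    return X, y


def product_correlation_scores(X, y):
    """Stand-in for a learned graph: score each feature pair (i, j) by the
    absolute correlation between x_i * x_j and the target."""
    n = len(y)
    mean_y = sum(y) / n
    var_y = sum((v - mean_y) ** 2 for v in y)
    scores = {}
    for i, j in combinations(range(len(X[0])), 2):
        prod = [row[i] * row[j] for row in X]
        mean_p = sum(prod) / n
        cov = sum((p - mean_p) * (v - mean_y) for p, v in zip(prod, y))
        var_p = sum((p - mean_p) ** 2 for p in prod)
        scores[(i, j)] = abs(cov) / (var_p * var_y) ** 0.5
    return scores


def edge_recovery(scores, true_edges, k=None):
    """Precision and recall of the k highest-scored pairs against the
    ground-truth edges. k defaults to the number of true edges."""
    truth = {tuple(sorted(e)) for e in true_edges}
    k = len(truth) if k is None else k
    top_k = sorted(scores, key=scores.get, reverse=True)[:k]
    hits = sum(1 for e in top_k if e in truth)
    return hits / k, hits / len(truth)


# Hypothetical ground-truth graph over six features.
TRUE_EDGES = [(0, 1), (2, 3)]
X, y = make_synthetic(2000, 6, TRUE_EDGES)
precision, recall = edge_recovery(product_correlation_scores(X, y), TRUE_EDGES)
print(f"edge precision={precision:.2f}, recall={recall:.2f}")
```

To evaluate an actual GTDL model in this way, the `scores` dictionary would be filled from the model's learned feature graph instead of the correlation heuristic. The same `edge_recovery` call then gives a structural metric that can be reported alongside predictive accuracy.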