LG | Dec 21, 2024

Iterative Feature Exclusion Ranking for Deep Tabular Learning

Fathi Said Emhemed Shaninah, AbdulRahman M. A. Baraka, Mohd Halim Mohd Noor

arXiv:2412.16442v1 | 1 citation | h-index: 4 | Has Code | Knowl Inf Syst

Originality: Incremental advance

AI Analysis

This paper addresses unidimensional feature selection in deep tabular learning, a concern for machine learning practitioners who work with tabular data. The contribution appears incremental, since it builds on existing deep learning models.

The paper tackles feature importance ranking in deep tabular learning by proposing an iterative feature exclusion module, which is reported to outperform state-of-the-art methods on four public datasets.

Tabular data is a common format for storing information in rows and columns to represent data entries and their features. Although deep neural networks have become the main approach for modeling a wide range of domains including computer vision and NLP, many of them are not well-suited for tabular data. Recently, a few deep learning models have been proposed for deep tabular learning, featuring an internal feature selection mechanism with end-to-end gradient-based optimization. However, their feature selection mechanisms are unidimensional, and hence fail to account for the contextual dependence of feature importance, potentially overlooking crucial interactions that govern complex tasks. In addition, they overlook the bias of high-impact features and the risk associated with the limitations of attention generalization. To address these limitations, this study proposes a novel iterative feature exclusion module that enhances feature importance ranking in tabular data. The proposed module iteratively excludes each feature from the input data and computes the attention scores, which represent the impact of the features on the prediction. By aggregating the attention scores from each iteration, the proposed module generates a refined representation of feature importance that captures both global and local interactions between features. The effectiveness of the proposed module is evaluated on four public datasets. The results demonstrate that the proposed module consistently outperforms state-of-the-art methods and baseline models in feature ranking and classification tasks. The code is publicly available at https://github.com/abaraka2020/Iterative-Feature-Exclusion-Ranking-Module and https://github.com/mohalim/IFENet
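
The exclude-score-aggregate loop described in the abstract can be illustrated with a minimal NumPy sketch. This is not the authors' implementation (see the linked repositories for that); it only shows the general idea under stated assumptions. The function name `iterative_exclusion_scores`, the scoring matrix `weights`, the use of a softmax over a single linear projection as the "attention", and averaging as the aggregation step are all hypothetical choices made for illustration.

```python
import numpy as np

def softmax(z, axis=-1):
    # Numerically stable softmax; entries set to -inf receive weight 0.
    z = z - z.max(axis=axis, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=axis, keepdims=True)

def iterative_exclusion_scores(x, weights):
    """Illustrative sketch only (assumed design, not the paper's module).

    x:       (batch, n_features) input rows, n_features >= 2
    weights: (n_features, n_features) hypothetical scoring matrix that
             would be learned in a real model
    Returns  (batch, n_features) aggregated attention scores per row.
    """
    n_features = x.shape[1]
    total = np.zeros(x.shape, dtype=float)
    for j in range(n_features):
        # Exclude feature j by zeroing it in the input.
        mask = np.ones(n_features)
        mask[j] = 0.0
        logits = (x * mask) @ weights
        # The excluded feature gets no attention in this iteration.
        logits[:, j] = -np.inf
        total += softmax(logits, axis=1)
    # Aggregate by averaging over the n_features iterations
    # (assumption: the paper may aggregate differently).
    return total / n_features

rng = np.random.default_rng(0)
x = rng.normal(size=(4, 5))        # 4 rows, 5 features (toy data)
weights = rng.normal(size=(5, 5))  # stand-in for learned parameters
scores = iterative_exclusion_scores(x, weights)
# Global ranking: features ordered by mean aggregated score, highest first.
ranking = np.argsort(-scores.mean(axis=0))
```

Because each iteration scores the remaining features without feature `j` present, the aggregated score of a feature reflects how it is weighted across many different contexts rather than in a single pass, which is the contextual dependence the abstract refers to.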
