LG ML

Jun 4, 2021

Top-$k$ Regularization for Supervised Feature Selection

arXiv:2106.02197v1

1.6

Originality: Incremental advance

AI Analysis

This paper addresses feature selection in regression and classification tasks, offering a regularization method that incrementally improves on existing approaches.

The paper tackles the problem of reconciling the representativeness and inter-correlations of features by introducing top-$k$ regularization, which improves the selection of informative features while modeling nonlinear relationships. Experiments show that it is effective and stable across a variety of datasets.

Feature selection identifies subsets of informative features and reduces dimensions in the original feature space, helping provide insights into data generation or a variety of domain problems. Existing methods mainly depend on feature scoring functions or sparse regularizations; nonetheless, they have limited ability to reconcile the representativeness and inter-correlations of features. In this paper, we introduce a novel, simple yet effective regularization approach, named top-$k$ regularization, to supervised feature selection in regression and classification tasks. Structurally, the top-$k$ regularization induces a sub-architecture on the architecture of a learning model to boost its ability to select the most informative features and model complex nonlinear relationships simultaneously. Theoretically, we derive and mathematically prove a uniform approximation error bound for using this approach to approximate high-dimensional sparse functions. Extensive experiments on a wide variety of benchmarking datasets show that the top-$k$ regularization is effective and stable for supervised feature selection.
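
The abstract does not spell out the form of the regularizer, so the following is only a minimal sketch of one possible reading, not the paper's formulation. It assumes a linear model whose per-feature weights double as importance scores, and adds to the ordinary loss the loss of a sub-model restricted to the $k$ highest-scoring features. The names `top_k_mask`, `predict`, `top_k_regularized_loss`, and the trade-off parameter `lam` are all hypothetical.

```python
def top_k_mask(weights, k):
    """Return a 0/1 mask that keeps the k weights of largest magnitude."""
    if not 0 <= k <= len(weights):
        raise ValueError("k must be between 0 and the number of features")
    # Rank feature indices by absolute weight, largest first.
    order = sorted(range(len(weights)), key=lambda i: abs(weights[i]), reverse=True)
    keep = set(order[:k])
    return [1.0 if i in keep else 0.0 for i in range(len(weights))]


def predict(x, weights, mask=None):
    """Linear prediction; an optional mask zeroes out unselected features."""
    if mask is None:
        mask = [1.0] * len(weights)
    return sum(w * m * xi for w, m, xi in zip(weights, mask, x))


def mse(X, y, weights, mask=None):
    """Mean squared error of the (optionally masked) linear model."""
    errors = [(predict(x, weights, mask) - t) ** 2 for x, t in zip(X, y)]
    return sum(errors) / len(errors)


def top_k_regularized_loss(X, y, weights, k, lam=1.0):
    """Full-model MSE plus lam times the MSE of the top-k sub-model.

    Assumption: this penalty form is illustrative only. It rewards weight
    vectors whose k largest entries already predict well on their own.
    """
    mask = top_k_mask(weights, k)
    return mse(X, y, weights) + lam * mse(X, y, weights, mask)
```

With `k` equal to the number of features the sub-model coincides with the full model, so the penalty only changes the objective when `k` is smaller than the input dimension.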
