ML, LG | Sep 15, 2022

Error Controlled Feature Selection for Ultrahigh Dimensional and Highly Correlated Feature Space Using Deep Learning

Arkaprabha Ganguli, David Todem, Tapabrata Maiti

arXiv:2209.07011v3

h-index: 26

Originality: Incremental advance

AI Analysis

This work addresses the need for interpretable and robust deep learning models in applications with complex, high-dimensional data. It is an incremental improvement over existing feature-selected deep learning methods.

The authors tackle feature selection in ultrahigh-dimensional, highly correlated feature spaces with high noise levels. They propose a screening and cleaning method based on deep learning that, in simulations and on real datasets, achieves high power while keeping the false discovery rate low.

In recent years, deep learning has been at the center of analytics due to its impressive empirical success in analyzing complex data objects. Despite this success, most existing tools behave like black-box machines, hence the increasing interest in interpretable, reliable, and robust deep learning models applicable to a broad class of applications. Feature-selected deep learning has emerged as a promising tool in this realm. However, recent developments do not accommodate ultra-high dimensional and highly correlated features, nor a high noise level. In this article, we propose a novel screening and cleaning method with the aid of deep learning for a data-adaptive multi-resolutional discovery of highly correlated predictors with a controlled error rate. Extensive empirical evaluations over a wide range of simulated scenarios and several real datasets demonstrate the effectiveness of the proposed method in achieving high power while keeping the false discovery rate at a minimum.
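The abstract evaluates the method by power and false discovery rate. As a minimal sketch of how these two quantities are commonly computed for a single selection run (this is not the authors' code; the function name `selection_metrics` and the example feature indices are hypothetical), the false discovery proportion is the share of selected features that are not truly relevant, and power is the share of truly relevant features that were selected. The false discovery rate is the expectation of the former over repeated experiments.

```python
def selection_metrics(selected, truly_relevant):
    """Return (false discovery proportion, power) for one selection run.

    Hypothetical helper for illustration; not taken from the paper.
    """
    selected = set(selected)
    truly_relevant = set(truly_relevant)

    true_positives = len(selected & truly_relevant)
    false_positives = len(selected - truly_relevant)

    # By convention the proportion is 0 when nothing is selected.
    fdp = false_positives / max(len(selected), 1)
    # Power: fraction of truly relevant features that were recovered.
    power = true_positives / max(len(truly_relevant), 1)
    return fdp, power


# Assumed toy setting: features 0..4 are relevant, the method picks 0, 1, 2, 7.
fdp, power = selection_metrics(selected=[0, 1, 2, 7], truly_relevant=range(5))
print(f"FDP = {fdp:.2f}, power = {power:.2f}")
```

Averaging the first value over many simulated datasets gives an empirical estimate of the false discovery rate, which is the kind of quantity a simulation study like the one described would report.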

