Feature Importance Ranking for Deep Learning
This work addresses explainable AI for researchers and practitioners by providing a more effective method for feature importance ranking in deep learning, though it appears incremental because it builds on existing feature selection techniques.
The paper tackles feature importance ranking in deep learning, which is difficult because it is a combinatorial optimization problem. It proposes a dual-net architecture, with an operator and a selector, that jointly discovers an optimal feature subset and ranks the importance of the features in it. The approach is reported to outperform several state-of-the-art methods on synthetic, benchmark, and real data sets.
Feature importance ranking has become a powerful tool for explainable AI. However, its combinatorial optimization nature poses a great challenge for deep learning. In this paper, we propose a novel dual-net architecture consisting of an operator and a selector that simultaneously discovers an optimal feature subset of a fixed size and ranks the importance of the features in that subset. During learning, the operator is trained for a supervised learning task on optimal feature subset candidates generated by the selector, which learns to predict the learning performance of the operator working on different optimal subset candidates. We develop an alternate learning algorithm that trains the two nets jointly and incorporates a stochastic local search procedure into learning to address the combinatorial optimization challenge. In deployment, the selector generates an optimal feature subset and ranks feature importance, while the operator makes predictions for test data based on the optimal subset. A thorough evaluation on synthetic, benchmark and real data sets suggests that our approach outperforms several state-of-the-art feature importance ranking and supervised feature selection methods. (Our source code is available at https://github.com/maksym33/FeatureImportanceDL)
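To make the combinatorial part concrete, the following is a minimal sketch of a stochastic local search over fixed-size feature subsets, written in plain Python. It is not the authors' implementation (see the linked repository for that). The function names, the swap-based perturbation, and in particular `toy_operator_loss`, which stands in for the operator's learning performance that the selector would learn to predict, are all assumptions made for illustration.

```python
import random


def random_mask(d, s, rng):
    """Return a binary mask of length d with exactly s ones."""
    chosen = set(rng.sample(range(d), s))
    return [1 if i in chosen else 0 for i in range(d)]


def perturb(mask, rng):
    """Swap one selected feature with one unselected feature.

    The number of ones is unchanged, so the subset size stays fixed.
    """
    on = [i for i, m in enumerate(mask) if m == 1]
    off = [i for i, m in enumerate(mask) if m == 0]
    new = list(mask)
    if on and off:
        new[rng.choice(on)] = 0
        new[rng.choice(off)] = 1
    return new


def toy_operator_loss(mask, relevant):
    """Hypothetical stand-in for the operator's loss on a subset.

    Counts how many truly relevant features the mask leaves out.
    A real system would train or query a network here instead.
    """
    return sum(1 for i in relevant if mask[i] == 0)


def local_search(d, s, relevant, steps=200, seed=0):
    """Keep a current best mask and accept swaps that do not increase the loss."""
    rng = random.Random(seed)
    best = random_mask(d, s, rng)
    best_loss = toy_operator_loss(best, relevant)
    for _ in range(steps):
        candidate = perturb(best, rng)
        loss = toy_operator_loss(candidate, relevant)
        if loss <= best_loss:
            best, best_loss = candidate, loss
    return best, best_loss
```

Calling `local_search(10, 3, {1, 4, 7})` returns a mask with exactly three ones together with its toy loss. Because a swap is accepted only when the loss does not increase, the search never moves to a worse subset. In the architecture described above, the toy loss would be replaced by the selector's prediction of the operator's performance.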