Variable selection with false discovery rate control in deep neural networks
This work addresses the black-box nature of DNNs for researchers and practitioners who need reliable variable selection, though the contribution is incremental: it builds on existing methods and adds quality control.
The paper tackles variable selection in deep neural networks as a way to improve interpretability. It proposes SurvNet, a backward elimination procedure that controls the false discovery rate and adaptively determines how many variables to eliminate at each step, and it evaluates the procedure on image data, gene expression data, and simulated datasets.
Deep neural networks (DNNs) are famous for their high prediction accuracy, but they are also known for their black-box nature and poor interpretability. We consider the problem of variable selection in DNNs, that is, selecting the input variables that have significant predictive power for the output. We propose a backward elimination procedure called SurvNet, which is based on a new measure of variable importance that applies to a wide variety of networks. More importantly, SurvNet is able to estimate and control the false discovery rate of the selected variables, whereas no existing method provides such quality control. Further, SurvNet adaptively determines how many variables to eliminate at each step in order to maximize selection efficiency. To study its validity, SurvNet is applied to image data and gene expression data, as well as to various simulated datasets.
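To make the idea of backward elimination with a false discovery rate (FDR) target concrete, the following is a minimal sketch and not the SurvNet algorithm itself. It rests on several assumptions that do not come from the abstract: the importance measure is a plain absolute correlation rather than a network-based score, the FDR is estimated by appending permuted "null" copies of the inputs and taking the ratio of surviving null variables to surviving original variables, and the step size simply shrinks as the estimate approaches the target. The names `importance_scores`, `backward_eliminate`, `target_fdr`, and `max_drop_frac` are hypothetical.

```python
import numpy as np


def importance_scores(X, y):
    """Hypothetical stand-in importance: |Pearson correlation| of each column with y."""
    Xc = X - X.mean(axis=0)
    yc = y - y.mean()
    denom = np.sqrt((Xc ** 2).sum(axis=0) * (yc ** 2).sum())
    return np.abs(Xc.T @ yc) / denom


def backward_eliminate(X, y, target_fdr=0.1, max_drop_frac=0.2, seed=0):
    """Drop low-importance variables until the estimated FDR meets the target.

    Returns (indices of selected original variables, estimated FDR).
    """
    rng = np.random.default_rng(seed)
    n, p = X.shape
    # Assumption: null variables are independently permuted copies of each column,
    # so they keep the marginal distribution but carry no signal about y.
    X_null = np.column_stack([rng.permutation(X[:, j]) for j in range(p)])
    X_aug = np.hstack([X, X_null])
    is_null = np.arange(2 * p) >= p
    active = np.arange(2 * p)

    while True:
        n_null = int(is_null[active].sum())
        n_orig = active.size - n_null
        if n_orig == 0:
            return np.array([], dtype=int), 0.0
        # Assumed estimator: surviving nulls approximate the number of false
        # discoveries among the surviving original variables.
        fdr_hat = n_null / n_orig
        if fdr_hat <= target_fdr:
            return active[~is_null[active]], fdr_hat
        # Illustrative adaptive step: eliminate more when far from the target.
        excess = min(1.0, fdr_hat - target_fdr)
        k = max(1, int(active.size * max_drop_frac * excess))
        scores = importance_scores(X_aug[:, active], y)
        keep = np.argsort(scores)[k:]  # discard the k least important variables
        active = np.sort(active[keep])


# Simulated data: only the first three of twenty variables influence y.
rng = np.random.default_rng(1)
X = rng.normal(size=(500, 20))
y = 3 * X[:, 0] + 3 * X[:, 1] - 3 * X[:, 2] + rng.normal(size=500)
selected, fdr_hat = backward_eliminate(X, y)
print(selected, fdr_hat)
```

In a DNN setting the importance scores would be recomputed from a retrained network after each elimination step; the fixed correlation score here only keeps the sketch short and dependency-free.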