GRIP2: A Robust and Powerful Deep Knockoff Method for Feature Selection
This work addresses false discovery control in feature selection, a problem relevant to machine learning researchers and practitioners working with complex data. Its contribution is an incremental improvement over existing deep learning methods.
The paper tackles feature selection in nonlinear, high-correlation, and low signal-to-noise regimes by proposing GRIP2, a deep knockoff method that integrates feature activity over a regularization surface. The method improves robustness and power, and on real-world HIV drug resistance data it achieves higher power than established linear baselines.
Identifying truly predictive covariates while strictly controlling false discoveries remains a fundamental challenge in nonlinear, highly correlated, and low signal-to-noise regimes, where deep-learning-based feature selection methods are most attractive. We propose Group Regularization Importance Persistence in 2 Dimensions (GRIP2), a deep knockoff feature importance statistic that integrates first-layer feature activity over a two-dimensional regularization surface controlling both sparsity strength and sparsification geometry. To approximate this surface integral in a single training run, we introduce efficient block-stochastic sampling, which aggregates feature activity magnitudes across diverse regularization regimes along the optimization trajectory. The resulting statistics are antisymmetric by construction, ensuring finite-sample false discovery rate (FDR) control. In extensive experiments on synthetic and semi-real data, GRIP2 demonstrates improved robustness to feature correlation and noise level: in high-correlation and low signal-to-noise ratio regimes where standard deep-learning-based feature selectors may struggle, our method retains high power and stability. Finally, on real-world HIV drug resistance data, GRIP2 recovers known resistance-associated mutations with higher power than established linear baselines, confirming its reliability in practice.
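
To illustrate how antisymmetric statistics lead to FDR-controlled selection, the following is a minimal sketch of the standard knockoff+ thresholding rule applied to a vector of statistics W. It is not the GRIP2 statistic itself: the function names, the difference form W_j = Z_j - Z~_j, and the toy values are assumptions made for illustration, and the importance scores Z_j and Z~_j are taken as given rather than computed from a trained network.

```python
import numpy as np

def antisymmetric_stat(z_orig, z_knock):
    """Hypothetical helper: difference statistic W_j = Z_j - Z~_j.

    Swapping a feature with its knockoff flips the sign of W_j,
    which is the antisymmetry property the knockoff filter relies on.
    """
    return np.asarray(z_orig, dtype=float) - np.asarray(z_knock, dtype=float)

def knockoff_plus_threshold(W, q=0.1):
    """Smallest t > 0 with (1 + #{W_j <= -t}) / max(1, #{W_j >= t}) <= q."""
    W = np.asarray(W, dtype=float)
    # Candidate thresholds are the distinct nonzero magnitudes of W.
    candidates = np.unique(np.abs(W[W != 0]))  # np.unique returns sorted values
    for t in candidates:
        fdp_hat = (1 + np.sum(W <= -t)) / max(1, np.sum(W >= t))
        if fdp_hat <= q:
            return float(t)
    return float("inf")  # no threshold meets the target level

def knockoff_select(W, q=0.1):
    """Indices of features whose statistic is at or above the threshold."""
    W = np.asarray(W, dtype=float)
    t = knockoff_plus_threshold(W, q)
    return np.flatnonzero(W >= t)

# Toy usage with made-up statistics (not data from the paper).
W_toy = [3.0, 2.0, 1.0, -0.5]
selected = knockoff_select(W_toy, q=0.5).tolist()  # → [0, 1, 2]
```

Large positive W_j indicates that the original feature was more active than its knockoff. The "+1" in the numerator is what makes the estimate conservative enough for finite-sample control, at the cost of selecting nothing when few features have strong statistics.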