Power analysis of knockoff filters for correlated designs
This work addresses a theoretical gap: the power of knockoff filters under correlated designs. The question matters to statisticians and data scientists, although the contribution is incremental because it builds on existing frameworks.
The paper analyzes the power of knockoff filters for variable selection with correlated predictors. It introduces a functional called effective signal deficiency (ESD) that predicts consistency, shows that the structure of the precision matrix is key, and supports the theory with numerical evidence.
The knockoff filter, introduced by Barber and Candès (2016), is an elegant framework for controlling the false discovery rate (FDR) in variable selection. While empirical results indicate that this methodology is not too conservative, there is no conclusive theoretical result on its power. When the predictors are i.i.d. Gaussian, it is known that as the signal-to-noise ratio tends to infinity, the knockoff filter is consistent in the sense that one can make the FDR go to 0 and the power go to 1 simultaneously. In this work we study the case where the predictors have a general covariance matrix $Σ$. We introduce a simple functional of the covariance matrix $Σ$, called effective signal deficiency (ESD), that predicts consistency of various variable selection methods. In particular, ESD reveals that the structure of the precision matrix $Σ^{-1}$ plays a central role in consistency, and therefore so does the conditional independence structure of the predictors. To leverage this connection, we introduce the Conditional Independence knockoff, a simple procedure that is able to compete with more sophisticated knockoff filters and that is defined when the predictors obey a Gaussian tree graphical model (or when the graph is sufficiently sparse). Our theoretical results are supported by numerical evidence on synthetic data.
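To illustrate the link between the precision matrix $Σ^{-1}$ and the conditional independence structure of Gaussian predictors, the following minimal sketch builds the covariance of a Gaussian Markov chain, which is the simplest tree graphical model. The AR(1) covariance $Σ_{ij} = ρ^{|i-j|}$, the dimension `p = 6`, the value `rho = 0.5`, and the helper name `ar1_covariance` are all assumptions chosen for illustration and are not taken from the paper. The sketch does not compute ESD or construct knockoffs; it only shows that a dense covariance matrix can have a sparse precision matrix whose nonzero pattern matches the edges of the chain.

```python
import numpy as np


def ar1_covariance(p, rho):
    """Covariance of a stationary Gaussian Markov chain: Sigma[i, j] = rho ** |i - j|.

    Hypothetical helper used only for this illustration.
    """
    idx = np.arange(p)
    return rho ** np.abs(idx[:, None] - idx[None, :])


p, rho = 6, 0.5  # illustrative values, not from the paper
sigma = ar1_covariance(p, rho)
precision = np.linalg.inv(sigma)

# Pairs (i, j) that are not neighbours in the chain, i.e. |i - j| > 1.
idx = np.arange(p)
far_mask = np.abs(idx[:, None] - idx[None, :]) > 1

# Every pair of predictors is marginally correlated (Sigma is dense) ...
min_cov_entry = np.min(np.abs(sigma))
# ... but non-neighbours are conditionally independent given the rest,
# so the matching entries of the precision matrix vanish up to round-off.
max_far_entry = np.max(np.abs(precision[far_mask]))

print("smallest |Sigma[i, j]|:", min_cov_entry)
print("largest |Sigma^{-1}[i, j]| for |i - j| > 1:", max_far_entry)
print(np.round(precision, 3))
```

For a chain the precision matrix is tridiagonal, so each predictor depends on the others only through its immediate neighbours. This is the kind of sparse structure that the tree (or sparse graph) assumption above refers to.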