Lower bounds in multiple testing: A framework based on derandomized proxies
This work addresses the lack of general lower bounds on FDR-FNR tradeoffs in multiple testing, providing a foundational tool for statisticians and for researchers in fields such as genomics and data science. It is incremental in that it builds on prior work on performance bounds.
The paper derives lower bounds on the combination of the false discovery rate (FDR) and the false non-discovery rate (FNR) in multiple testing by introducing a general framework based on derandomization. The framework yields explicit bounds for various models, including those with dependence and non-Gaussian distributions, and simulations show that these bounds closely match the performance of the Benjamini-Hochberg algorithm.
The bulk of work in multiple testing has focused on specifying procedures that control the false discovery rate (FDR), with relatively less attention paid to the corresponding Type II error, known as the false non-discovery rate (FNR). A more recent line of work in multiple testing has begun to investigate the tradeoffs between the FDR and FNR and to provide lower bounds on the performance of procedures that depend on the model structure. Lacking thus far, however, has been a general approach to obtaining lower bounds for a broad class of models. This paper introduces an analysis strategy based on derandomization, illustrated by applications to various concrete models. Our main result is a meta-theorem that gives a general recipe for obtaining lower bounds on the combination of FDR and FNR. We illustrate this meta-theorem by deriving explicit bounds for several models, including instances with dependence, scale-transformed alternatives, and non-Gaussian-like distributions. We provide numerical simulations of some of these lower bounds and show a close relation to the actual performance of the Benjamini-Hochberg (BH) algorithm.
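As a concrete reference point for the quantities discussed in the abstract, the following minimal Python sketch implements the standard Benjamini-Hochberg step-up rule and computes the empirical false discovery and false non-discovery proportions for one set of p-values. The function names (`benjamini_hochberg`, `fdp_fnp`) and the target level `alpha` are illustrative and do not come from the paper. The sketch also assumes that the FNR is measured as the fraction of true signals that go undiscovered; other definitions normalize by the number of non-rejections instead.

```python
import numpy as np


def benjamini_hochberg(p_values, alpha=0.1):
    """Return a boolean rejection mask from the BH step-up rule at level alpha."""
    p = np.asarray(p_values, dtype=float)
    n = p.size
    order = np.argsort(p)                       # indices of p-values, ascending
    thresholds = alpha * np.arange(1, n + 1) / n
    below = p[order] <= thresholds              # which sorted p-values clear their threshold
    reject = np.zeros(n, dtype=bool)
    if below.any():
        k = np.nonzero(below)[0].max()          # largest sorted index that clears its threshold
        reject[order[: k + 1]] = True           # step-up: reject everything up to that index
    return reject


def fdp_fnp(reject, is_signal):
    """Empirical false discovery proportion and false non-discovery proportion.

    Assumption (not from the paper): FNP = missed signals / total signals.
    """
    reject = np.asarray(reject, dtype=bool)
    is_signal = np.asarray(is_signal, dtype=bool)
    false_discoveries = np.sum(reject & ~is_signal)
    missed_signals = np.sum(~reject & is_signal)
    fdp = false_discoveries / max(reject.sum(), 1)      # 0 when nothing is rejected
    fnp = missed_signals / max(is_signal.sum(), 1)      # 0 when there are no signals
    return fdp, fnp


# Hypothetical toy input: four hypotheses, three of which are true signals.
p_vals = [0.02, 0.9, 0.035, 0.03]
truth = [True, True, False, True]
mask = benjamini_hochberg(p_vals, alpha=0.05)
print(mask, fdp_fnp(mask, truth))
```

Averaging these two proportions over many simulated datasets gives Monte Carlo estimates of the FDR and FNR, which is the kind of empirical BH performance that a lower bound on their combination can be compared against.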