LG ME · Oct 31, 2022

Cost-aware Generalized $α$-investing for Multiple Hypothesis Testing

Thomas Cook, Harsh Vardhan Dubey, Ji Ah Lee, Guangyu Zhu, Tingting Zhao, Patrick Flaherty

arXiv:2210.17514v31.83 citations · h-index: 48 · Has Code

Originality: Incremental advance

AI Analysis

This work addresses efficient resource allocation in sequential testing for domains such as biology. It is an incremental advance because it builds on the existing generalized α-investing framework.

The paper tackles sequential multiple hypothesis testing when data collection is costly, as in biological experiments. It develops a cost-aware decision rule that optimizes the expected α-wealth reward and selects a sample size for each test. With a fixed sample size of n=1, the rule correctly rejects more false null hypotheses than the other methods compared.

We consider the problem of sequential multiple hypothesis testing with nontrivial data collection costs. This problem appears, for example, when conducting biological experiments to identify differentially expressed genes of a disease process. This work builds on the generalized $α$-investing framework, which enables control of the false discovery rate in a sequential testing setting. We make a theoretical analysis of the long-term asymptotic behavior of $α$-wealth, which motivates a consideration of sample size in the $α$-investing decision rule. Posing the testing process as a game with nature, we construct a decision rule that optimizes the expected $α$-wealth reward (ERO) and provides an optimal sample size for each test. Empirical results show that a cost-aware ERO decision rule correctly rejects more false null hypotheses than other methods for $n=1$, where $n$ is the sample size. When the sample size is not fixed, cost-aware ERO uses a prior on the null hypothesis to adaptively allocate the sample budget to each test. We extend cost-aware ERO investing to finite-horizon testing, which enables the decision rule to allocate samples in a non-myopic manner. Finally, empirical tests on real data sets from biological experiments show that cost-aware ERO balances the allocation of samples to an individual test against the allocation of samples across multiple tests.
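To make the notion of $α$-wealth concrete, the following is a minimal sketch of the classic $α$-investing update on which the generalized framework builds. It is not the paper's cost-aware ERO rule. The function name `alpha_investing`, the initial wealth `w0`, and the "spend a fixed fraction of current wealth" policy are all assumptions chosen for illustration. Each test costs an amount `phi` of wealth if it fails to reject, and a rejection earns a payout of `alpha`.

```python
from typing import List, Sequence, Tuple


def alpha_investing(
    p_values: Sequence[float],
    alpha: float = 0.05,
    w0: float = 0.025,           # assumed initial alpha-wealth (illustrative)
    spend_fraction: float = 0.5  # assumed policy: bet this share of wealth per test
) -> Tuple[List[bool], List[float]]:
    """Sequentially test p-values with a simple alpha-investing rule.

    Returns the rejection decisions and the wealth after each test.
    """
    if not 0.0 < spend_fraction < 1.0:
        raise ValueError("spend_fraction must lie strictly between 0 and 1")

    wealth = w0
    rejections: List[bool] = []
    wealth_history: List[float] = []

    for p in p_values:
        # Amount of wealth put at risk on this test.
        phi = spend_fraction * wealth
        # Test level implied by the cost: phi = alpha_j / (1 - alpha_j).
        alpha_j = phi / (1.0 + phi)

        reject = p <= alpha_j
        if reject:
            # A rejection earns the payout and costs nothing.
            wealth += alpha
        else:
            # A non-rejection forfeits the amount at risk.
            wealth -= phi

        rejections.append(reject)
        wealth_history.append(wealth)

    return rejections, wealth_history


if __name__ == "__main__":
    decisions, history = alpha_investing([0.001, 0.8, 0.5, 0.0001])
    print(decisions)
    print(history)
```

Because the amount at risk is always a fraction of current wealth, the wealth never becomes negative in this sketch. A cost-aware rule of the kind the abstract describes would additionally choose the sample size for each test, which this example leaves out.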

