Estimating the number and effect sizes of non-null hypotheses
The work gives researchers a method for optimizing experimental designs and guaranteeing the number of discoveries a future experiment will make, particularly in domains such as genomics. The contribution is incremental, since it builds on existing multiple testing frameworks.
The paper tackles the problem of estimating the distribution of effect sizes in multiple testing in order to predict the number of discoveries in future experiments. It shows that an inexpensive pilot experiment suffices, using significantly fewer samples than a full-scale experiment that identifies the discoveries themselves.
We study the problem of estimating the distribution of effect sizes (the mean of the test statistic under the alternative hypothesis) in a multiple testing setting. Knowing this distribution allows us to calculate the power (one minus the type II error rate) of any experimental design. We show that it is possible to estimate this distribution using an inexpensive pilot experiment, which takes significantly fewer samples than would be required by an experiment that identified the discoveries. Our estimator can be used to guarantee the number of discoveries that will be made using a given experimental design in a future experiment. We prove that this simple and computationally efficient estimator enjoys a number of favorable theoretical properties, and demonstrate its effectiveness on data from a gene knockout experiment on influenza inhibition in Drosophila.
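To illustrate why knowing the effect-size distribution determines the power of a design, the following is a minimal sketch and not the paper's estimator. It assumes a one-sided z-test on unit-variance observations with a fixed per-test level `alpha`, and a discrete effect-size distribution given as hypothetical `effect_sizes` and `weights` lists in which an effect of zero marks a null hypothesis. The function names are illustrative only.

```python
from statistics import NormalDist

_STD_NORMAL = NormalDist()  # standard normal, mean 0 and variance 1


def power(effect_size, n_samples, alpha):
    """Power of a one-sided z-test (assumed model: unit-variance observations).

    The standardized statistic has mean effect_size * sqrt(n_samples), and the
    test rejects when it exceeds the (1 - alpha) standard normal quantile.
    """
    threshold = _STD_NORMAL.inv_cdf(1.0 - alpha)
    return 1.0 - _STD_NORMAL.cdf(threshold - effect_size * n_samples ** 0.5)


def expected_discoveries(effect_sizes, weights, n_hypotheses, n_samples, alpha):
    """Expected number of true discoveries for a given design.

    effect_sizes and weights describe a discrete effect-size distribution;
    entries with effect size 0 are treated as nulls and skipped.
    """
    return n_hypotheses * sum(
        w * power(mu, n_samples, alpha)
        for mu, w in zip(effect_sizes, weights)
        if mu > 0
    )


if __name__ == "__main__":
    # Hypothetical distribution: 90% nulls, 8% small effects, 2% large effects.
    sizes, probs = [0.0, 0.2, 1.0], [0.90, 0.08, 0.02]
    for n in (5, 20, 80):
        print(n, expected_discoveries(sizes, probs, 10_000, n, alpha=1e-3))
```

Under these assumptions, sweeping `n_samples` shows how many replicates a future experiment would need to reach a target number of discoveries, which is the kind of design question the estimated distribution is meant to answer.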