ML LG | Feb 17, 2020

Estimating the number and effect sizes of non-null hypotheses

Jennifer Brennan, Ramya Korlakai Vinayak, Kevin Jamieson

arXiv:2002.07297v25.83 citations | Has Code

Originality: Incremental advance

AI Analysis

The paper gives researchers a way to choose experimental designs and to guarantee the number of discoveries a future experiment will make, particularly in domains such as genomics. The advance is incremental because it builds on existing multiple testing frameworks.

The paper tackles the problem of estimating the distribution of effect sizes in multiple testing to predict the number of discoveries in future experiments, showing that an inexpensive pilot experiment can achieve this with significantly fewer samples than full-scale experiments.

We study the problem of estimating the distribution of effect sizes (the mean of the test statistic under the alternative hypothesis) in a multiple testing setting. Knowing this distribution allows us to calculate the power (one minus the type II error rate) of any experimental design. We show that it is possible to estimate this distribution using an inexpensive pilot experiment, which takes significantly fewer samples than would be required by an experiment that identified the discoveries. Our estimator can be used to guarantee the number of discoveries that will be made using a given experimental design in a future experiment. We prove that this simple and computationally efficient estimator enjoys a number of favorable theoretical properties, and demonstrate its effectiveness on data from a gene knockout experiment on influenza inhibition in Drosophila.
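To make the link between an effect-size distribution and the power of a design concrete, here is a minimal sketch. It is not the paper's estimator: it assumes the effect sizes are already known (or already estimated), that each hypothesis is tested with a one-sided z-test on unit-variance data averaged over `n_replicates` samples, and that a Bonferroni correction is applied. The function name `expected_discoveries`, its parameters, and the example effect sizes are all hypothetical.

```python
from statistics import NormalDist


def expected_discoveries(effect_sizes, n_replicates, alpha=0.05):
    """Expected number of true discoveries for a hypothetical design.

    Assumptions (not taken from the paper): one-sided z-tests, unit
    variance, Bonferroni correction at family-wise level alpha.
    An effect size of 0 marks a null hypothesis.
    """
    std_normal = NormalDist()
    m = len(effect_sizes)
    # Bonferroni rejection threshold on the z-scale.
    threshold = std_normal.inv_cdf(1 - alpha / m)
    total = 0.0
    for mu in effect_sizes:
        if mu == 0:
            continue  # rejecting a null is a false discovery, not counted here
        # Averaging n replicates shifts the z-statistic mean to mu * sqrt(n).
        power = 1 - std_normal.cdf(threshold - mu * n_replicates ** 0.5)
        total += power
    return total


# Illustrative (made-up) effect sizes: 90 nulls and 10 non-nulls.
example_effects = [0.0] * 90 + [0.5] * 10
for n in (4, 25, 100):
    print(n, round(expected_discoveries(example_effects, n), 2))
```

Under these assumptions, the expected number of discoveries grows with the number of replicates and can never exceed the number of non-null hypotheses, which is the kind of design question the abstract says a pilot estimate of the effect-size distribution can answer.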
